didibus | 2 days ago
Yes, exactly. The human natural-language text it is trained on encodes the solutions to many problems, as well as a lot of ground truths.

The way I think of it: first you have a random text generator. In theory, this generative "model" can find the solution to every problem that text can describe. If you had a way to assert whether it had found the correct solution, you could run it and eventually it would generate the text describing a working solution. Obviously inefficient and impractical (toy sketch below).

What if you made it skip generating any text that isn't valid, sensical English? Now it would find the correct solution in far fewer iterations, but still too slowly. What if it generated only text that makes sense as a continuation of the question's context? Now you might start to see it 100-shot, 10-shot, maybe even 1-shot some problems. What if you tuned that to the max? You get our current crop of LLMs.

What else can you do to make it better? Tune the dataset: remove text that describes wrong answers to the prior context so it learns not to generate those, add more quality answers to the prior context, add more problems and solutions, etc.

And change what you ask it to generate. Instead of generating the answer to a mathematical equation the above way, have it generate the Python code to run to get the answer. Instead of generating the answer to questions about current real-world events and facts (like the weather), have it generate the web search query to find it. If you're asking a more complex question, instead of generating the answer directly, have it generate smaller logical steps towards the answer. Etc.
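A toy sketch of the brute-force thought experiment, purely for illustration. random_text and the check callback are hypothetical stand-ins, not anything real:

    import random
    import string

    def random_text(length=40):
        # Stage 0 of the thought experiment: a purely random text generator.
        return "".join(random.choice(string.printable) for _ in range(length))

    def brute_force_solve(check, max_iters=1_000_000):
        # Sample candidates until the (hypothetical) verifier accepts one.
        # Every constraint you add to the generator -- valid English, fits the
        # question's context, tuned to the max -- shrinks how many iterations
        # this loop needs before it hits a working solution.
        for i in range(max_iters):
            candidate = random_text()
            if check(candidate):
                return candidate, i + 1
        return None, max_iters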
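And a minimal sketch of the "generate the tool-use text instead of the answer" idea, assuming hypothetical llm, run_python, and web_search callables rather than any real API:

    def answer(question, llm, run_python, web_search):
        # Instead of asking the model for the answer directly, ask it to emit
        # the text of a tool call or a plan, then execute or expand that text.
        route = llm(f"Answer with one word, CODE, SEARCH or STEPS, for: {question}")
        if "CODE" in route:
            # Math/equation case: generate Python, run it, return its output.
            code = llm(f"Write Python that prints the answer to: {question}")
            return run_python(code)
        if "SEARCH" in route:
            # Current-events case (e.g. the weather): generate the search query.
            query = llm(f"Write a web search query for: {question}")
            return llm(f"Using these results, answer '{question}':\n{web_search(query)}")
        # Complex-question case: generate smaller logical steps first.
        steps = llm(f"List the small logical steps needed to answer: {question}")
        return llm(f"Follow these steps and give the final answer:\n{steps}")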