| ▲ | rad_val 9 days ago | ||||||||||||||||||||||||||||||||||||||||||||||||||||
The strongest argument for this is structural: what LLMs are. In a brutal simplistic way: each token is represented in a high dimensional vector. LLMs operate on them. They are the true, underlying meaning of the token for the LLM. Think of it as 1000+ ways to think of that word/token. Those meanings are baked in at training time. So, LLMs might be able to cross-reference them and solve a class of problems that flew under our radar, but can't come up with revolutionary theories that were never in the training set. Of course, they will help winning a Nobel in the years to come, no doubt, but can't speak mathematics we can't understand (beyond simple obfuscation) and won't discover anything substantial on their own. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | resident423 9 days ago | parent | next [-] | ||||||||||||||||||||||||||||||||||||||||||||||||||||
> but can't come up with revolutionary theories that were never in the training set. Can you elaborate? I don't think the solution to the unit distance problem was in the training set, but I'm guessing you mean there's some higher bar for revolutionary theories LLMs cant reach? If so where do you expect the limit will be? | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | redox99 9 days ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||||||||||||||||||||
Instead of going into a long technical argument of why your description of LLMs is flawed, I'll go straight to the point, because people keep moving the goal posts. What exact problem would need to be solved by LLMs to convince you that they DO discover novel solutions? | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||||
| ▲ | int_19h 9 days ago | parent | prev [-] | ||||||||||||||||||||||||||||||||||||||||||||||||||||
I don't see how any of this follow. Yes, the LLMs will learn the "meaning" (here narrowly defined as relative configuration in the embedding space) of vectors that correspond to tokens in whatever tokenizer is used to feed into them. But that vector space is not discrete, and nothing precludes the model from internally operating on other vectors that it never saw in training, based on how they relate to those vectors which it did see. | |||||||||||||||||||||||||||||||||||||||||||||||||||||