irishcoffee 2 hours ago
So you do agree that an LLM cannot derive math from first principles, or no? If an LLM had only ever seen 1+1=2 and that was the only math it was ever exposed to, along with the numbers 0-10, could it figure out that 2+2=4? I argue absolutely not. That would be a fascinating experiment. Hell, train it on every 2-number addition combination of m+n where m and n can be any number between 1-100 (or 0-100 would be better) BUT 2, and have it figure out what 2+2 is. If it got that right, I would probably change my opinion about “circuits”, which by the way really stretches the idea of a circuit. The “circuit” is just the statistically most likely series of tokens that you’re drawing pretend lines between. Sure, technically connect-the-dots is a circuit, but not in the way you or that paper are implying.
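For anyone who wants to try it, the held-out split described above could be sketched roughly like this. This only builds the data, not the model or training loop, and the name `build_split` and the `(m, n, m + n)` tuple layout are made up for illustration. It also assumes "BUT 2" means dropping every pair where either operand is 2. The digit 2 still shows up inside other numbers (12, 20, ...) and in sums like 1+1=2, so a stricter version would need to decide how to handle those.

```python
from itertools import product


def build_split(lo=0, hi=100, held_out=2):
    """Build (m, n, m + n) examples, holding out one operand value.

    Hypothetical helper: training pairs never use `held_out` as an
    operand; the single test example is held_out + held_out.
    """
    train = [
        (m, n, m + n)
        for m, n in product(range(lo, hi + 1), repeat=2)
        if held_out not in (m, n)  # drop any pair that uses the held-out operand
    ]
    test = [(held_out, held_out, held_out + held_out)]
    return train, test


train, test = build_split()
print(len(train), test)
```

Whether a model trained on `train` answers the `test` example correctly is exactly the open question; the split itself says nothing either way.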
soulofmischief 2 hours ago
> If an LLM had only ever seen 1+1=2 and that was the only math they were ever exposed to, along with the numbers 0-10, could an LLM figure out that 2+2=4?

What? Of course not. Could you? Do you understand just how much work has gone into proving that 1 + 1 = 2? Centuries upon centuries of work, reformulating all of mathematics several times in the process.

> Hell, train it on every 2-number addition combination of m+n where m and n can be any number between 1-100 (or 0-100 would be better) BUT 2, and have it figure out what 2+2 is.

If you read the paper I linked, it shows how a constrained modular addition is grokked by the model. Give it a read.

> The “circuit” is just the statistically most likely series of tokens that you’re drawing pretend lines between.

That is not what ML researchers mean when they say circuit, no. Circuits are features within the weights. It's understandable that you'd be confused if you do not have the right prior knowledge.

Your inquiries are good, but they should remain inquiries. If you wish to push them to claims, you first need to understand the space better, understand what modern research does and doesn't show, turn your hypotheses into testable experiments, and collect and publish the results. Or wait for someone else to do it. But the scientific community doesn't accept unfounded conjecture, especially from someone who is not caught up with the literature.