throw310822 11 days ago
> There are grammar rules

And they're made out of weights.
noosphr 11 days ago
As opposed to integers in normal programming. The 'magic' in weights is that the rules are spread through the whole model and you can't point to one place which encodes them. The grokking paper shows that this stops being the case with enough training data and enough compute.