Remix.run Logo
riku_iki 4 days ago

> And as much as people like to handwave about how mysterious the weights are we actually perfectly understand both how the weights arise and how they result in the model's outputs

we understand on this low level, but LLMs through the training converge to something larger than weights, there is a structure of these weights which emerged and allow to perform functions, and this part we do not understand, we just observe it as a black box, and experimenting on the level: we put this kind of input to black box and receive this kind of output.