libraryofbabel 5 days ago

* You’re right that a lot of people take a cursory look at the math (or someone else’s digest of it) and their takeaway is “aha, LLMs are just stochastic parrots blindly predicting the next word. It’s all a trick!”

* So we find ourselves, over and over, explaining that this might have been true once, but that there are now (imperfect, messy, weird) models of large parts of the world inside that neural network.

* At the same time, the vector embedding math is still worth learning if you want to get into LLMs; it's the conclusions people draw from the architecture that are often wrong. (Toy sketch of what I mean by that math below.)
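
For anyone curious, here's a minimal sketch of the kind of embedding math I mean, using made-up 3-d vectors. Real LLM embeddings are learned, high-dimensional, and contextual, so treat the numbers as purely illustrative:

```python
# Toy illustration of embedding similarity (not real learned embeddings).
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: near 1.0 means 'points the same way'."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 3-d vectors standing in for token embeddings.
king  = np.array([0.9, 0.7, 0.1])
queen = np.array([0.8, 0.8, 0.1])
apple = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(king, queen))  # high: related concepts land close together
print(cosine_similarity(king, apple))  # low: unrelated concepts point elsewhere
```

The math itself is this simple; the mistake is concluding from it that similarity lookups are all the model is doing.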