ben_w 12 hours ago
Machine learning today requires an obscene quantity of examples to learn anything. SOTA LLMs show quite a lot of skill, but only after reading a significant fraction of all published writing (and perhaps images and videos, I'm not sure) across all languages, in a world whose population is about 5 times what it was at the link's cutoff date, and where global literacy has gone from roughly 20% to about 90% since then.

Computers can only make up for this by being really, really fast: what would take a human a million or so years to read, a server room can pump through a model's training stage in a matter of months. When the data isn't there, reading what data it does have really quickly isn't enough.
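As a rough back-of-envelope check on the "million or so years" figure (every number below is my own assumption, not a measured fact: corpus size, reading speed, and daily reading hours are all guesses):

    # Back-of-envelope: how long would a human need to read an LLM-scale corpus?
    # All figures are assumptions, not measurements.

    corpus_tokens = 15e12          # assumed training corpus, ~15 trillion tokens
    words_per_token = 0.75         # rough tokens-to-words conversion
    reading_wpm = 250              # typical adult reading speed, words per minute
    reading_hours_per_day = 4      # assumed sustainable daily reading time

    corpus_words = corpus_tokens * words_per_token
    words_per_year = reading_wpm * 60 * reading_hours_per_day * 365

    years = corpus_words / words_per_year
    print(f"~{years:,.0f} years of human reading")  # roughly 5e5 years with these numbers

With those assumptions it comes out around half a million years, i.e. the same order of magnitude as the claim; tweaking the corpus size or reading hours moves it between a few hundred thousand and a couple million years.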