gpm 8 hours ago
Yes, but with current architectures world knowledge is baked into the weights. We might stop figuring out how to make models better, but the world keeps changing, science keeps making progress at understanding it, and so on. That sets a significant minimum rate of change for the weights, so I'm pretty skeptical that baking them into silicon is worth it.
post-it an hour ago
This already isn't the case for the popular models. The knowledge baked into the weights tells the model how to talk and reason, but for world knowledge they do a web search right off the bat most of the time.
Micrococonut 6 hours ago
I think it would just be an opportunity to sell another chip a few years down the line. If the utility curve for model performance flattens out, I can see a future where you buy an up-to-date chip every few years to upgrade to the latest and greatest, while providing up-to-date context as part of the user input. For example, if I have a programming task and I supply a copy of up-to-date documentation alongside my input, I would think that I could still get good output out of a dated model.
Chu4eeno 8 hours ago
That's why we have reasoning/CoT LLMs that can use tools to get updated information.
cruffle_duffle 6 hours ago
I mean, it just depends on the price of the chip. You might just replace the chip like you would any other component, like a video game cartridge or something.