▲ jqpabc123 a day ago
Another possibility not really addressed here: local LLMs. AI on hardware you own and control, instead of a metered service provider. In other words, a repeat of the "personal computing" revolution, but this time focused on AI. TurboQuant could be a key step in this direction.
▲ schnitzelstoat a day ago
Yeah, I don't think local LLMs will keep up with what the massive corporations put out. But they might reach a level of performance where the gap just doesn't matter for most users. And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.
▲ zozbot234 a day ago
TurboQuant helps with KV-cache quantization, which is not very relevant to local LLMs, since context size only becomes the dominant cost when you run inference with large batches. For small-scale inference, weights dominate. (Even if you stream weights from SSD, you'll want to cache a sizeable fraction of them to get workable throughput, and that caching dominates your memory usage.)
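A rough back-of-envelope sketch of that memory split. All the config numbers here (32 layers, 8 KV heads, head dim 128, 8B params, i.e. a Llama-3-8B-like model) are illustrative assumptions, not anything from TurboQuant itself:

```python
def kv_cache_bytes(batch, seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # K and V tensors, per layer, per token, at fp16 by default
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len * batch

def weight_bytes(n_params, bits):
    # total bytes for weights quantized to `bits` bits per parameter
    return n_params * bits // 8

# Assumed Llama-3-8B-like shape (GQA: 8 KV heads), purely for illustration
cfg = dict(n_layers=32, n_kv_heads=8, head_dim=128)

w = weight_bytes(8_000_000_000, bits=4)   # 4-bit weights: ~3.7 GiB
kv1 = kv_cache_bytes(1, 8192, **cfg)      # single user, 8k context: 1.0 GiB
kv64 = kv_cache_bytes(64, 8192, **cfg)    # server-style batching: 64.0 GiB

print(f"weights       : {w / 2**30:.1f} GiB")
print(f"KV, batch=1   : {kv1 / 2**30:.1f} GiB")   # weights dominate
print(f"KV, batch=64  : {kv64 / 2**30:.1f} GiB")  # KV cache dominates
```

Under these assumptions the single-user KV cache is a fraction of the weight footprint, while at batch 64 it dwarfs the weights, which is why KV quantization pays off mainly for batched serving.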
▲ netdevphoenix a day ago
Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye-watering amounts to own it, unless you got an open-sourced one.