jqpabc123 a day ago

Another possibility not really addressed here --- local LLMs.

AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.

TurboQuant could be a key step in this direction.

schnitzelstoat a day ago | parent

Yeah, I don't think local LLMs will keep up with what the massive corporations put out. But they might reach a level of performance where the gap just doesn't matter for most users.

And many people would prefer to run a model locally for 'free' (not counting the energy cost) rather than pay for an LLM subscription.

zozbot234 a day ago | parent

TurboQuant helps with KV-cache quantization, which is not very relevant to local LLMs: context size only becomes the dominant memory cost when you run inference with large batches. For small-scale inference, the weights dominate. (Even if you stream weights from SSD, you'll want to cache a sizeable fraction of them to get workable throughput, and that caching dominates your memory usage.)
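A back-of-envelope calculation illustrates the point. All figures below are illustrative assumptions (a hypothetical Llama-style 8B model with grouped-query attention, 4-bit weights, fp16 KV cache), not measurements of any real deployment:

```python
# Illustrative memory math: weights vs. KV cache at different batch sizes.
# Every constant here is a hypothetical assumption for a Llama-style 8B model.

N_PARAMS   = 8e9   # total weight count (assumed)
W_BYTES    = 0.5   # 4-bit quantized weights -> 0.5 bytes per parameter
N_LAYERS   = 32    # transformer layers (assumed)
N_KV_HEADS = 8     # KV heads under grouped-query attention (assumed)
HEAD_DIM   = 128   # per-head dimension (assumed)
KV_BYTES   = 2     # fp16 KV-cache entries, before any KV quantization

def kv_cache_bytes(batch: int, seq_len: int) -> float:
    """K and V tensors for every layer:
    2 (K and V) * layers * kv_heads * head_dim * seq_len * batch * bytes."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * seq_len * batch * KV_BYTES

weights_gib = N_PARAMS * W_BYTES / 2**30
print(f"weights:              {weights_gib:6.1f} GiB")
print(f"KV, batch=1,  seq 8k: {kv_cache_bytes(1, 8192) / 2**30:6.1f} GiB")
print(f"KV, batch=64, seq 8k: {kv_cache_bytes(64, 8192) / 2**30:6.1f} GiB")
```

Under these assumptions the KV cache is about 1 GiB at batch 1 (small next to ~3.7 GiB of weights, which is the local-LLM regime), but about 64 GiB at batch 64, which is why KV quantization matters mainly for large-batch server inference.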

netdevphoenix a day ago | parent

Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye-watering amounts to own it unless you got an open-sourced one.

jqpabc123 a day ago | parent

> unless you got an open sourced one.

Ding, ding, ding --- we have a winner.

https://techstartups.com/2026/03/26/nvidia-backed-ai-startup...

https://tiiny.ai/