loudmax 7 hours ago

What you describe is very similar to my own experience first running llama.cpp on my desktop computer. It was slow and inaccurate, but that hardly mattered. What impressed me was that I could write a question in English, and it would understand the question and respond in English with an internally coherent, grammatically correct answer. This was coming from a desktop, not a rack full of servers in some hyperscaler's datacenter. It was like meeting a talking dog! The fact that what it says is unreliable is completely beside the point.
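For anyone who wants to reproduce that first "talking dog" moment, here's a rough sketch using the llama-cpp-python bindings. The model path and GGUF filename are placeholders; substitute whatever small quantized model you've actually downloaded:

    # pip install llama-cpp-python
    from llama_cpp import Llama

    # Hypothetical path to a small quantized GGUF model stored locally.
    llm = Llama(model_path="./models/some-7b-model.Q4_K_M.gguf", n_ctx=2048)

    # Ask a plain-English question and print the model's reply.
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])

On CPU-only hardware this will be slow, and the answer may well be wrong, but it will be coherent English, which is the point.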

I think you still need to calibrate your expectations for what you can get from consumer-grade hardware without a powerful GPU. I wouldn't look to a local LLM as a useful store of factual knowledge about the world. The amount of stuff it knows is going to be hampered by the smaller model size. That doesn't mean it can't be useful; it may be very helpful in specialized domains, like coding.
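If coding is the use case, one common setup is to run llama.cpp's bundled llama-server, which exposes an OpenAI-compatible API, and point a standard client at it. This is only a sketch; the port, model path, and model name are whatever you configure locally:

    # pip install openai
    # Assumes llama-server is already running locally, e.g.:
    #   llama-server -m ./models/some-coder-model.Q4_K_M.gguf --port 8080
    from openai import OpenAI

    # llama-server speaks the OpenAI chat-completions protocol; the API key is ignored.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="local",  # llama-server serves whichever model it was started with
        messages=[{"role": "user",
                   "content": "Write a Python function that parses an ISO 8601 date."}],
    )
    print(resp.choices[0].message.content)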

I hope and expect that over the next several years, hardware capable of running more powerful models will become cheaper and more widely available. But for now, the practical applications of local models on hardware without a powerful GPU are fairly limited. If you really want to talk to an LLM that has a sophisticated understanding of the world, you're better off using Claude or Gemini or ChatGPT.