barbazoo 4 hours ago

> What that means is that when you're looking to build a fully local RAG setup, you'll need to substitute whatever SaaS providers you're using for a local option for each of those components.

Even starting with having "just" the documents and vector DB locally is a huge first step, and much more doable than going fully local with the LLM at the same time. I don't know anyone, or any org, that has the resources to run their own LLM at scale.
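
For that "docs + vector DB local, LLM still hosted" setup, something like this is roughly all it takes (untested sketch using chromadb, which bundles a small local embedding model by default; the document contents and names are just illustrative):

    # pip install chromadb
    import chromadb

    # Everything below stays on your own machine; only the final
    # LLM call (not shown) would leave the box.
    client = chromadb.PersistentClient(path="./rag_db")
    docs = client.get_or_create_collection("docs")

    # Index documents locally. Chroma embeds them with a bundled
    # local model by default, so no embedding API is needed.
    docs.add(
        ids=["doc-1", "doc-2"],
        documents=[
            "Our refund policy allows returns within 30 days.",
            "Support is available Monday to Friday, 9-17 CET.",
        ],
    )

    # Retrieve context for a question, then pass it to whatever
    # LLM you use (hosted for now, local later).
    hits = docs.query(query_texts=["When can customers return items?"], n_results=2)
    context = "\n".join(hits["documents"][0])
    print(context)  # feed this into your LLM prompt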

mips_avatar 3 hours ago | parent | next [-]

It’s also entirely viable to host your own vector DB. You just need a server with enough RAM for your HNSW index.
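
The RAM math is simple, too. A rough sketch with hnswlib (sizes are illustrative; the memory estimate is a ballpark):

    # pip install hnswlib numpy
    import hnswlib
    import numpy as np

    dim = 384           # e.g. all-MiniLM-L6-v2 embeddings
    num_vectors = 100_000

    # The HNSW index lives entirely in RAM: on the order of
    # num_vectors * (4 * dim + graph links) bytes, ~200 MB here.
    index = hnswlib.Index(space="cosine", dim=dim)
    index.init_index(max_elements=num_vectors, ef_construction=200, M=16)

    vectors = np.random.rand(num_vectors, dim).astype(np.float32)  # your embeddings
    index.add_items(vectors, np.arange(num_vectors))

    index.set_ef(64)  # recall/speed trade-off at query time
    labels, distances = index.knn_query(vectors[:1], k=5)
    print(labels, distances)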

procaryote 3 hours ago | parent | prev [-]

Aren't there a bunch of models that run OK on consumer hardware now?

lukan 17 minutes ago | parent [-]

Hopefully my new GPU will arrive tomorrow, then I can confirm it myself, but if you look around online, there are lots of private individuals running their own models. A 16 GB GPU starts at around 270 €, which lets you run something like a DeepSeek-R1 14B distill; 32 GB GPUs start at around 1200 €, and from there it goes further up in model quality and price. (Top models require something like 60-200 GB of GPU memory, I think.)
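
If you want to try before the GPU arrives: with something like Ollama it's a couple of lines (sketch assuming the Ollama daemon is running and you've pulled the model; the exact model tag and VRAM figure are from memory):

    # pip install ollama   (and first: ollama pull deepseek-r1:14b)
    import ollama

    # A 4-bit quantized 14B model needs roughly 9-10 GB of VRAM,
    # so a 16 GB card has headroom left for context.
    resp = ollama.chat(
        model="deepseek-r1:14b",
        messages=[{"role": "user", "content": "Summarize HNSW in two sentences."}],
    )
    print(resp["message"]["content"])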

So any medium-sized company could certainly afford to run its own LLMs, even at scale if it wants to make the investment. The question is how much it values its confidential data. (I would not trust any of the big AI companies.) And you usually don't need cutting-edge reasoning and coding abilities to process basic information.