PeterStuer 9 days ago

While your aims are undoubtedly sincere, in practice the 'local ai' crowd building their own rigs usually has 4TB or more of fast SSD storage.

The bottom tier (not meant disparagingly) are people running diffusion models, as these do not have the high VRAM requirements. They generate tons of images or video, going from a one-click install like EasyDiffusion to very sophisticated workflows in ComfyUI.

For those going the LLM route, which would be your target audience, the problem is that the hardware and software requirements and the expertise grow exponentially once you move beyond toying around with small, highly quantized models and small context windows.

In light of the typical enthusiast investments in this space, a few TB of fast storage will pale in comparison to the rest of the expenses.

Again, your work is absolutely valuable; it is just that the storage space requirement of the vector store is not your strongest card to play in this particular scenario.

imoverclocked 9 days ago | parent | next [-]

Everyone benefits from focusing on efficiency and finding better ways of doing things. Those people with 4TB+ of fast storage can now do more than they could before as can the "bottom tier."

It's a breath of fresh air anytime someone finds a way to do more with less rather than just waiting for things to get faster and cheaper.

PeterStuer 9 days ago | parent [-]

Of course. And I am not arguing against that at all. Just like if someone makes an inference runtime that is 4% faster, I'll take that win. But would it be the decisive factor in my choice? Only if that was my bottleneck, my true constraint.

All I tried to convey was that for most of the people in the presented scenario (personal emails etc.), a 50 or even 500GB storage requirement is not going to be the primary constraint. So the suggestion was that the marketing for this use case might be better off spotlighting something else as well.

ricardobeat 9 days ago | parent [-]

You are glossing over the fact that for RAG you need to search over those 500GB+, which would be painfully slow and CPU-intensive. The goal is fast retrieval to add data to the LLM context. Storage space is not the sole reason to minimize the DB size.

brookst 9 days ago | parent | next [-]

You’re not searching over 500GB, you’re searching an index of the vectors. That’s the magic of embeddings and vector databases.

Same way you might have a 50TB relational database, but "select id, name from people where country='uk' and name like 'benj%'" might only touch a few MB of storage at most.
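
A minimal sketch of that idea in Python, with made-up sizes and a brute-force numpy scan standing in for a real ANN index (HNSW and friends); a query only ever touches the compact embedding matrix, and the raw archive is read just for the top-k hits:

    import numpy as np

    # Illustrative numbers, not from the thread: pretend the 500GB archive
    # was chunked into 100k passages, each embedded once at build time.
    num_chunks, dim = 100_000, 384

    # The "index": a single float32 matrix (~150MB here). Queries touch
    # only this, never the raw archive.
    rng = np.random.default_rng(0)
    embeddings = rng.standard_normal((num_chunks, dim)).astype(np.float32)
    embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

    def top_k(query_vec, k=5):
        """Cosine similarity over the index only; returns chunk ids."""
        q = query_vec / np.linalg.norm(query_vec)
        scores = embeddings @ q          # one pass over the index
        return np.argsort(-scores)[:k]  # only these k chunks get read from disk

    ids = top_k(rng.standard_normal(dim).astype(np.float32))
    print(ids)  # feed just these passages to the LLM context

In practice an approximate-nearest-neighbour structure replaces the brute-force scan, so not even the index is read in full per query.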

ricardobeat 8 days ago | parent [-]

That’s precisely the point I tried to clear up in the previous comment.

The LEANN author proposes to create a 9GB index for a 500GB archive, and the other poster argued that it is not helpful because “storage is cheap”.
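
For a sense of why that 9GB figure matters, a back-of-envelope in Python (my chunk size and embedding dimensions are assumptions, not LEANN's actual parameters): storing one full-precision vector per fine-grained chunk can exceed the size of the archive itself, which is exactly what a pruned index avoids.

    # Back-of-envelope only; chunk size and dimensions are assumed.
    archive = 500 * 1024**3     # 500GB archive
    chunk = 1024                # ~1KB per chunk
    dim, f32 = 768, 4           # one 768-dim float32 embedding per chunk

    n_chunks = archive // chunk        # ~524 million chunks
    naive = n_chunks * dim * f32       # raw vectors alone
    print(naive / 1024**4)             # ~1.5 TB, larger than the archive itself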

brabel 9 days ago | parent | prev [-]

Speak for yourself! If it took me 500GB to store my vectors, on top of all my existing data, it would be a huge barrier for me.

hdgvhicv 9 days ago | parent | next [-]

A 4TB external drive is £100. A 1TB SD card or USB stick costs about the same.

Maybe I'm too old to appreciate what "fast" means, but storage doesn't seem an enormous cost once you stripe it.

mockingloris 9 days ago | parent [-]

This "...doesn't seem an enormous cost once you stripe it." gave me an idea. I KNOW that I will come back to link a blog post about it in the future.

xandrius 9 days ago | parent | prev [-]

Maybe time to update your storage?