Fripplebubby 3 days ago

The post is a clear example of when YAGNI backfires: you think YAGNI, but then you actually do need it. I had this experience, the author had this experience, and you might too. The things you think you AGN are actually pretty basic expectations, not luxuries: being able to write vectors in real time without having to run other processes out of band to keep recall from degrading over time, and being able to write a single query that combines normal SQL filter predicates with similarity for retrieval. These things matter, and you won't notice that they don't actually work at scale until later on!

simonw 3 days ago | parent | next [-]

That's not YAGNI backfiring.

The point of YAGNI is that you shouldn't over-engineer up front until you've proven that you need the added complexity.

If you need vector search against 100,000 vectors and you already have PostgreSQL then pgvector is a great YAGNI solution.

10 million vectors that are changing constantly? Do a bit more research into alternative solutions.

But don't go integrating a separate vector database for 100,000 vectors on the assumption that you'll need it later.

Fripplebubby 3 days ago | parent [-]

I think the tricky thing here is that the specific things I referred to (real-time writes, and pushing SQL predicates into your similarity search) work fine at small scale, in such a way that you might not notice they're going to stop working at scale. When you have 100,000 vectors, you can write these SQL predicates (return the top 5 hits where category = x and feature = y) and they'll work fine, up until one day they don't, because the vector space has gotten large. So I suppose it's fair to say this isn't YAGNI backfiring; this is me not recognizing the shape of the problem to come, and not recognizing that I do, in fact, need it. (To me that feels a lot like YAGNI backfiring, because I didn't think I needed it, but suddenly I do.)
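A toy sketch of the failure mode being described (pure Python, synthetic data; the "fetch extra nearest neighbors, then filter" strategy below stands in for what an approximate index effectively does when the predicate can't be pushed into the index):

```python
import random

random.seed(0)

DIM = 8
N = 1_000

# Synthetic corpus: only 1% of rows carry the category we filter on.
rows = [
    {
        "id": i,
        "category": "x" if i % 100 == 0 else "other",
        "vec": [random.random() for _ in range(DIM)],
    }
    for i in range(N)
]

def dist(a, b):
    # Squared Euclidean distance is enough for ranking.
    return sum((p - q) ** 2 for p, q in zip(a, b))

query = [random.random() for _ in range(DIM)]

# Exact filtered search: apply the predicate first, then rank.
# This is what the SQL reads like, and what a sequential scan does.
exact = sorted(
    (r for r in rows if r["category"] == "x"),
    key=lambda r: dist(r["vec"], query),
)[:5]

# Post-filtered approximate search: take the top-k nearest overall
# (roughly what an ANN index hands back), then apply the predicate.
candidates = sorted(rows, key=lambda r: dist(r["vec"], query))[:20]
post_filtered = [r for r in candidates if r["category"] == "x"][:5]

print(len(exact), len(post_filtered))
```

With a selective predicate, the exact scan returns a full set of 5 hits while the post-filtered candidate set comes up short, because most of the 20 nearest neighbors belong to other categories. At small scale a sequential scan hides this; once the table forces you onto an approximate index, the recall gap appears.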

morshu9001 3 days ago | parent | next [-]

If the consequence of being wrong about the scalability is that you just have to migrate later instead of sooner, that's a win for YAGNI. It's only a loss if hitting this limit later causes service disruption or makes the migration way harder than if you'd done it sooner.

simonw 3 days ago | parent [-]

And honestly, even then YAGNI might still win.

There's a big opportunity cost involved in optimizing prematurely. 9/10 times you're wasting your time, and you may have found product-market fit faster if you had spent that time trying out other feature ideas instead.

If you hit a point where you have to do a painful migration because your product is succeeding that's a point to be celebrated in my opinion. You might never have got there if you'd spent more time on optimistic scaling work and less time iterating towards the right set of features.

Fripplebubby 3 days ago | parent | next [-]

I think I see this point now. I thought of YAGNI as, "don't ever over-engineer because you get it wrong a lot of the time" but really, "don't over-engineer out of the gate and be thankful if you get a chance to come back and do it right later". That fits my case exactly, and that's what we did (and it wasn't actually that painful to migrate).

kevstev 3 days ago | parent | next [-]

At my last job I took over eng at a Series B startup, and my (non-technical) CEO was an ill-tempered type who pretty much wanted me to tell him that the entire tech stack was shit, that the previous architect/pseudo head of eng was shit, etc. And I was like no... some tradeoffs were made that make a ton of sense for an early-stage startup, and the great news is that you are still here and now have the revenue and customer base to start thinking in terms of building things for the next 3-5 years, even though some of the things are starting to break. And even better, nothing was so dire that it required stopping the world; we could continue to build and shore up some of the struggling things at the same time.

He seemed to really want me to blame everything on my predecessor and declare some kind of crisis, and seemed annoyed by my analysis, which was confusing at the time. But yeah, there are absolutely tradeoffs you make early in a startup's life; you just have to know where to take shortcuts and where to at least leave the architecture open to scaling. My biggest critique is that they were at least a year, if not two, past the point where they should have left ultra-scrappy startup mode that just throws things at the wall and started building with a longer view.

I have also seen a friend build out a flawless architecture ready to scale to millions of users, but he never got close to product-market fit. I felt he wasted at least 6 months building out all that infra scaffolding for nothing.

simonw 3 days ago | parent | prev [-]

Yeah, that's a great way of putting it.

morshu9001 3 days ago | parent | prev [-]

Yeah the "only if" is more like a "necessary, not sufficient." The future migration pain had better be extremely bad to worry about it so far in advance.

Or it should be a well defined problem. It's easier to determine the right solution after you've already encountered the problem, maybe in a past project. If you're unsure, just keep your options open.

simonw 3 days ago | parent [-]

A few years ago I coined the term PAGNI for "Probably Are Gonna Need It" to cover things that are worth putting in there from the start because they're relatively cheap to implement early but quite expensive to add later on: https://simonwillison.net/2021/Jul/1/pagnis/

hobofan 3 days ago | parent | prev [-]

> When you have 100,000 vectors [...] and they'll work fine

So 95% of use-cases.

Jnr 2 days ago | parent | next [-]

I think Immich (the Google Photos alternative) uses pgvector. And while you can't really call it a "production" system, because it is self-hosted, I have about 100,000 assets there and the vector search works great!

samus 2 days ago | parent | prev [-]

In that case you might not even really need optimized vector search though.

throwway120385 3 days ago | parent | prev | next [-]

Many of the concerns in the article could be addressed by standing up a separate PG database that's used exclusively for vector ops and then not using it for your relational data. Then your vector use cases get served from your vector DB and your relational use cases get served from your relational DB. Separating concerns like that doesn't solve the underlying concern but it limits the blast radius so you can operate in a degraded state instead of falling over completely.

SoftTalker 3 days ago | parent | next [-]

I've always tried to separate transactional databases from those supporting analytical queries if there's going to be any question that there might be contention. The latter often don't need to be real-time or even near-time.

samus 2 days ago | parent | prev | next [-]

That is a workaround and precisely the point the author makes. It increases operational complexity and creates a divide between records in the vector DB and the relational DB.

anentropic 2 days ago | parent | prev [-]

But if you do that, why use Postgres for the vector db?
