summarity 3 days ago

That's where it's at. I'm using the 1600D vectors from OpenAI models for findsight.ai, stored SuperBit-quantized. Even without fancy indexing, a full scan (1 query vector against 5M stored vectors) takes less than 40ms. And with basic binning, it's nearly instant.
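The comment doesn't include code, but the technique is easy to sketch: quantize each float embedding down to 1 bit per dimension, then brute-force scan with Hamming distance (XOR + popcount). This is a simplified stand-in — real SuperBit uses orthogonalized random projections before taking signs, and the dimensions/corpus size here are toy values, not findsight.ai's:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 1536                 # toy corpus size and embedding dim
docs = rng.normal(size=(n, d)).astype(np.float32)
query = rng.normal(size=d).astype(np.float32)

# Sign-quantize to 1 bit per dimension, packed 8 dims per byte.
def quantize(v):
    return np.packbits(v > 0, axis=-1)

packed_docs = quantize(docs)        # shape (n, d // 8), dtype uint8
packed_q = quantize(query)          # shape (d // 8,)

# Full scan: Hamming distance via XOR, then a per-byte popcount lookup table.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)
dists = POPCOUNT[np.bitwise_xor(packed_docs, packed_q)].sum(axis=1)

top10 = np.argsort(dists)[:10]      # indices of the 10 nearest quantized vectors
```

At 1 bit per dimension the 5M-vector corpus shrinks to under a gigabyte, which is why a linear scan stays in the tens of milliseconds: it's one XOR-and-popcount pass over memory-resident bytes.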

tacoooooooo 3 days ago | parent [-]

this is at the expense of precision/recall though isn't it?

summarity 3 days ago | parent | next [-]

With the quant size I'm using, recall is >95%.

pclmulqdq 3 days ago | parent | prev [-]

Approximate nearest neighbor searches don't cost precision. Just recall.
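To make that distinction concrete: an approximate search only ever returns genuine stored vectors, so the risk is missing true neighbors, not returning wrong ones. A toy recall@k measurement, using a deliberately crude "approximate" search (scanning a random half of the corpus) as a stand-in for a real index:

```python
import numpy as np

rng = np.random.default_rng(1)
docs = rng.normal(size=(50_000, 256)).astype(np.float32)
q = rng.normal(size=256).astype(np.float32)

k = 10
exact_top = set(np.argsort(-(docs @ q))[:k])        # ground-truth top-k neighbors

# Crude "approximate" search: exact scoring, but over a random half of the corpus.
subset = rng.choice(len(docs), size=len(docs) // 2, replace=False)
approx_top = set(subset[np.argsort(-(docs[subset] @ q))[:k]])

# Every approximate hit is a real stored vector (precision isn't the issue);
# recall@k measures how many of the true top-k the approximate search recovered.
recall = len(exact_top & approx_top) / k
```

This is the number vector-database benchmarks report as recall@k, and it's what the ">95%" figure above refers to.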