

	▲	alanwli 3 hours ago
		Out of curiosity, how is the 92% recall calculated? For a given query, is it measured against the true top-k across all 100B vectors, or is it the recall at each of the N shards, measured against that shard's own top-k?
	▲	nvanbenschoten 2 hours ago
		(author here) The 92% mentioned in this post is recall@10 across all 100B vectors, calculated by comparing against the global top_k. turbopuffer will also continuously monitor production recall at the per-shard level (or on demand with https://turbopuffer.com/docs/recall). Perhaps counterintuitively, the global recall will actually be better than the per-shard recall if each shard is asked for its own local top_k!
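One way to see how global recall can come out ahead of per-shard recall is a toy simulation. The sketch below is not turbopuffer's implementation; it assumes a hypothetical approximate search whose misses fall on the farthest-ranked items of each shard's local top_k (the `miss_tail` parameter), and uses random 1-D "distances" in place of real vectors. All names here (`simulate`, `recall`, `miss_tail`) are made up for illustration.

```python
import random

def recall(found, truth):
    """Fraction of the true ids that also appear in the returned ids."""
    return len(set(found) & set(truth)) / len(truth)

def simulate(num_shards=8, per_shard=1000, k=10, miss_tail=2, seed=0):
    rng = random.Random(seed)
    # Hypothetical data: each vector id is (shard, index) and maps to a
    # random distance from the query.
    shards = [
        {(s, i): rng.random() for i in range(per_shard)}
        for s in range(num_shards)
    ]

    candidates = {}
    local_recalls = []
    for dist in shards:
        ranked = sorted(dist, key=dist.get)
        exact_local = ranked[:k]
        # Assumed error model: the search misses the `miss_tail` farthest
        # items of the true local top_k and returns the next-best items.
        approx_local = ranked[:k - miss_tail] + ranked[k:k + miss_tail]
        local_recalls.append(recall(approx_local, exact_local))
        candidates.update({vid: dist[vid] for vid in approx_local})

    # Ground truth: exact top_k over every vector in every shard.
    all_dist = {vid: d for dist in shards for vid, d in dist.items()}
    exact_global = sorted(all_dist, key=all_dist.get)[:k]

    # Merge step: keep the k closest of the per-shard results.
    merged = sorted(candidates, key=candidates.get)[:k]

    per_shard_recall = sum(local_recalls) / len(local_recalls)
    return per_shard_recall, recall(merged, exact_global)

per_shard_recall, global_recall = simulate()
print(f"per-shard recall@10: {per_shard_recall:.2f}")
print(f"global recall@10:    {global_recall:.2f}")
```

Under this error model every shard scores (k - miss_tail) / k against its own top_k, while the global top_k is spread across shards and so mostly draws from the head of each shard's ranking, which the search gets right. If misses were instead spread evenly over ranks, the two numbers would be expected to match, so the gap depends on that assumption.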