Remix clone Hacker News


	▲	navar 6 hours ago
		For the retrieval stage, we have developed a highly efficient, CPU-only-friendly text embedding model: https://huggingface.co/MongoDB/mdbr-leaf-ir
		It ranks #1 on a bunch of leaderboards for models of its size. It can be used interchangeably with the model it was distilled from (https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1...).
		You can see an example comparing semantic (i.e., embeddings-based) search vs BM25 vs hybrid here: http://search-sensei.s3-website-us-east-1.amazonaws.com (Warning: it downloads ~50MB of data for the model weights and ONNX Runtime on first load, but should otherwise run smoothly, even on a phone.)
		This mini app illustrates the advantage of semantic search over BM25. For instance, embedding models "know" that "j lo" refers to Jennifer Lopez.
		We have also published the recipe for training this type of model, if you are interested in doing so. We show that it can be done on relatively modest hardware, and that training data is very easy to obtain: https://arxiv.org/abs/2509.12539
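		To make the "j lo" point concrete, here is a rough, self-contained sketch of the three modes. It is not the demo's code: the documents and the 3-d vectors are hand-written, hypothetical stand-ins for real model embeddings, and the hybrid step uses reciprocal rank fusion, which is just one common way to combine rankings and an assumption on my part.

```python
import math
from collections import Counter

DOCS = [
    "jennifer lopez announces new album",
    "j crew opens a new store",
    "lo fi beats to study to",
]
QUERY = "j lo"

# Hypothetical toy embeddings, written by hand for illustration only.
# A real setup would get these from an embedding model.
DOC_VECS = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]
QUERY_VEC = [0.8, 0.2, 0.1]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Plain BM25 over whitespace tokens."""
    tokenized = [d.split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.split():
            if term not in tf:
                continue  # no lexical overlap, no contribution
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def ranking(scores):
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over the input rankings."""
    fused = {}
    for r in rankings:
        for pos, doc_id in enumerate(r, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + pos)
    return sorted(fused, key=fused.get, reverse=True)


bm25 = bm25_scores(QUERY, DOCS)
semantic = [cosine(QUERY_VEC, v) for v in DOC_VECS]
hybrid = rrf([ranking(bm25), ranking(semantic)])

print("bm25    :", ranking(bm25))
print("semantic:", ranking(semantic))
print("hybrid  :", hybrid)
```

		BM25 gives the Jennifer Lopez document a score of zero because it shares no token with "j lo", while the toy vectors put it first. With real embeddings the gap would depend on the model.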
	▲	jasonjmcghee an hour ago
		How does performance (embedding speed and recall) compare to minish / model2vec static word embeddings?