ryneandal 5 days ago

I haven't exhaustively tested every top-performing model on the HF Embedding Leaderboard (https://huggingface.co/spaces/mteb/leaderboard), but I have tested a number of them extensively over the past month or two. The two best models from API providers that I've tested are:

- JinaAI (https://jina.ai/embeddings/) v3 and v4 performed well in my testing.
- Google's Gemini-001 model (https://ai.google.dev/gemini-api/docs/models#gemini-embeddin...).

Overall, both were surpassed by Qwen3-Embedding-8B (https://huggingface.co/Qwen/Qwen3-Embedding-8B).

Note: this applies specifically to English and code embedding generation and retrieval, with reranking.