Remix clone Hacker News

	deepsquirrelnet a day ago
		I’m finishing up a language identification model that runs on CPU: 70k texts/s on a single thread, a 13 MB model artifact, and 148 supported languages (though only ~100 have good accuracy). The model is trained as static embeddings derived from the Gemma 3 token embeddings. https://github.com/dleemiller/WordLlamaDetect
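
		For readers unfamiliar with the static-embedding idea, here is a rough toy sketch of why this kind of model can be so fast on CPU: each token maps to a fixed row of per-language scores, and classifying a text is just a table lookup plus an average. This is not the WordLlamaDetect code or API. The table values, the whitespace tokenizer, and the names `LANGS`, `TOKEN_LOGITS`, and `detect` are all made up for illustration; the real project presumably uses the Gemma 3 tokenizer and a learned table.

```python
# Toy illustration of static-embedding language ID (hypothetical, not WordLlamaDetect).
# Each known token has a fixed row of per-language scores; no neural network
# is evaluated at inference time, only lookups and an average.

LANGS = ["en", "fr", "de"]

# Hypothetical hand-written table: token -> one score per language in LANGS.
# A real model would learn this table and key it by tokenizer IDs.
TOKEN_LOGITS = {
    "the": [2.0, -1.0, -1.0],
    "and": [2.0, -1.0, -1.0],
    "le": [-1.0, 2.0, -1.0],
    "et": [-0.5, 2.0, -1.0],
    "der": [-1.0, -1.0, 2.0],
    "und": [-1.0, -1.0, 2.0],
}


def detect(text):
    """Return the best-scoring language code, or None if no token is known."""
    totals = [0.0] * len(LANGS)
    hits = 0
    # Whitespace splitting stands in for a real subword tokenizer (assumption).
    for tok in text.lower().split():
        row = TOKEN_LOGITS.get(tok)
        if row is None:
            continue  # unknown tokens contribute nothing
        hits += 1
        for i, score in enumerate(row):
            totals[i] += score
    if hits == 0:
        return None
    # Mean-pool the per-token scores, then take the argmax over languages.
    means = [t / hits for t in totals]
    best = max(range(len(LANGS)), key=means.__getitem__)
    return LANGS[best]


if __name__ == "__main__":
    for sample in ["the cat and the dog", "le chat et le chien", "der Hund und die Katze"]:
        print(sample, "->", detect(sample))
```

		Since the per-text work is a handful of lookups and additions, throughput is mostly bounded by tokenization and memory access, which is consistent with the single-thread numbers quoted above.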