Remix clone Hacker News

	▲	jeffhuys 5 hours ago
		Check chatjimmy.ai
	▲	lelandbatey 3 hours ago
		https://chatjimmy.ai is a demo of the "burn the model to an ASIC" approach sold by Taalas[0], which they use to run Llama 3.1 8B at ~17,000 tokens per second. [0] - https://taalas.com/products/