Remix clone Hacker News

	▲	dakolli 7 hours ago
		Try it here. I hate LLMs, but this is crazy fast: https://chatjimmy.ai/
	▲	bmacho 7 hours ago
		`"447 / 6144 tokens" "Generated in 0.026s • 15,718 tok/s"` This is crazy fast. I had always expected this kind of speed to be ~2 years away, but it's here now.
	▲	Lalabadie 7 hours ago
		The full answer pops in within milliseconds. It's impressive, and it feels like a completely different technology just by forgoing the need to stream the output.
	▲	machiaweliczny 5 hours ago
		We need that for this Chinese 3B model that thinks for 45s on hello world but also solves math.
	▲	FergusArgyll 6 hours ago
		Because most models today generate fairly slowly, they give the impression of someone typing on the other end. This is just <enter> -> wall of text. Wild.