PhunkyPhil 2 hours ago

Obligatory taalas mention:

https://taalas.com/

Despite the performative UI components, they have a shipped (demo) product:

https://chatjimmy.ai/

This is only Llama 3.1 8B with a very small context window, but at 17k tokens per second it's likely enough to reliably call tools, which would make a huge difference in agentic applications. Assuming they can bake in better models, I'm just as bullish or even more so on this, considering it opens up edge computing given the extremely low power requirements.

High tok/s is the future IMO.

kilroy123 31 minutes ago

My dream is Claude or Codex running at this speed.