MagicMoonlight 5 hours ago

There was a startup posted here which built custom hardware that let the AI respond instantly. Thousands of tokens per second.

tln 4 hours ago | parent | next [-]

Taalas. A sibling comment of yours posted the chat demo URL -

https://chatjimmy.ai/

2ndorderthought 3 hours ago | parent [-]

Woah. How is this working? It's stupid fast.

Grosvenor 4 hours ago | parent | prev | next [-]

Cerebras

They built a wafer-scale ASIC: the entire wafer is one huge active chip. It takes a lot of cool engineering and cooling to make it work, and is very cool.

zargon 5 hours ago | parent | prev [-]

Groq.

beavisringdin 4 hours ago | parent [-]

No, it was a custom ASIC with the weights baked in for a single model. I do envision a future where we return to cartridges: local AI is the default, and massively optimised chips are built to be plug-and-play, each running a single SoTA model.

SJMG 4 hours ago | parent | next [-]

Likely https://taalas.com

2 hours ago | parent | prev [-]
[deleted]