Remix.run Logo
mike_hearn an hour ago

The blog answers all those questions. It says they're working on fabbing a reasoning model this summer. It also says how long they think they need to fab new models, and that the chips support LoRAs and tweaking context window size.

I don't get these posts about ChatJimmy's intelligence. It's a heavily quantized Llama 3, using a custom quantization scheme because that was state of the art when they started. They claim they can update quickly (so I wonder why they didn't wait a few more months tbh and fab a newer model). Llama 3 wasn't very smart but so what, a lot of LLM use cases don't need smart, they need fast and cheap.

Also apparently they can run DeepSeek R1 also, and they have benchmarks for that. New models only require a couple of new masks so they're flexible.