| ▲ | rvz 4 hours ago | |
Fast, but stupid.
The question is not how fast it is. The real questions are:
(This also assumes that diffusion LLMs will get faster) | ||
| ▲ | mike_hearn 15 minutes ago | parent | next [-] | |
The blog answers all those questions. It says they're working on fabbing a reasoning model this summer, how long they think they need to fab new models, and that the chips support LoRAs and tweaking the context window size. I don't get these posts about ChatJimmy's intelligence. It's a heavily quantized Llama 3, using a custom quantization scheme because that was state of the art when they started. They claim they can update quickly (so I do wonder why they didn't just wait a few more months and fab a newer model, tbh). Llama 3 wasn't very smart, but so what? A lot of LLM use cases don't need smart; they need fast and cheap. | ||
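A minimal sketch of the LoRA idea mentioned above, assuming nothing about ChatJimmy's actual hardware or update scheme; the names, shapes, and scaling here are purely illustrative. It shows why a low-rank adapter lets you adjust a weight matrix that is otherwise frozen, e.g. baked into silicon: only two small matrices ever change.

    # Illustrative LoRA sketch (not ChatJimmy's actual scheme).
    # The big matmul against W_base could run on fixed hardware;
    # only the small A and B adapters need to live off-chip.
    import numpy as np

    d, r = 4096, 8                                        # hidden size, LoRA rank (made-up values)
    W_base = np.random.randn(d, d).astype(np.float32)     # frozen "fabbed" weights
    A = (np.random.randn(r, d) * 0.01).astype(np.float32) # small trainable adapter
    B = np.zeros((d, r), dtype=np.float32)                # starts at zero: no change to the model

    def forward(x):
        # Effective weight is W_base + B @ A, but it is never materialized:
        # the base matmul stays fixed, the low-rank correction is added on top.
        return x @ W_base.T + (x @ A.T) @ B.T

    x = np.random.randn(1, d).astype(np.float32)
    print(forward(x).shape)  # (1, 4096)

Swapping in a different A and B changes the model's behaviour without touching W_base, which is the sense in which a hard-wired model could still be "updated".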
| ▲ | simlevesque an hour ago | parent | prev | next [-] | |
LLMs can't count. They need tool use to answer these questions accurately. | ||
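A minimal sketch of what "tool use" means here, with a made-up tool name and no particular framework assumed: the model delegates the counting to a deterministic function instead of guessing token by token.

    def count_occurrences(text: str, needle: str) -> int:
        """Deterministic counter the model can delegate to."""
        return text.lower().count(needle.lower())

    # The model would emit a structured call such as
    # {"tool": "count_occurrences", "args": {"text": "strawberry", "needle": "r"}}
    # and the runtime feeds the result back into the conversation.
    print(count_occurrences("strawberry", "r"))  # 3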