nl 2 hours ago
> Speculation is that the frontier models are all below 200B parameters

Some versions of some of the models are around that size, which you might hit with, for example, the ChatGPT auto-router. But the frontier models are all over 1T parameters. Source: interviews with people who have left one of the big three labs, now work at the Chinese labs, and talk about how to train 1T+ parameter models.