imtringued 6 days ago
There is also an absolutely massive gap between Llama 2 and Llama 3. The Llama 3.1 models represent the beginning of usable open-weight models. Meanwhile, Llama 4 and its competitors seem to be incremental improvements. Yes, the newest models are so much better that they obsolete the old ones, but the biggest differences between models now come down to what they know (parameter count and dataset quality) and how much they spend thinking (compute budget).