christianqchung 6 days ago
But Llama 4 Scout does badly on long-context benchmarks despite claiming a 10M-token context window. It ranks just one slot above Llama 3.1 8B in this one[1].
omneity 6 days ago
Indeed, but that doesn't change the fact that long context is not trained on long content; it comes from scaling short content instead.