Indeed, but that does not change the fact that long context is not trained on long content; it is obtained by scaling short content instead.