simianwords 3 days ago
How does "supporting 1M tokens" really work in practice? Is it a new model? Or did they just remove some hard-coded constraint?
eldenring 3 days ago
Serving a model efficiently at 1M context is difficult, and it can be much more expensive and numerically tricky than serving shorter contexts. I'm guessing they were working on serving it properly, since it's the same "model" in scores and such.
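To put a rough number on "much more expensive": one likely cost driver is the per-request key/value cache, which grows linearly with context length in a standard transformer. This is only a back-of-envelope sketch. The layer count, KV head count, and head size below are made-up placeholders, not the dimensions of any real model, and it assumes plain 16-bit caching with no compression or cache-sharing tricks.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size for one request in a standard transformer.

    The leading 2 counts one key tensor plus one value tensor per layer.
    bytes_per_value=2 assumes 16-bit storage (an assumption, not a given).
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Hypothetical model dimensions, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

for tokens in (128_000, 1_000_000):
    gib = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM) / 2**30
    print(f"{tokens:>9,} tokens: {gib:.1f} GiB of KV cache per request")
```

Under those assumptions a single 1M-token request needs several times the cache of a 128k one, which would have to be spread across accelerators before attention compute is even considered.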