| ▲ | ZeroCool2u 5 hours ago | |||||||
FrontierMath, GPQA Diamond, and BrowseComp are the benchmarks I noticed this on. | ||||||||
| ▲ | csnweb 5 hours ago | parent [-] | |||||||
Are you maybe comparing the pro model to the non-pro model with thinking? Granted, it’s a bit confusing, but the pro model is 10 times more expensive and probably much larger as well. | ||||||||