minimaxir | a day ago
Context window is unchanged for Sonnet (200k in / 64k out): https://docs.anthropic.com/en/docs/about-claude/models/overv... In practice, the 1M context of Gemini 2.5 isn't that much of a differentiator, because larger contexts have diminishing returns: adherence to later tokens degrades as the window fills.
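Limits like those above can be enforced client-side before a request is sent. A minimal sketch, assuming the Sonnet figures quoted in this comment; `estimate_tokens` is a crude chars-per-token heuristic, not a real tokenizer, and `MODEL_LIMITS` is an illustrative dict, not any official API:

```python
# Illustrative client-side guard against exceeding a model's context limits.
# Limits taken from the comment above (Sonnet: 200k input / 64k output).
MODEL_LIMITS = {
    "claude-sonnet": {"max_input_tokens": 200_000, "max_output_tokens": 64_000},
}

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_context(model: str, prompt: str, max_output_tokens: int) -> bool:
    """True if the prompt and requested output fit within the model's limits."""
    limits = MODEL_LIMITS[model]
    return (estimate_tokens(prompt) <= limits["max_input_tokens"]
            and max_output_tokens <= limits["max_output_tokens"])

print(fits_context("claude-sonnet", "hello " * 1000, 4_096))   # True: ~1.5k tokens
print(fits_context("claude-sonnet", "x" * 1_000_000, 4_096))   # False: ~250k tokens
```

In practice you'd swap the heuristic for the provider's tokenizer, but a cheap pre-check like this avoids paying for requests that are guaranteed to be rejected or truncated.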
Rudybega | a day ago
I'm going to have to heavily disagree. Gemini 2.5 Pro has super impressive performance on large-context problems. I routinely drive it up to 400-500k tokens in my coding agent, and it's the only model where that much context produces even remotely useful results. It also crushes most of the long-context benchmarks: on MRCR (multi-round coreference resolution) at 1M tokens, I believe it beats pretty much every other model's performance at 128k (o3 may have changed this).
zamadatix | a day ago
The amount of degradation at a given context length isn't constant across models, though, so a model with 5x the context can be either completely useless or still better, depending on the strength of the models you're comparing. Gemini actually does really well in both regards (context length and quality at length), but I'm not sure what a hard-numbers comparison against the latest Claude models would look like. A good deep dive on the context-scaling topic in general: https://youtu.be/NHMJ9mqKeMQ
Workaccount2 | a day ago
Gemini's real strength is that it can stay on the ball even at 100k tokens in context.
michaelbrave | a day ago
I've had a lot of fun using Gemini's large context. I'll scrape a Reddit discussion with 7k responses and have Gemini synthesize and categorize it, and by the time it's done and I've had a few back-and-forths with it, I've gotten half of a book written. That said, I have noticed that if I give it additional threads to compare and contrast, once it hits around 300-500k tokens it starts to hallucinate more and forget things.
ashirviskas | a day ago
It's closer to <30k tokens before performance degrades too much for 3.5/3.7, so the 200k/64k limit is meaningless in this context.
strangescript | a day ago
Yeah, but why aren't they attacking that problem? Is it just impossible? It would be a really simple win with regard to coding. I'm a huge enthusiast, but I'm starting to feel we've hit a peak.
VeejayRampay | a day ago
That is just not correct; it's a big differentiator.