bambax a day ago

> Each model appears to emphasize a different balance between reasoning and execution. Rather than seeking one “best” system, developers are assembling model alloys—ensembles that select the cognitive style best suited to a task.

This (as well as the table above it) matches my experience. Sonnet 4.0 answers SO-type questions very fast and mostly accurately (as long as the topic isn't niche). Sonnet 4.5 is a little more clever but can err on the side of complexity for complexity's sake, and can have a hard time getting out of a hole it dug for itself.

ChatGPT 5 is excellent at finding sources on the web; Gemini simply makes stuff up and continues to do so even when told to verify; ChatGPT provides links that work and are generally relevant.