giancarlostoro 13 hours ago

They're also not focused exclusively on building an LLM; they have video and image generation too. Anthropic has a single focus, and this is why they are usually at the very top of the SWE benchmarks.

phillipcarter 13 hours ago | parent | next [-]

Isn't it the case that OpenAI and Anthropic regularly just swap for whoever is at the top of the latest benchmarks? They're also so close in scores that it's effectively a wash anyways.

What OP is referring to is Anthropic aligning with corporate terms and conditions early, positioning themselves to be effectively resold by AWS rather than requiring orgs to procure them directly. This is huge in the enterprise world because the processes to get broad approval are generally far smaller and shorter for "just another AWS service" compared to a whole new vendor.

djtriptych 11 hours ago | parent | next [-]

OpenAI did the same thing with Microsoft/Azure though.

Grimblewald 11 hours ago | parent | prev [-]

Isn't it an open secret that benchmarks are largely irrelevant at this point? Why else do we all have a personalized test battery for new models? That said, I've stopped testing ChatGPT entirely. It's still ok, but it is beaten by local models and it gets thrashed by non-OAI frontier providers. I get the history, but holding up OAI outputs as equivalent is like comparing Yahoo to Google after Yahoo's collapse in search.

OAI language models are largely irrelevant at this point imo.

epistasis 13 hours ago | parent | prev | next [-]

IMHO the benchmarks aren't useful, and ranking among the frontier models is mostly noise. The extra features around the coding agent have a much bigger impact on productivity than having to provide slightly more specification and guidance to the models; a 90% success rate versus a 92% success rate on the tasks I ask it to do is far more influenced by what I say than what the model is capable of.

DrewADesign 11 hours ago | parent | prev | next [-]

Didn’t they say Sora will only be used to internally create training data? Integrated image generation seems more in the neat feature category than some fundamental advantage, but maybe someone has use cases I haven’t considered.

hn_throwaway_99 11 hours ago | parent | prev [-]

OpenAI is killing Sora though, so it looks like they are taking a page from Anthropic's playbook of focusing on enterprise use cases, having seen that it's more profitable.

dannyw an hour ago | parent [-]

But then they released gpt-image-2, which is clearly SOTA.