throwaw12 7 hours ago

We are getting there, as a next step please release something to outperform Opus 4.5 and GPT 5.2 in coding tasks

gordonhart 7 hours ago | parent | next [-]

By the time that happens, Opus 5 and GPT-5.5 will be out. At that point will a GPT-5.2 tier open-weights model feel "good enough"? Based on my experience with frontier models, once you get a taste of the latest and greatest it's very hard to go back to a less capable model, even if that less capable model would have been SOTA 9 months ago.

cirrusfan 7 hours ago | parent | next [-]

I think it depends on what you use it for. Coding, where time is money? You probably want the Good Shit, but you also want decent open-weights models to keep prices sane rather than sama's $20k/month nonsense. Something like basic sentiment analysis? You can get good results out of a 30B MoE that runs at a decent pace on a midrange laptop. Researching things online across many sources with decent results I'd expect to be doable locally by the end of 2026 if you have 128GB of RAM, although queries will take a while to complete.

bwestergard 7 hours ago | parent [-]

What does it mean for U.S. AI firms if the new equilibrium is devs running open models on local hardware?

selectodude 7 hours ago | parent [-]

OpenAI isn’t cornering the market on DRAM for kicks…

yorwba 7 hours ago | parent | prev | next [-]

When Alibaba succeeds at producing a GPT-5.2-equivalent model, they won't be releasing the weights. They'll only offer API access, like for the previous models in the Qwen Max series.

Don't forget that they want to make money in the end. They release small models for free because the publicity is worth more than they could charge for them, but they won't just give away models that are good enough that people would pay significant amounts of money to use them.

tosh 7 hours ago | parent | prev | next [-]

It feels like the gap between open weight and closed weight models is closing though.

theshrike79 7 hours ago | parent [-]

More like open local models are becoming "good enough".

I got stuff done with Sonnet 3.7 just fine; it needed a bunch of babysitting, but it was still a net positive for productivity. Now local models are at that level and closing in on the current SOTA.

When "anyone" can run an Opus 4.5 level model at home, we're going to be getting diminishing returns from closed online-only models.

cyanydeez 3 hours ago | parent [-]

See, the market is investing like _that will never happen_.

theshrike79 3 hours ago | parent [-]

I'm just riding the VC powered wave of way-too-cheap online AI services and building tools and scaffolding to prepare for the eventual switch to local models =)

thepasch 6 hours ago | parent | prev | next [-]

If an open-weights model is released that's as capable at coding as Opus 4.5, there's very little reason not to offload the actual writing of code to open-weight subagents running locally and stick strictly to planning with Opus 5. That could get you masses more usage out of your plan (or cut down on API costs).

rglullis 7 hours ago | parent | prev | next [-]

I'm going in the opposite direction: with each new model, I try harder to optimize my existing workflows by breaking tasks down so I can delegate them to less powerful models, and only fall back on the newer ones when the results aren't acceptable.
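The delegate-then-escalate workflow described above can be sketched as a simple cascade: try the cheapest model first and only escalate when a quality check rejects the result. This is a hypothetical illustration; the "models" here are stand-in functions, and in practice each would wrap a local open-weights model or a frontier API:

```python
from typing import Callable

def cascade(task: str,
            models: list[Callable[[str], str]],
            acceptable: Callable[[str], bool]) -> str:
    """Try models cheapest-first; escalate until a result passes the check."""
    result = ""
    for model in models:
        result = model(task)
        if acceptable(result):
            return result
    return result  # last (most capable) model's answer, even if imperfect

# Stand-in "models" for illustration only.
def small_local_model(task: str) -> str:
    return "TODO" if "hard" in task else f"done: {task}"

def frontier_model(task: str) -> str:
    return f"done: {task}"

ok = lambda r: r.startswith("done")
print(cascade("easy refactor", [small_local_model, frontier_model], ok))
# -> done: easy refactor  (the frontier model is never called)
print(cascade("hard proof", [small_local_model, frontier_model], ok))
# -> done: hard proof  (local model failed the check, so we escalated)
```

The quality check is the hard part in real workflows; it could be a test suite, a linter, or a second model acting as a judge.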

rubslopes 4 hours ago | parent | prev | next [-]

I used to say that Sonnet 4.5 was all I would ever need, but now I exclusively use Opus...

littlestymaar 4 hours ago | parent | prev [-]

> Based on my experience with frontier models, once you get a taste of the latest and greatest it's very hard to go back to a less capable model, even if that less capable model would have been SOTA 9 months ago.

That's the tyranny of comfort. Same for high-end cars, living in a big place, etc.

There's a good workaround though: just don't try the luxury in the first place, so you can stay happy with the 9-month delay.

Keyframe 6 hours ago | parent | prev | next [-]

I'd be happy with something that's close to (or the same as) Opus 4.5 that I can run locally, at a reasonable speed (comparable to the claude CLI), and on a reasonable budget ($10-30k).

segmondy 6 hours ago | parent | prev | next [-]

Try Kimi K2.5 and DeepSeek-V3.2-Speciale

IhateAI 6 hours ago | parent | prev [-]

Just code it yourself, you might surprise yourself :)