dnnddidiej 2 days ago

Not really. SOTA vs non SOTA is "can I get my coding work actually done today" vs. "this can do customer support chat"

It is like car vs. kick scooter.

regularfry 2 days ago | parent | next [-]

It really isn't. We get coding work actually done today on Opus 4.5. That's not SOTA any more, and anything proximate to that level, even quite loosely, is genuinely useful.

dnnddidiej 2 days ago | parent [-]

OK, so we are in "Opus 4.5 is not SOTA" territory. Right, by that definition... yes, you are right.

randomgermanguy 2 days ago | parent [-]

I mean, it's almost half a year; I think that counts?

dnnddidiej a day ago | parent [-]

Time-wise, you are correct.

randomgermanguy 2 days ago | parent | prev [-]

> "can I get my coding work actually done today" vs. "this can do customer support chat"

I think you need to define "can get coding work done" for this to make sense. I was using GPT-3 back then for basic scripts; does that count? Or only Claude Code?

I also think this is a false dichotomy. If you look at Project Vend or Vending-Bench, customer support etc. is by no means trivial. (Old but great story: https://www.businessinsider.com/car-dealership-chevrolet-cha...)

UlisesAC4 2 days ago | parent [-]

This. I have been writing my side-hustle code with open code and 3.2 reasoner, and it is way better than what I have at my day job with Copilot and whatever models are there.

wahnfrieden a day ago | parent | next [-]

Copilot is a bad harness that perverts the productivity of models like GPT 5.5.

dnnddidiej a day ago | parent | prev [-]

Tell me more please!