poolnoodle 2 hours ago

These paid offerings geared toward software development must be a hell of a lot "smarter" than the regular chatbots. The amount of nonsense and bad or outright wrong code Gemini and ChatGPT throw at me lately is off the charts. I feel like they are getting dumber.

ghosty141 an hour ago

Yes, they are: the fact that the agents have full access to your local project files makes a gigantic difference.

They do *very* well at things like: "Explain what this class does" or "Find the biggest pain points of the project architecture".

There's no comparison to regular ChatGPT when it comes to software development. I suggest trying it out, and not by saying "implement game", but by giving it clearly scoped tasks where the AI doesn't have to think or abstract/generalize. Treat it as some kind of code monkey.

zitterbewegung 2 hours ago

I don’t understand why we are getting these software products that want vendor lock-in when the underlying system isn’t being improved. I prefer Claude Code right now because it’s a better product. Gemini has a weird context window that poisons the rest of the generated code (when online). Comparing ChatGPT Codex to Claude, I feel Claude is the better product, but I don’t use enough tokens to justify Claude Pro at $100, so I just keep a regular ChatGPT subscription for productivity tasks.

nkohari an hour ago

> I don’t understand why we are getting these software products that want to have vendor lock in when the underlying system isn’t being improved.

I think it's clear now that the pace of model improvements is asymptotic (or has at least reached a local maximum), and the model itself provides no moat. (Every few weeks last year, the perception of "the best model" changed, based on basically nothing other than random vibes and hearsay.)

As a result, the labs are starting to focus on vertical integration (that is, building out the product stack) to deepen their moat.

anematode an hour ago

> I think it's clear now that the pace of model improvements is asymptotic

As much as I wish it were, I don't think this is clear at all... it's only been a couple of months since Opus 4.5, after all, which many developers say was a major improvement over previous models.

nkohari an hour ago

Like I said, lots of vibes and hearsay! :)

The models are definitely continuing to improve; it's more a question of whether we're reaching diminishing returns. It might make sense to spend $X billion to train a new model that's 100% better, but it makes much less sense to spend $X0 billion to train a new model that's 10% better. (Numbers all made up, obviously.)

mceachen 2 hours ago

It’s the inconsistency that gets me. Very similar tasks, similar complexity, same codebase, same prompting:

Session A knocks it out of the park. Chef’s kiss.

Session B just does some random vandalism.