qsort 8 hours ago

A couple of alternative scenarios, although I'm not sure how much stock we should put in them:

- what if at a certain level of capability you're essentially bug-free? I'm somewhat skeptical that this could be the case in a strong sense, because even if you formally prove certain properties, security often crucially depends on the threat model (e.g. side-channel attacks, constant-time execution, etc.), but maybe it becomes less of a problem in practice?

- what if past a certain capability threshold weaker models can substitute for stronger ones if you're willing to burn tokens? To take an example from coding, GPT-3 couldn't code at all, so I'd rather have X tokens with, say, GPT 5.4 than 100X tokens with GPT-3. But would I rather have X tokens with GPT 5.4 or 100X tokens with GPT 5.2? That's a bit murkier, and I could see there being some kind of indifference curve.

Leomuck 8 hours ago

Honestly, if every software project ran an AI-based security check over its code, the software world would probably be more secure. Of course, there are lots of projects that don't need that because they have skilled people behind them, but we've seen many popular software projects (even from big companies) that didn't care at all. So even a basic model would find issues.

Also, I find myself thinking more and more that the ability to pay for tokens is becoming crucial. And it's unfair. If you don't have money, you don't have access. In a way, it's a worsening of class conflict, if you know what I mean.

serial_dev 8 hours ago

Not only that: even if you are willing to pay, the best model providers could decide any day that they want to save on cost, so they nerf the responses. Then your shipping on time is at the mercy of these companies.

If you spend months shipping slop because “models will get better and tomorrow’s models can fix today’s slop”, what happens when they not only do not get better but actually get worse, and you are left with a bunch of slop you don’t understand and your problem-solving muscles have gotten weak?

Leomuck 6 hours ago

Good point indeed! I've been feeling that Claude Code has gotten worse for a while now, and I've read many articles on it; overall it's probably due to cost saving. But if you set things up to depend on it, that becomes a huge issue.

nine_k 7 hours ago

> essentially bug-free

I would say that most software is going to have few easily exploitable bugs. The presence of such bugs will immediately cost more than having them discovered and fixed.

Other bugs, those that do not lead to easy pwning of a system, circumventing billing, etc., may linger as much as they currently do.