altern8 6 hours ago

Every time I read these titles, I wonder if people are for some reason pushing the narrative that Claude is way smarter than it really is, or if I'm using it wrong.

They want me to code AI-first, and the number of hallucinations, weird bugs, and inconsistencies that Claude produces is massive.

Lots of code that it pushes would NOT have passed a human/human code review 6 months ago.

themafia 5 hours ago

It's always the inconsistencies that amaze me. From the article:

> I have so many bugs in the Linux kernel that I can’t report because I haven’t validated them yet

You have "so many?" Are they uncountable for some reason? You "haven't validated" them? How long does that take?

> found a total of five Linux vulnerabilities

And how much did it cost you in compute time to find those 5?

These articles are always fantastically light on the details which would make their case for them. Instead it's always breathless prognostication. I'm deeply suspicious of this.

pixl97 5 hours ago

> And how much did it cost you in compute time to find those 5?

This is the last thing I'd worry about if the bug is serious in any way. You have attackers like nation states that will have huge budgets to rip your software apart with AI and exploit your users.

Also there have been a number of detailed articles about AI security findings recently.

spzb 5 hours ago

I'd be interested in how it compares (in terms of time, money and false positives) with fuzzing.

xvector 4 hours ago

You are suspicious because you probably haven't worked anywhere that's AI-first. Anyone who's worked at a modern tech company will find this absolutely believable.

Like what, you expect Nicholas to test each vuln when he has more important work to do (i.e., his actual job)?

chrisra 5 hours ago

What models are you using, on what type of codebases, with what tools?

kakacik 5 hours ago

Apart from the obvious PR (if you need to lean into the AI wave a bit, this of all places is where to do it) and fanboyism, which is just part of human nature, why can't both be true?

It can properly excel at some things while being less than helpful at others. These have been computers from the beginning, rehashed 1000x and now with an extra twist.