NotOscarWilde 2 days ago
> Is the lock structuring here really deadlock safe? The model will tell you with complete confidence its code is perfect

Fully agree. In fact, this literally happened to me a week ago -- ChatGPT was confidently incorrect about its simple lock structure for my multithreaded C++ program, and wrote paragraphs upon paragraphs about how it works, until I pressed it twice about a (real) possibility of some operations deadlocking, and then it folded.

> Every time a major announcement comes out saying so-and-so model is now a triple Ph.D programming triathlon winner, I try using it. Every time it’s the same - super fast code generation, until suddenly staggering hallucinations.

As a university assistant professor trying to keep up with AI while doing research/teaching as before, this also happens to me, and I am dismayed by it. I am certain there are models out there that can solve IMO problems and generate research-grade papers, but the ones I can easily get access to as a customer routinely mess things up, including:

* Adding extra simplifying assumptions to a given combinatorial optimization problem so that its dynamic programming approach works.

* Claiming some inequality is true when, on closer inspection, it had derived A >= B from A <= C and C <= B.

(This is all ChatGPT 5, thinking mode.)

You could fairly counter that I need to get more funding (tough) or invest much more of my time and energy to get access to models closer to what Terence Tao and other top people trying to apply AI in CS theory are currently using. But at least the models cheap enough for me to access as a private person are not on par with what the same companies claim to achieve.