scrollaway 3 days ago

> These models routinely make basic mistakes, yet can answer these devilish lateral thinking questions more than 9 times out of 10?

You could also say "These models routinely make basic mistakes, yet they're able to one-shot write entire webpages and computer programs that compile with no errors".

There are classes of mistakes the models make; that is what we're digging into.