hu3 3 days ago:
> we found that GPT-5.2 cannot even compute the parity of a short string like 11000, and GPT-5.2 cannot determine whether the parentheses in ((((()))))) are balanced.

I think there is a valid insight here which many already know: LLMs are much more reliable at writing scripts and automation for certain tasks than at doing those tasks themselves. For example, if I give an LLM my database schema and tell it to scan for redundant indexes and flag naming-convention violations, it might do a passable but incomplete job. But if I tell the LLM to write a Python or Node.js script that does the same thing, I get significantly better results. It's also often faster to generate and run the script than to have the LLM process large SQL files directly.
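For the parity and parentheses examples in the quote, the kind of throwaway script meant here is only a few lines of Python. This is a hedged sketch of what such a generated script might look like, not output from any particular model:

```python
def parity(bits: str) -> int:
    """Return the parity (0 or 1) of a string of '0'/'1' characters."""
    return bits.count("1") % 2

def balanced(parens: str) -> bool:
    """Check whether a string of '(' and ')' characters is balanced."""
    depth = 0
    for ch in parens:
        depth += 1 if ch == "(" else -1
        if depth < 0:  # a ')' appeared before its matching '('
            return False
    return depth == 0

print(parity("11000"))         # 0: two '1's, so even parity
print(balanced("((((())))))")) # False: 5 opens but 6 closes
```

Running the script makes the answer a deterministic computation instead of token-by-token reasoning, which is exactly why the script route tends to be more reliable.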
plagiarist 3 days ago:
The dream is probably that the inference software writes and then executes such a script itself, rather than relying on text generation alone. Analogous to how a human might cross off pairs of parentheses to check that example.
whateveracct 2 days ago:
That's because abstraction is compression of information.