maxbond 6 hours ago
Reminds me of the recent paper about delegating document editing tasks to LLMs across different disciplines [1]. That paper found that programming was the only discipline in which most LLMs can perform long-horizon tasks without accumulating errors and corrupting the document. I've only read the abstract of this one so far, but it seems like this paper has zoomed in on programming with greater fidelity and shown a similar phenomenon. Here, though, it's less about long-horizon tasks and more about "long style horizons": larger sets of structural constraints.

[1] https://arxiv.org/abs/2604.15597

Discussion: https://news.ycombinator.com/item?id=48073246
emp17344 5 hours ago
If it’s not easily verifiable, LLMs aren’t good at it.