| ▲ | Program-of-Thought Prompting Outperforms Chain-of-Thought by 15% (2022)(arxiv.org) | |||||||
| 31 points by mkagenius 2 hours ago | 6 comments | ||||||||
| ▲ | jey an hour ago | parent | next [-] | |||||||
This seems to be incorporated into current LLM generations already -- when code execution is enabled both GPT-5.x and Claude 4.x automatically seem to execute Python code to help with reasoning steps. | ||||||||
| ||||||||
| ▲ | jhart99 an hour ago | parent | prev | next [-] | |||||||
Underlying paper is from 2022 and should be indicated in the title. | ||||||||
| ▲ | axiom92 28 minutes ago | parent | prev | next [-] | |||||||
And even before this work, there was "PAL: Program-aided Language Models" (https://arxiv.org/abs/2211.10435, https://reasonwithpal.com/). Afaik PaLM (Google's OG big models) tried this trick, but it didn't work for them. I think it's because PaL used descriptive inline comments + meaningful variable names. Compare the following: ```python # calculate the remaining apples apples_left = apples_bought - apples_eaten ``` vs. ```python x = y - z ``` We have ablations in https://arxiv.org/abs/2211.10435 showing that both are indeed useful (see "Crafting prompts for PAL"). | ||||||||
| ▲ | mgraczyk 39 minutes ago | parent | prev | next [-] | |||||||
Anthropic recently added this to the API: https://www.anthropic.com/engineering/advanced-tool-use See "Programmatic Tool Calling" And there was an AI productivity startup called Lutra AI doing this, although they've since pivoted to some kind of MCP infra thing: https://lutra.ai/ | ||||||||
| ▲ | eric-burel 43 minutes ago | parent | prev [-] | |||||||
I call that self-destructive prompting in the sense that you use AI to output programs that replace calling the AI in the future. The paper seems to indicate that this also brings much better results. However it's subject to attacks as running generated code is usually unsafe. A sandbox has to be used, major agentic AI players are providing some solutions, like Langchain sandbox released earlier this year. | ||||||||