afshinmeh 3 hours ago

> how do we avoid burning tokens solving the same problems over again

Letting the LLM write half-baked tools is a recipe for burning more tokens.

> There's a wiki the LLM searches before solving a problem, that links saved programs for past actions to their content entry.

What's the criteria for marking an LLM written tool as useful/correct before publishing it?

gavinray 3 hours ago | parent [-]

> Letting the LLM write half-baked tools is a recipe for burning more tokens.

It sure is, if the tools are half-baked and your user scale is N=1 rather than N=100 or N=1,000.

> What's the criteria for marking an LLM written tool as useful/correct before publishing it?

It solves the problem the originating user asked it to solve.
afshinmeh 3 hours ago | parent [-]

> It solves the problem the originating user asked it to solve

Interesting. And is there a mechanism to go back and "fix" the tools after they are published? What happens if the tool decided to use the "id" attribute to click buttons, and now you have a new website that follows a different pattern, so the selector no longer finds the right target?
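One way a saved tool could survive that kind of drift is to record a ranked list of selector strategies instead of a single hard-coded "id", and fall back when the first one stops matching. A minimal sketch of that idea (all names here are hypothetical, and the `dom` is a simplified stand-in for a real page, not any particular automation library's API):

```python
def find_target(dom, strategies):
    """Return the first element matched by any strategy, or None.

    `dom` is a simplified stand-in for a page: a list of element dicts.
    `strategies` is an ordered list of (attribute, value) pairs, tried
    in order so the most specific selector wins when it still works.
    """
    for attr, value in strategies:
        for el in dom:
            if el.get(attr) == value:
                return el
    return None

# Old site: the button is found by its id. New site: the id changed,
# but the accessible name still matches, so the fallback strategy
# keeps the published tool working without a rewrite.
old_page = [{"id": "submit-btn", "aria-label": "Submit order"}]
new_page = [{"id": "btn-7f3a", "aria-label": "Submit order"}]

strategies = [("id", "submit-btn"), ("aria-label", "Submit order")]

assert find_target(old_page, strategies)["id"] == "submit-btn"
assert find_target(new_page, strategies)["id"] == "btn-7f3a"
```

That still only patches the symptom, of course; the underlying question of who notices the breakage and republishes the fixed tool remains.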

I agree, though, that the "correctness" of a tool can mean different things depending on the context of the problem (e.g. would you consider an OOM a correctness bug even if the tool addresses the user's ask?).