_pdp_ an hour ago
This sounds like common sense, but it does not work in practice. Prompting is only the first part. To get the outcome you want, you need other systems that steer the agent as it gets things wrong. Proper deterministic tests work, but there are also checks that need to happen during LLM execution, like cycle detection. All of this adds up. You cannot just prompt an LLM and hope for a good outcome. That might work in small, isolated scenarios, but it does not work consistently enough to call it reliable. Without further guardrails enforced by the process or the harness, LLMs do not have sufficient capability to complete a task to a given standard.
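To make "cycle detection" concrete, here is a rough sketch of the kind of check I mean inside the agent loop. The names (run_llm_step, execute_tool, the "DONE" signal) are placeholders for whatever harness you use, not a real API:

    # Rough sketch of a cycle-detection guardrail for an agent loop.
    # run_llm_step and execute_tool are hypothetical stand-ins for your harness.
    from collections import Counter

    MAX_REPEATS = 3  # assumed threshold: abort after the same action three times

    def run_agent(task, run_llm_step, execute_tool, max_steps=20):
        seen = Counter()
        history = []
        for _ in range(max_steps):
            tool, args = run_llm_step(task, history)  # agent proposes next action
            key = (tool, repr(args))                  # hashable fingerprint of the action
            seen[key] += 1
            if seen[key] > MAX_REPEATS:
                # Agent is stuck repeating itself: fail fast instead of burning tokens.
                raise RuntimeError("cycle detected: %r repeated %d times" % (key, seen[key]))
            result = execute_tool(tool, args)
            history.append((tool, args, result))
            if result == "DONE":                      # harness-specific completion signal
                return history
        raise RuntimeError("step budget exhausted without completing the task")

The same idea applies to the other guardrails: the deterministic check lives outside the model, so a bad generation cannot loop forever.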