CuriouslyC 5 days ago
This is doable. I have a multi-stage process that makes it pretty reliable. Stage 1 is ideation; this can be with an LLM or humans, whatever, you just need a log. Stage 2 is conversion of that ideation log to a simple spec format that LLMs can write easily, called SRF, which is fenced inside a nice markdown document humans can read and understand. You can edit that SRF if desired, have a conversation with the agent about it to get them to massage it, or just have your agent feed it into a tool I wrote that takes an SRF and converts it to CUE with full formal validation and lots of other nice features. The value of this is that FOR FREE you get comprehensive test definitions (unit + e2e), kube/terraform infra setup, documentation stubs, OpenAPI specs, etc. It's seriously magical.
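Roughly, the CUE side ends up looking something like the sketch below; the schema names, fields, and constraints here are purely illustrative assumptions, not the actual SRF format or the tool's real output, but they show the kind of mechanical validation you get from `cue vet`:

```cue
// Illustrative sketch only -- #Spec/#Endpoint and their fields are
// hypothetical, not the real SRF schema or generated output.
#Endpoint: {
	path:   string & =~"^/"
	method: "GET" | "POST" | "PUT" | "DELETE"
	// every endpoint must declare at least one test expectation
	tests: [...string] & [_, ...]
}

#Spec: {
	name:    string & =~"^[a-z][a-z0-9-]*$"
	version: string
	endpoints: [...#Endpoint]
}

// a concrete spec is unified against the schema; `cue vet` rejects
// anything that violates the constraints before codegen runs
service: #Spec & {
	name:    "orders"
	version: "0.1.0"
	endpoints: [{
		path:   "/orders"
		method: "POST"
		tests: ["creates an order", "rejects malformed payloads"]
	}]
}
```

The appeal of constraints like these is that they can be checked mechanically before anything downstream (tests, infra, API specs) is generated from the spec.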
sarchertech 5 days ago
Imagine that when you deploy, an LLM regenerates the code from your specs, since code is fungible as long as it fits the spec. Keeping in mind that I have seen hundreds to thousands of production errors in applications with very high-coverage test suites, how many production errors would you expect to see over 5 years of LLM deployments?