I noticed in the README that each commit message includes the agent and model, which is a nice start toward reproducibility.

I’m wondering how deep you plan to go on environment pinning beyond that. Is the system prompt / agent configuration versioned? Do you record tool versions or surrounding runtime context?

My mental model is that reproducible intent requires capturing the full "execution envelope", not just the human prompt + model & agent names. Otherwise it becomes more of an audit trail (which is also a good feature) than something you can deterministically re-run.

Curious how you’re thinking about that.

▲

scheme271 an hour ago | parent [-]

LLMs are non-deterministic so I don't see how it's reproducible.

	▲	50lo 4 minutes ago \| parent [-]
		That’s fair - strict determinism isn’t possible in the traditional sense. I was thinking more along the lines of bounded reproducibility. If the model, parameters, system prompt, and toolchain are pinned, you might not get identical output, but you can constrain the space of possible diffs. It reminds me a bit of how StrongDM talks about reproducibility in their “Digital Twin” concept - not bit-for-bit replay, but reproducing the same observable behavior.