candiddevmike 3 hours ago
I wish these folks would tell me how you would do a reproducible build, or reproducible anything really, with LLMs. Even monkeying with temperature, different runs will still introduce subtle changes that would change the hash.
mvr123456 3 hours ago | parent | next
This reminds me of how you can create fair coins from biased ones and vice versa. You toss your coin repeatedly, and then get the singular "result" in some way by encoding/decoding the sequence. Different sequences might map to the same result, so comparing results is not the same as comparing the sequences. Meanwhile, you press the "shuffle" button, and code-gen creates different code. But this isn't necessarily the part that's supposed to be reproducible, and isn't how you actually go about comparing the output. Instead, maybe two different rounds of code-generation are "equal" if the test suite passes for both. Not precisely the equivalence-class stuff the parent is talking about, but it's a simple way of thinking about it that might be helpful.
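Roughly, in code (a hand-wavy Python sketch; fair_flip and equivalent_under_tests are made-up names, and "tests" here is just a list of callables, not any particular framework):

    import random

    def fair_flip(p=0.7):
        # Von Neumann trick: flip a biased coin in pairs and keep only the
        # HT/TH outcomes, which are equally likely. Many different raw flip
        # sequences collapse to the same fair "result".
        while True:
            a = random.random() < p
            b = random.random() < p
            if a != b:
                return a

    def equivalent_under_tests(impl_a, impl_b, tests):
        # Treat two generated implementations as "equal" if the same test
        # suite passes for both, even though their source text (and hence
        # any hash of it) differs.
        return all(t(impl_a) for t in tests) and all(t(impl_b) for t in tests)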
cjbgkagh 3 hours ago | parent | prev
There is nothing intrinsic to LLMs that prevents reproducibility. You can run them deterministically without adding noise; it would just be a lot slower to enforce a deterministic order of operations, which takes an already bad idea and makes it worse.
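For a rough illustration of what "run them deterministically" looks like (a sketch assuming a HuggingFace-style causal LM and greedy decoding; the exact flags needed for bitwise-identical output depend on hardware and kernel versions):

    import os
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Reproducibility means fixing every source of nondeterminism, not just
    # the sampling temperature. Deterministic kernels are typically slower.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.manual_seed(0)
    torch.use_deterministic_algorithms(True)  # fail loudly on nondeterministic ops

    tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("def add(a, b):", return_tensors="pt")
    out = model.generate(**inputs, do_sample=False, max_new_tokens=32)  # greedy decoding
    print(tok.decode(out[0]))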
| |||||||||||||||||||||||