zozbot234 | 7 hours ago
This seems right to me. If you ask an LLM to derive a spec that contains no expressive element of the original code (a clean-room human team can carefully verify this), and then ask another instance of the LLM (with fresh context) to write code from the spec, how is that different from a "clean room" rewrite? The agent that writes the new code only ever sees the spec, and by assumption (the same assumption made in all clean-room rewrites) the spec is purely factual, with all copyrightable expression distilled out. But the step of deriving the spec (and verifying that it is as clean as possible) is crucial and cannot be skipped!
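The two-phase loop described above can be sketched as follows. This is a minimal illustration, not an actual API: `llm` stands in for any chat-completion call, and `verify_spec` is the hypothetical human clean-room gate that checks the spec before it crosses the wall.

```python
def derive_spec(llm, source_code: str) -> str:
    """Phase 1: distill the observable behaviour of the original
    code into a purely factual spec (expression stripped out)."""
    return llm(
        "Describe the observable behaviour of this code as a "
        f"factual spec, with no expressive detail:\n{source_code}"
    )

def reimplement(llm, spec: str) -> str:
    """Phase 2: a fresh LLM instance that sees only the spec,
    never the original code."""
    return llm(f"Write code implementing this spec:\n{spec}")

def clean_room_rewrite(llm, source_code: str, verify_spec) -> str:
    spec = derive_spec(llm, source_code)
    # The human clean-room team gates the spec here; nothing from
    # the original source crosses this line except the spec itself.
    if not verify_spec(spec):
        raise ValueError("spec still contains original expression")
    return reimplement(llm, spec)
```

The point of the structure is that the only coupling between the two phases is the verified spec; whether a real model satisfies that assumption is exactly what the replies below dispute.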
sigseg1v | 7 hours ago
How would a team verify this for any current model? They would have to observe and control all of the training data. In practice, any currently available model that is good enough to perform this task likely fails the clean-room criteria, because it already has a copy of the source code of the very project it is rewriting. At that point it's basically an expensive lossy copy-paste.
oytis | 7 hours ago
For it to be a clean-room rewrite, the original project must not be in the model's training data.
nneonneo | 6 hours ago
Somewhat annoyingly, there's research suggesting that models can pass information to each other via (effectively) steganographic techniques: specific but apparently harmless choices of tokens, wordings, and so on. See https://arxiv.org/abs/1712.02950 and https://alignment.anthropic.com/2025/subliminal-learning/ for some simple examples. While it feels unlikely that a simple "write this spec from this code" + "write this code from this spec" loop would actually trigger this kind of hiding behaviour, an LLM trained to accurately reproduce code from such a loop would definitely be capable of hiding code details within the spec, and you can't reasonably prove that the frontier LLMs have not been trained to do so.
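As a toy illustration of the kind of channel being described (this is an invented example, not an actual LLM attack), a "spec" can smuggle bits through apparently harmless synonym choices: each sentence position picks one of two equivalent words, and the choice itself encodes a bit. The word pairs and template below are made up for the demonstration.

```python
# Each pair is two interchangeable spec words; picking the second
# member of a pair encodes a 1, picking the first encodes a 0.
SYNONYMS = [("must", "shall"), ("returns", "yields"), ("list", "sequence")]

TEMPLATE = "The function {0} accept a {2} and {1} its length."

def embed(bits):
    """Write an innocuous-looking spec sentence that hides `bits`
    in its word choices."""
    words = [pair[b] for pair, b in zip(SYNONYMS, bits)]
    return TEMPLATE.format(*words)

def extract(text):
    """Recover the hidden bits by checking which synonym appears."""
    return [1 if pair[1] in text else 0 for pair in SYNONYMS]
```

A reader (or a clean-room verifier) sees only a bland behavioural spec either way; the information rides entirely on which synonym was chosen, which is why "the spec looks purely factual" is not by itself proof that nothing else got through.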