whs | 7 days ago
I wrote an MCP based on that technique - https://github.com/whs/mcp-chinesewall. To avoid the ambiguity of training LLMs on unlicensed code, I use one model to generate a description of the code, then hand that description to a second LLM trained on permissively licensed code. (I haven't found any usable public-domain models.) I use it in the real world, and the codegen model succeeds maybe 10-20% of the time - the description isn't detailed enough, which is good for a "clean room" but hard for a base model to follow. All models can review the codegen output, retry, and write their own implementation based on the result, though.
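The two-model flow described above can be sketched roughly like this (all function names here are illustrative stand-ins, not the actual mcp-chinesewall API; the model calls are stubbed with canned outputs). The key property is that the second model only ever sees the description, never the original unlicensed source:

```python
def llm_describe(source_code: str) -> str:
    """Model A: summarizes the code's behavior without quoting it.
    (Stubbed here; a real version would call an LLM API.)"""
    return ("A function that returns the sum of a list of integers, "
            "returning 0 for an empty list.")

def llm_generate(description: str) -> str:
    """Model B, trained on permissively licensed code: writes a fresh
    implementation from the description alone. (Stubbed for illustration.)"""
    return "def total(xs):\n    return sum(xs)\n"

def chinese_wall(source_code: str) -> str:
    description = llm_describe(source_code)
    # The "wall": the original source must not leak into what model B sees.
    assert source_code not in description
    return llm_generate(description)

original = (
    "def add_all(numbers):\n"
    "    acc = 0\n"
    "    for n in numbers:\n"
    "        acc += n\n"
    "    return acc\n"
)
reimplementation = chinese_wall(original)
```

The review-and-retry loop mentioned above would wrap `chinese_wall` in a loop that compares the reimplementation's behavior against the description and re-prompts on failure.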
ghuntley | 7 days ago | parent
Nice. Any chance you could add attribution and credits in your paper? https://orcid.org/0009-0007-3955-9994