perching_aix 2 hours ago

Really unsure why this is getting downvoted, to my understanding this is a massive, unsettled concern.

It wasn't even a disasm/pseudocode-to-formal-spec flow followed by a separate human implementation. The same human has been in the loop throughout, and large parts of the code were generated directly.

It's basically guaranteed tainted.

Edit: I should have skimmed a bit more patiently; there was in fact no "disasm/pseudocode + the human getting tainted" part to this, apparently.

ameliaquining 2 hours ago | parent | next [-]

I read the post you're replying to as saying "this is copyright-encumbered and nonfree because it's a derivative work of everything in Claude's and GPT-5.5's training corpus", which is an argument I find fairly tiresome. (Realistically, if courts actually rule that this is the case, this tiny little project will be the least of anyone's concerns.)

"This is copyright-encumbered and nonfree because it's a derivative work of the legacy RAR binaries" is a different argument (and seems like it depends on details of the setup that were somewhat glossed over in the post).

themafia an hour ago | parent | prev [-]

The point is, setting aside current legal standards, which are already very murky: how can _you_ claim copyright if you don't _know_ the code isn't encumbered?

You can get these LLMs to generate copyrighted outputs both intentionally and accidentally. This is a known fact; therefore, if you're not checking the output to see whether this has occurred, you're potentially creating legal risks for yourself and anyone who uses your code.

To not only ignore this for your own use case but then release the code under a proclaimed license seems legally problematic, if not ethically concerning.

If you did get sued for infringement, I can't imagine your defense would be that you find the argument tiresome. Honestly, do you think this would never happen, or how would you go about defending your actions here?

ameliaquining 18 minutes ago | parent [-]

What do you mean by "checking the output"? Is there some kind of check the author says he didn't do that you think he should have? Or is your claim that using an LLM for coding is always copyright infringement? If so, I think the risk that I'll personally be the test case that resolves whatever ambiguities exist in the law is basically zero, and I don't think derailing the thread to be about that topic enlightens anyone.

charcircuit 2 hours ago | parent | prev [-]

The human wasn't looking at the copyrighted code and was only giving high-level steering instructions. If you look at the generated spec, it doesn't read like a derivative work of the copyrighted material. The program was then generated from the spec. It seems mostly fine from my perspective.

0cf8612b2e1e an hour ago | parent [-]

If I use a decompiler on existing binaries, and then some machine-translation utility to turn the result into a different language, that still feels like a derivative work, even if no human reviewed the specifics.

ameliaquining 14 minutes ago | parent [-]

The idea is to ensure that the parts of the output derived from the existing binary are not themselves eligible for copyright protection, i.e., factual descriptions of the file format, without any implementation details from the binary.