ramon156 4 hours ago

> Given that the source of code generated by AI is unknown, we can't accept it under the Zlib license.

So what about SO code snippets? I'm not here to take a stance for AI, but this thread is leaning towards bias.

Address the elephant in the room: LLM-assisted PRs have a chance of being lower quality. People are not obligated to review their own code. When you write it manually, you are more inclined to review what you're submitting.

I don't get why these conversations always target opinions, not the facts. I totally agree about the ethics, the fact that it's bound to get monopolized (unless GLM becomes SOTA soon), and that it is harming the environment. That's my opinion though, and it shouldn't interfere with what others do. I don't scoff at people eating meat, let them be.

The issue is real, the solution is not.

johndough 4 hours ago

> So what about SO code snippets?

StackOverflow snippets are mostly licensed under CC BY-SA 3.0 or 4.0, so I'd wager that they are not allowed, either.

The SDL source code makes a few references to stackoverflow.com, but the only place I could find an exact copy was where the author explicitly licensed the code under a more permissive license: https://github.com/libsdl-org/SDL/blob/5bda0ccfb06ea56c1f15a...

Sharlin 4 hours ago

Most SO snippets likely aren't unique or creative enough to count as works. If a hundred programmers would write essentially the same snippet to solve a problem, it's not copyrightable.

johndough 4 hours ago

I wouldn't be so sure about that. The famous "rangeCheck" function in the Google vs Oracle lawsuit was only 9 lines: https://news.ycombinator.com/item?id=11722514

shevy-java 4 hours ago

I don't think this can be used as a counter-argument.

Most SO contributions are dead simple, often just a link to the documentation or an extended example. I mean, just have a look at them.

Finding an SO entry comparable to the Google versus Oracle example is, in my opinion, much harder. I have used SO a lot for snippets over the last 10 years, and most snippets are low quality. (Some are good though; SO still has use cases, even though it has kind of aged out now.)

embedding-shape 4 hours ago

> Most SO snippets likely aren't unique or creative enough to count as works.

How is this different from LLM outputs? Literally trained on the output of N programmers so it can give you a snippet of code based on what it has seen.

sdJah18 4 hours ago

The "humans do it, too" or "humans have always done it" arguments break down very quickly.

Not only because of the scale of infringement, but because directly copied Stack Overflow snippets are very rare. For example, C++ snippets are 95% code cleverness monstrosities: you can learn a principle from them, but you cannot use the code directly.

I'd say that Stack Overflow snippets in well-maintained open source projects are practically nonexistent. I've never seen an accepted PR that would even trigger that suspicion.

LLMCodeAuditor 4 hours ago

Most SO snippets that you might actually copy-paste aren’t copyrightable: they are small snippets of fairly generic code intended to illustrate a general idea. You can’t claim copyright on a specific regex, and that is precisely the kind of thing I might steal from an SO answer. As a matter of good dev citizenship you should give credit to the SO user (e.g. a link in a comment), but it’s almost never a copyright issue. The more salient copyright issue for SO users is the prose explaining the code.

missingdays 4 hours ago

> I don't scoff at people eating meat, let them be.

Why not let the animals be?

crackez 4 hours ago

I'm just happy to be on the food chain at all...