account42 4 hours ago

This is essentially the digital world transforming from a high trust society into a low trust one. Sad to see.

oooyay an hour ago | parent | next [-]

Not even just digital; much of the world is shifting from high trust to low trust as well: https://social.desa.un.org/sites/default/files/inline-files/...

Gormo 2 hours ago | parent | prev [-]

To whom would you attribute the greater part of that reduction in trust: the people using FOSS to train LLMs, or the people trying to block them?

Xirdus an hour ago | parent | next [-]

People who break the social contract are the ones responsible for breaking the social contract, not the ones who take steps in response to the social contract being broken.

Gormo an hour ago | parent | next [-]

So the questions here are (a) is any generally accepted social contract actually being broken, and (b) if so, who are the ones who are breaking it?

Xirdus 17 minutes ago | parent | next [-]

Are you asking how AI coding agents, the companies selling them and the individuals using them break the FOSS social contract (copyleft, attribution, upstreaming), or are you disputing that they do?

Gormo 10 minutes ago | parent [-]

Both would resolve to the same question, no?

There seems to be an implicit premise here that any work generated by an LLM whose training data includes a particular bit of code itself constitutes a redistribution of that code. I've yet to encounter any strong arguments substantiating this premise as a general principle, and my own suspicion is that it is not valid as a general principle, given the nature of how LLMs operate.

It's certainly possible that specific instances of LLMs lazily copy-pasting code from public repos may exist, and the extent to which this is happening is something that can be substantiated by empirical examples, so if you have any to point to, I'd be interested in looking at them. However, where this is happening, it ought to be regarded as a failure mode of LLMs, and not something that implicates the underlying nature of LLMs, given that their intended purpose is to function as stochastic generators that do not merely copy-paste input data.

My initial feeling here is that using open-source code to train LLMs is not per se a violation of the generally accepted FOSS social contract, but rather that attempting to restrict specific use cases of FOSS-licensed code on the basis of normative opinions unrelated to the license terms is a violation, or at least a rejection, of that social contract. I'm not fully committed to this position, though, and would welcome well-reasoned arguments to the contrary.

28 minutes ago | parent | prev [-]
[deleted]
dlev_pika an hour ago | parent | prev [-]

“No, no, what was she wearing?”

Xirdus 22 minutes ago | parent | next [-]

People who take steps in response to the social contract being broken are the ones responsible for the steps they've taken, not the ones who break the social contract.

25 minutes ago | parent | prev [-]
[deleted]
hilariously an hour ago | parent | prev [-]

It's definitely the ones DDOSing websites while giving no attribution in any way to the original creators.

Gormo 28 minutes ago | parent [-]

DDOSing websites seems to be an unrelated problem, and one that has traditionally been solved through response throttling and IP blocking.

Attribution is often required even on MIT or BSD licenses where code is being redistributed, either in original or modified versions. But that relates to this discussion only to the extent that one regards using LLMs whose training data included a certain bit of code as itself constituting redistribution of that specific code. That in turn is a very debatable premise, which really ought to be argued for rather than merely assumed as though it were already generally recognized as true.

hilariously 19 minutes ago | parent [-]

Why? You stole my stuff and now are pretending I need to argue for you to stop stealing it. It's a joke.