Remix clone Hacker News

new | show | ask | jobs Github

	▲	kouteiheika 2 hours ago
		> Which is why I was proposing a middle-ground where an agreement is setup between the model training company and the collective of developers/artists et all and come up with a license agreement where they are rewarded for their original work for perpetuity. A tiny % of the profits can be shared, which would be a form of UBI. This is fair That wouldn't be fair because these models are not only trained on code. A huge chunk of the training data are just "random" webpages scraped off the Internet. How do you propose those people are compensated in such a scheme? How do you even know who contributed, and how much, and to whom to even direct the money? I think the only "fair" model would be to essentially require models trained on data that you didn't explicitly license to be released as open weights under a permissive license (possibly with a slight delay to allow you to recoup costs). That is: if you want to gobble up the whole Internet to train your model without asking for permission then you're free to do so, but you need to release the resulting model so that the whole humanity can benefit from it, instead of monopolizing it behind an API paywall like e.g. OpenAI or Anthropic does. Those big LLM companies harvest everyone's data en-masse without permission, train their models on it, and then not only they don't release jack squat, but have the gall to put up malicious explicit roadblocks (hiding CoT traces, banning competitors, etc.) so that no one else can do it to them, and when people try they call it an "attack"[1]. This is what people should be angry about. [1] -- https://www.anthropic.com/news/detecting-and-preventing-dist...
	▲	duskdozer an hour ago \| parent [-]
		>under a permissive license well, assuming all data that is itself not permissively licensed is excluded