tpmoney 4 hours ago

I'll propose my pie-in-the-sky plan here again. We should overhaul the copyright system completely in light of AI and make it mostly win-win for everyone. This is predicated on the idea that the MNIST handwritten digits set is sort of the "hello world" dataset for people wanting to learn machine vision, and having that common data set is really handy. Numbers are made up off the top of my head and subject to tuning, but the basic idea is this:

1) Cut copyright to 15-20 years by default. You can have 1 extension of an additional 10-15 years if you submit your work to the "National Data Set" within say 2-3 years of the initial publication.

2) Content in the National set is well categorized and cleaned up. It's the cleanest data set anyone could want. The data set is used both to train some public models and also licensed out to people wanting to train their own models. Both the public models and the data sets are licensed for nominal fees.

3) People who use the public models or data sets as part of their AI system are granted immunity from copyright violation claims for content generated by these models, modulo some exceptions for knowing and intentional violations (e.g. generating the contents of a book into an epub). People who choose to scrape their own data are subject to the current state of the law with regard to both scraping and use (so you'd probably better be buying a lot of books).

4) The license fees generated from licensing the data and the models would be split into royalty payments to people whose works are in the dataset, and are still under copyright protection, proportional to the amount of data submitted and inversely proportional to the age of that data. There would be some absolute caps in place to prevent slamming the national data sets with junk data just to pump the numbers.
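To make the split in point 4 concrete, here is a rough sketch of one way it could be computed. Everything in it is an assumption for illustration: the `Submission` fields, the `split_royalties` name, the choice of "amount divided by age" as the weight, and a per-holder cap on that weight are all made up, in the same spirit as the numbers above.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    holder: str          # who gets paid (hypothetical identifier)
    amount: float        # size of the submitted work, in whatever unit is chosen
    age_years: float     # years since initial publication
    in_copyright: bool = True


def split_royalties(pool, submissions, weight_cap=100.0, min_age=1.0):
    """Split `pool` among holders, proportional to amount and inversely to age.

    Assumptions (not from any real scheme): weight = amount / age, works
    younger than `min_age` are treated as `min_age` old, and each holder's
    total weight is capped at `weight_cap` to blunt junk-data flooding.
    """
    weights = {}
    for s in submissions:
        if not s.in_copyright:
            continue  # works out of copyright earn nothing
        w = s.amount / max(s.age_years, min_age)
        weights[s.holder] = weights.get(s.holder, 0.0) + w

    # Apply the absolute cap per holder.
    weights = {h: min(w, weight_cap) for h, w in weights.items()}

    total = sum(weights.values())
    if total == 0:
        return {}
    return {h: pool * w / total for h, w in weights.items()}


payouts = split_royalties(
    1600.0,
    [
        Submission("alice", amount=100, age_years=2),       # weight 50
        Submission("bob", amount=100, age_years=10),        # weight 10
        Submission("spammer", amount=1_000_000, age_years=1),  # capped at 100
        Submission("estate", amount=500, age_years=40, in_copyright=False),
    ],
)
```

With these made-up inputs the weights are 50, 10 and 100 (after the cap), so the 1600 pool splits as 500, 100 and 1000. Whether the cap should apply per holder, per work, or per year is exactly the kind of detail that would need tuning.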

Everyone gets something out of this. AI folks get clean data that they didn't have to burn a lot of resources scraping. Copyright holders get paid for their works used by AI and retain most of the protections they have today (just for a shorter time). The public gets usable AI tooling without everyone spending their own resources on building their own data sets. Site owners and the like get reduced bot/scraping traffic. It's not perfect, and I'm sure the devil is in the details, but that's the nature of this sort of thing.

mschuster91 4 hours ago

> Cut copyright to 15-20 years by default.

This alone will kill off all chances of that ever passing.

Like, I fully agree with your proposal... but I don't think it's feasible. There are a lot of media IPs/franchises that are very, very old but still generate insane amounts of money to this day with active development. Star Wars and Star Trek obviously, but also stuff like the MCU or Avatar, which are well on their way to two decades of runtime (Iron Man 1 was released in 2008), or Harry Potter, which is almost 30 years old. That's tens of billions of dollars in cumulative income, and most of that is owned by Disney.

Look what it took to finally get even the earliest Disney movies to enter the public domain, and that was stuff from before World War 2 that was so bitterly fought over.

In order to reform copyright... we first have to use anti-trust to break up the large media conglomerates. And it's not just Disney either. Warner, Sony, Comcast and Paramount also hold ridiculous amounts of IP, Amazon entered the fray as well by acquiring MGM (mostly famous for James Bond), and Lionsgate holds the rights to a bunch of smaller but still well-known IPs (Twilight, Hunger Games).

And that's just the movie stuff. Music is just as bad, although at least there, thanks to radio stations being a thing, there are licensing agreements and established traditions for remixes, covers, tribute bands and other forms of IP re-use by third parties.