| |
| ▲ | Xelynega 7 hours ago | parent | next [-] | | I don't understand what the "hacker ethos" could have to do with defending OpenAI's blatant stealing of people's content for their own profit. OpenAI is not sharing their data (they're keeping it private to profit off of), so how could it be anywhere near the "hacker ethos" to believe that everyone else needs to hand over their data to OpenAI for free? | | |
| ▲ | CaptainFever 6 hours ago | parent [-] | | Following the "GNU-flavour hacker ethos" as described, one concludes that it is right for OpenAI to copy data without restriction, it is wrong for NYT to restrict others from using their data, and it is also wrong for OpenAI to restrict the sharing of their model weights or outputs for training. Luckily, most people seem to ignore OpenAI's hypocritical TOS against sharing their output weights for training. I would go one step further and say that they should share the weights completely, but I understand there are practical issues with that. Luckily, we can kind of "exfiltrate" the weights by training on their output. Or wait for someone to leak it, like NovelAI did. |
| |
| ▲ | AlienRobot 8 hours ago | parent | prev | next [-] | | I think an ethical hacker is someone who uses their expertise to help those without. How could an ethical hacker side with OpenAI, when OpenAI is using its technological expertise to exploit creators without? | | |
| ▲ | CaptainFever 8 hours ago | parent [-] | | I won't necessarily argue against that moral view, but in this case it is two large corporations fighting. One has the power of tech, the other has the power of the state (copyright). So I don't think that applies in this case specifically. | | |
| ▲ | Xelynega 7 hours ago | parent [-] | | Aren't you ignoring that common law is built on precedent? If they win this case, that makes it a lot easier for people whose copyright is being infringed on an individual level to get justice. | | |
| ▲ | CaptainFever 6 hours ago | parent [-] | | You're correct, but I think many don't realize how many small model trainers and fine-tuners there are currently. For example, PonyXL, or the many models and fine-tunes on CivitAI made by hobbyists. So basically the reasoning is this:
- NYT vs OpenAI, neither is disenfranchised
- OpenAI vs individual creators, creators are disenfranchised
- NYT vs individual model trainers, model trainers are disenfranchised
- Individual model trainers vs individual creators, neither is disenfranchised
And if only one can win, and since the view is that information should be free, it biases the argument towards the model trainers. | | |
| ▲ | AlienRobot 3 hours ago | parent [-] | | What "information" are you talking about? It's a text and image generator. Your argument is that it's okay to scrape content when you are an individual. It doesn't change the fact those individuals are people with technical expertise using it to exploit people without. If they wrote a bot to annoy people but published how many people got angry about it, would you say it's okay because that is information? You need to draw the line somewhere. |
|
|
|
| |
| ▲ | onetokeoverthe 8 hours ago | parent | prev | next [-] | | Creators freely sharing with attribution requested is different than creations being ruthlessly harvested and repurposed without permission. https://creativecommons.org/share-your-work/ | | |
| ▲ | a57721 3 hours ago | parent | next [-] | | > freely sharing with attribution requested If I share my texts/sounds/images for free, harvesting and regurgitating them omits the requested attribution. Even the most permissive CC license (excluding CC0 public domain) still requires an attribution. | |
| ▲ | CaptainFever 8 hours ago | parent | prev [-] | | > A few go further and assert that all information should be free and any proprietary control of it is bad; this is the philosophy behind the GNU project. In this view, the ideal world is one where copyright is abolished (but not moral rights). So piracy is good, and datasets are also good. Asking creators to license their work freely is simply a compromise due to copyright unfortunately still existing. (Note that even if creators don't license their work freely, this view still permits you to pirate or mod it against their wishes.) (My view is not this extreme, but my point is that this view was, and hopefully is, still common amongst hackers.) I will ignore the moralizing words (eg "ruthless", "harvested" to mean "copied"). It's not productive to the conversation. | | |
| ▲ | onetokeoverthe 7 hours ago | parent [-] | | If not respected, some Creators will strike, lay flat, not post, go underground. Ignoring moral rights of creators is the issue. | | |
| ▲ | CaptainFever 6 hours ago | parent [-] | | Moral rights involve the attribution of works where reasonable and practical. Clearly doing so during inference is not reasonable or practical (you'll have to attribute all of humanity!) but attributing individual sources is possible and is already being done in cases like ChatGPT Search. So I don't think you actually mean moral rights, since they're not being ignored here. But the first sentence of your comment still stands regardless of what you meant by moral rights. To that, well... we're still commenting here, are we not? Despite it with almost 100% certainty being used to train AI. We're still here. And yes, funding is a thing, which I agree needs copyright for the most part unfortunately. But does training AI on, for example, a book really reduce the need to buy the book, if it is not reproduced? Remember, training is not just about facts, but about learning how humans talk, how languages work, how books work, etc. Learning that won't reduce the book's economic value. And yes, summaries may reduce the value. But summaries already exist. Wikipedia, Cliff's Notes. I think the main defense is that you can't copyright facts. | | |
| ▲ | onetokeoverthe 3 hours ago | parent [-] | | we're still commenting here, are we not? Despite it with almost 100% certainty being used to train AI. We're still here ?!?!
Comparing and equating commenting to creative works. ?!?! These comments are NOT equivalent to the 17 full time months it took me to write a nonfiction book. Or an 8 year art project. When I give away my work I decide to whom and how. |
|
|
|
| |
| ▲ | ysofunny 8 hours ago | parent | prev [-] | | oh please, then, riddle me why does my comment have -1 votes on "hacker" news which has indeed turned into "i-am-rich-cuz-i-own-tech-stock"news | | |
| ▲ | alwa 7 hours ago | parent | next [-] | | I did not contribute a vote either way to your comment above, but I would point out that you get more of what you reward. Maybe the reward is monetary, like an author paid for spending their life writing books. Maybe it’s smaller, more reputational or social—like people who generate thoughtful commentary here, or Wikipedia’s editors, or hobbyists’ forums. When you strip people’s names from their words, as the specific count here charges; and you strip out any reason or even way for people to reward good work when they appreciate it; and you put the disembodied words in the mouth of a monolithic, anthropomorphized statistical model tuned to mimic a conversation partner… what type of thought is it that becomes abundant in this world you propose, of “data abundance”? In that world, the only people who still have incentive to create are the ones whose content has negative value, who make things people otherwise wouldn’t want to see: advertisers, spammers, propagandists, trolls… where’s the upside of a world saturated with that? | |
| ▲ | CaptainFever 8 hours ago | parent | prev [-] | | Yes, I have no idea either. I find it disappointing. I think people simply like it when data is liberated from corporations, but hate it when data is liberated from them. (Though this case is a corporation too so idk. Maybe just "AI bad"?) |
|
|