| ▲ | nialse 2 days ago |
| Paranoia. And also ironic considering their base LLM is a distillation of the web and books etc etc. |
|
| ▲ | petcat 2 days ago | parent | next [-] |
| They stole everything and now they want to close the gates behind them. "I got the loot, Steve!" I feel like the distillation stuff will end up in court if they try to sue an American company about it. We'll see what a judge says. |
| |
| ▲ | arcfour 2 days ago | parent | next [-] | | You're perfectly free to scrape the web yourself and train your own model. You're not free to let Anthropic do that work for you, because they don't want you to, because it cost them a lot of time and money and secret sauce presumably filtering it for quality and other stuff. Stole? Courts have ruled it's transformative, and it very obviously is. AI doomerism is exhausting, and I don't even use AI that much, it's just annoying to see people who want to find any reason they can to moan. | | |
| ▲ | petcat 2 days ago | parent | next [-] | | > Stole? Courts have ruled it's transformative, and it very obviously is. The courts have ruled that AI outputs are not copyrightable. The courts have also ruled that scraping by itself is not illegal, only maybe against a Terms of Service. Therefore, Anthropic, OpenAI, Google, etc. have no legal claim to any proprietary protections of their model outputs. So we have two things that are true: 1) Anthropic (certainly) violated numerous TOS by scraping all of the internet, not just public content. 2) Scraping Anthropic's model outputs is no different than what Anthropic already did. Only a TOS violation. | | |
| ▲ | dpark 2 days ago | parent | next [-] | | > 2) Scraping Anthropic's model outputs is no different than what Anthropic already did. Only a TOS violation. Regardless of whether LLM training amounts to theft, thieves are still allowed to put locks on their own doors. | |
| ▲ | gruez 2 days ago | parent | prev [-] | | >The courts have ruled that AI outputs are not copyrightable. "not copyrightable" doesn't imply they can't frustrate attempts to scrape data. | | |
| ▲ | petcat 2 days ago | parent [-] | | Nobody is saying they can't try to stop you themselves. That's where the Terms of Service violation part comes in. They can cancel your account, block your IP, etc. They just can't legally stop you by, for instance, compelling a judge to order you to stop. | | |
| ▲ | dpark 2 days ago | parent [-] | | > They just can't legally stop you by, for instance, compelling a judge to order you to stop. They probably can, actually. TOS are legally binding. More likely they would block you rather than pursuing legal avenues but they certainly could. | | |
| ▲ | petcat 2 days ago | parent [-] | | The Supreme Court already ruled on this. Scraping public data, or data that you are authorized to access, is not a violation of the Computer Fraud and Abuse Act. Now, if you try to get around attempts to block your access, then yes you could be in legal trouble. But that's not what is happening here. These are people/companies that have Claude accounts in good standing and are authorized by Anthropic to access the data. Nobody is saying that Anthropic can't just block them though, and they are certainly trying. | | |
| ▲ | dpark 2 days ago | parent [-] | | I didn’t say anything about the computer fraud and abuse act. TOS are legally binding contracts in their own right if implemented correctly. |
| |
| ▲ | alpha_squared 2 days ago | parent | prev | next [-] | | > You're perfectly free to scrape the web yourself and train your own model. Actually, not anymore as a result of OpenAI and Anthropic's scraping. For example, Reddit came down hard on access to their APIs as a response to ChatGPT's release and the news that LLMs were built atop of scraping the open web. Most of the web today is not as open as before as a result of scraping for LLM data. So, no, no one is perfectly free to scrape the web anymore because open access is dying. | |
| ▲ | two_tasty 2 days ago | parent | prev | next [-] | | "...free to scrape the web yourself and train your own model." Yes, rich and poor are equally forbidden from sleeping under bridges. | | |
| ▲ | kspacewalk2 2 days ago | parent [-] | | Meaning what? The poor gets to sleep in the guest room of the rich guy's house because muh inequality? Anthropic paid a lot of money for a moat and want to guard it. It is not wrong, in any sense of the word, for them to do so. | | |
| ▲ | salawat 2 days ago | parent [-] | | Rich people aren't going to find themselves needing to sleep under a bridge, so the law really only exists as a constraint on the poor. Duh. The flex that "well a rich guy couldn't do it either" is A) at best a myopic misunderstanding perpetuated by out of touch people and B) hopelessly naive, because any punishment for the rich guy actually sleeping under a bridge is so laughably small it may as well not even exist. Hence, the whole bit of "a legal system to keep these accountable, but not for me". | | |
| ▲ | kspacewalk2 2 days ago | parent | next [-] | | Okay, you explained what Anatole France meant, which is probably helpful for those few who didn't get it from the quote itself. Perhaps now you can explain what on earth this has to do with Anthropic not wanting to let other for-profit businesses mooch off its investment of time, brainpower and money? | |
| ▲ | dpark 2 days ago | parent | prev [-] | | You explained what “rich and poor are equally forbidden from sleeping under bridges” means, but not what this has to do with the statement that one is free to do their own scraping and training, which I’m pretty sure is what kspacewalk was asking. |
| |
| ▲ | jtbayly 2 days ago | parent | prev | next [-] | | Wut? They did exactly the same thing! Try this: If you want to train a model, you’re free to write your own books and websites to feed into it. You’re not free to let others do that work for you because they don’t want you to, because it cost them a lot of time and money and secret sauce presumably filtering it for quality and other stuff. | | |
| ▲ | arcfour 2 days ago | parent [-] | | [flagged] | | |
| ▲ | buzzerbetrayed 2 days ago | parent [-] | | [flagged] | | |
| ▲ | jollymonATX 2 days ago | parent [-] | | Yeah these folks' skin is often very thin. One poke too hard and it's "whatever" and them scuttling off. Really hope there is a day they introspect. | | |
| ▲ | arcfour 2 days ago | parent [-] | | I introspect all the time. I just disagree with you so I have thin skin? Lol. I think it's transformative. I also think that it's a net positive for society. I lastly think that using freely available, public information is totally fair game. Piracy not so much, but it's water under the bridge. I hope you introspect some day, too, and realize it's acceptable for people to have different views than you. That's why I don't care; you aren't going to change my mind and I can't change yours either, so it's moot and I don't care to argue about it further. | | |
| ▲ | jollymonATX 2 days ago | parent [-] | | You had appeared to scuttle off but alas I was wrong (and sorry to imply you are a crab of some sort) however your comment followup on not changing minds might be a tad shell-ish. I'm open minded actually on the issue and these are major issues of our time. I'm personally impacted by this and it does make me wonder "will I write X thing again" and it is a very hard question to answer frankly. When you see your works presented in summary on search and a major decline in traffic you really do think about that. It impacts my ability to make money as I once did prior to 2024 (when it really hit) without doubt. Edit/spelling |
| |
| ▲ | airstrike 2 days ago | parent | prev | next [-] | | Guess who else spent a lot of time and money and secret sauce? Do you hear the words coming out of your mouth? | |
| ▲ | nunez 2 days ago | parent | prev | next [-] | | Lol; like heck we are. Try scraping the NYTimes at LLM scale. You can time how quickly you’ll get 420’ed or, at worst, hit with a C&D. | | | |
| ▲ | andersonpico 2 days ago | parent | prev | next [-] | | Your selective respect for work is a glaring double standard. The effort to produce the original content they scraped is orders of magnitude bigger than what it took to train the model, so if this wasn't enough to protect the authors from Anthropic it shouldn't be enough to protect Anthropic from people distilling their models. Your legal argument is all over the place as well. What is more relevant here: what the courts ruled or what you consider obvious? How is distillation less transformative than scraping? How does courts ruling that scraping to train models is legal relate to distillation? Nobody is scoring you on neutrality points for not using AI much and calling this doomerism is just a thought-terminating cliche that refuses to engage with the comment you're replying to. In fact, your comment is not engaging with anything at all, you're vaguely gesturing towards potential arguments without making them. If you find discussing this exhausting then don't but also don't flood the comments with low effort whining. | |
| ▲ | hax0ron3 2 days ago | parent | prev | next [-] | | It is transformative, but if I make a bunch of requests to their API and use the responses to distill my own model, that is also transformative. | |
| ▲ | loremium 2 days ago | parent | prev | next [-] | | reminds me of `don't look up` a bit. there clearly is an imbalance in regards to licenses with model providers, not even talking about knowledge extraction (yes younger people don't learn properly now, older generations forget) shortly before the rug-pull happens in form of accessibility to not rich people | |
| ▲ | unethical_ban 2 days ago | parent | prev [-] | | Let's talk ethics, not law. Why is it okay for these companies to pirate books and scrape the entire web and offer synthesized summaries of all of it, lowering traffic and revenue for countless websites and professions of experts, but it is not okay for others to try to do the same to an AI model? Is the work of others less valid than the work of a model? | | |
| ▲ | gruez 2 days ago | parent | next [-] | | >Why is it okay for these companies to pirate books Courts have ruled it's not, and I don't think anyone is arguing it's okay. >but it is not okay for others to try to do the same to an AI model? The steelman version is that it's okay to do it once you acquired the data somehow, but that doesn't mean anthropic can't set up roadblocks to frustrate you. | |
| ▲ | p1esk 2 days ago | parent | prev | next [-] | | I don’t see why it’s not ok to do that to an AI model. Or are you asking why they don’t want you to do it? | |
| ▲ | sfn42 2 days ago | parent | prev [-] | | I don't think anyone's saying it's not okay - I think the point is that Anthropic has every right to create safeguards against it if they want to - just like the people publishing other information are free to do the same. And everyone is free to consume all the free information. |
| |
| ▲ | olalonde 2 days ago | parent | prev | next [-] | | Also, begging to get "regulated": https://x.com/TheChiefNerd/status/2038565951268946021 | |
| ▲ | Andrex 2 days ago | parent | prev | next [-] | | I just rewatched that scene last night on YouTube. Maybe this is the universe telling me to watch the whole movie again... It's cool to see Noah Wyle getting his due these days (The Pitt). | |
| ▲ | decremental 2 days ago | parent | prev | next [-] | | [dead] | |
| ▲ | dpark 2 days ago | parent | prev [-] | | [flagged] | | |
| ▲ | petcat 2 days ago | parent | next [-] | | Not every lawsuit goes to court, or results in a decision. I'm sure you know that. | | |
| ▲ | dpark 2 days ago | parent [-] | | You should ask Claude what a lawsuit is. Or perhaps you mean “trial” and not “court”? | | |
| ▲ | cryptonector 2 days ago | parent | next [-] | | Not every lawsuit that is heard by a court goes to trial. | | | |
| ▲ | petcat 2 days ago | parent | prev [-] | | I feel like I'm talking to Claude right now. Am I? | | |
| ▲ | dpark 2 days ago | parent [-] | | This is the new “I can’t defend my statement online” retort, huh? “Well I might be wrong but at least I’m not AI like YOU!” | | |
| ▲ | petcat 2 days ago | parent | next [-] | | Ever heard the phrase "settled out of court"? A lot of lawsuits are settled even before a court clerk processes the paperwork. | | |
| ▲ | squeaky-clean 2 days ago | parent | next [-] | | Settled out of court does not mean the lawsuit never went to court. It means the settlement happened outside of court. Every lawsuit has to go to court, that's how you file a lawsuit. If it isn't sent to a court it's just words in a document. | |
| ▲ | dpark 2 days ago | parent | prev [-] | | A lawsuit with no paperwork filed is not a lawsuit. That’s just an agreement. Again, you seem to be conflating lawsuit with trial. |
| |
| ▲ | 2 days ago | parent | prev [-] | | [deleted] |
| |
| ▲ | v8xi 2 days ago | parent | prev [-] | | next you should explain idioms |
|
| ▲ | sheept 2 days ago | parent | prev | next [-] |
| It's not really paranoia if it's happening a lot. They wrote a blog post calling several major Chinese AI companies out for distillation.[0] Perhaps it is ironic, but it's within their rights to protect their business, like how they prohibit using Claude Code to make your own Claude Code.[1]
[0]: https://www.anthropic.com/news/detecting-and-preventing-dist...
[1]: https://news.ycombinator.com/item?id=46578701 |
| |
| ▲ | gmerc 2 days ago | parent | next [-] | | And conveniently left out they themselves distilled DeepSeek for Chinese content into their model.... | |
| ▲ | salawat 2 days ago | parent | prev [-] | | Their business shouldn't exist. It was predicated on non-permissive IP theft. They may have found a judge willing to cop to it not being so, but the rest of the public knows the real score. And most problematically for them, that means the subset of hackerdom that lives by tit-for-tat. One should beware of pissing off gray-hats. It's a surefire way to find yourself heading for bad times. |
|
| ▲ | jaccola 2 days ago | parent | prev | next [-] |
| I would say not all that ironic. Book publishers, Reddit, Stackoverflow, etc., tried their best to attract customers while not letting others steal their work. Now Anthropic is doing the same. Unfortunately (for the publishers, at least) it didn't work to stop Anthropic and Anthropic's attempts to prevent others will not work either; there has been much distillation already. The problem of letting humans read your work but not bots is just impossible to solve perfectly. The more you restrict bots, the more you end up restricting humans, and those humans will go use a competitor when they become pissed off. |
| |
| ▲ | brookst 2 days ago | parent [-] | | It's really just tech culture like HN that obsesses over solving problems perfectly. From seat belts to DRM to deodorant, most of the world is satisfied with mitigating problems. |
|
| ▲ | johnfn 2 days ago | parent | prev | next [-] |
| It is absolutely not paranoia. People are distilling Claude code all the time. |
|
| ▲ | spiderfarmer 2 days ago | parent | prev [-] |
| That isn't irony, it's hypocrisy. |
| |
| ▲ | snapcaster 2 days ago | parent | next [-] | | No it isn't. It's a competition, making moves that benefit you and attempting to deprive your opponent of the same move is just called competing | | |
| ▲ | brookst 2 days ago | parent | next [-] | | Wait, are you saying that it's not hypocritical for my chess opponent to try to protect their king while trying to kill mine? :mind-blown: Tech people are funny, with these takes that businesses do/should adhere to absolute platonic ideals and follow them blindly regardless of context. | | |
| ▲ | salawat 2 days ago | parent [-] | | No, it's ethical people pointing out that if you toss aside ethics for success at all costs, you aren't going to find any sympathy when people start doing the same thing back to you. Live by the sword, die by the sword, as they say. There is a reason we don't do things. That reason is it makes the world a worse place for everyone. If you are so incredibly out of touch with any semblance of ethics at all; mayhaps you are just a little bit part of the problem. | | |
| ▲ | brookst 2 days ago | parent [-] | | The funny thing about ethics is there is no absolute, which makes some people uncomfortable. Is it ethical to slice someone with a knife? Does it depend if you're a surgeon or not? Absolutism + reductionism leads to this kind of nonsense. It is possible that people can disagree about (re)use of culture, including music and print. Therefore it is possible for nuance and context to matter. Life is a lot easier if you subscribe to a "anyone who disagrees with me on any topic must have no ethics whatsoever and is a BAD person." But it's really not an especially mature worldview. | | |
| ▲ | salawat 2 days ago | parent [-] | | Categorical imperative and Golden Rule, or as you may know it from game theory "tit-for-tat" says "hi". The beautiful thing about ethics is that we philosophers intentionally teach it descriptively, but encourage one to choose their own based on context invariance. What this does is create an effective litmus test for detecting shitty people/behavior. You grasping on for dear life to "there's no absolutes" is an act of self-soothing on your own part as you're trying to rationalize your own behavior to provide an ego crumple zone. I, on the other hand, don't intend to leave you that option. That you're having to do it is a neon sign of your own unethicality in this matter. We get to have nice things when people moderate themselves (we tolerate eventual free access to everything as long as the people who don't want to pay for it don't go and try to replace us economically at scale). When people abuse that, (scrape the Internet, try to sell work product in a way that jeopardizes the environment we create in) the nice thing starts going away, and you've made the world worse. Welcome to life bucko. Stop being a shitty person and get with the program so we have something to leave behind that has a chance of not making us villains in the eyes of those we eventually leave behind. The trick is doing things the harder way because it's the right way to do it. Not doing it the wrong way because you're pretty sure you can get away with it. But you're already ethically compromised, so I don't really expect this to do any good except to maybe make the part of you you pointedly ignore start to stir assuming you haven't completely given yourself up to a life of ne'er-do-wellry. Enjoy the enantiodromia. Failing that, karma's a bitch. | | |
| ▲ | spiderfarmer a day ago | parent [-] | | Whenever I see someone on HN preaching about how it's all dog-eat-dog and zero-sum, I imagine them being lonely. No real friends, no trusted life partner, no kids, no unconditional love. Alone. Just another soul traveling on an infinite road with lots of signs that point to "happiness," planted there by fellow travelers, never reaching their destination. |
| |
| ▲ | vor_ 2 days ago | parent | prev [-] | | It's definitely still hypocrisy. | | |
| ▲ | snapcaster 20 hours ago | parent [-] | | No it isn't, have you never played any game or sport? do you fundamentally not understand the concept of competition? |
| |
| ▲ | keybored 2 days ago | parent | prev | next [-] | | The Golden Horde didn’t want opponents to conquer their territory. An irony if you think about it— | |
| ▲ | croes 2 days ago | parent | prev [-] | | That’s capitalism | | |
| ▲ | dmix 2 days ago | parent | next [-] | | As opposed to the rent-seeking copyright industry where 1% goes to the original creators if you're lucky. | | |
| ▲ | jitl 2 days ago | parent [-] | | That’s capitalism too | | |
| ▲ | dmix 2 days ago | parent [-] | | Technically state-capitalism since it's an industry created as a result of congress regulating commerce with aggressive IP laws (aka rent-seeking) | | |
| ▲ | brookst 2 days ago | parent [-] | | Where can I see an example of any other kind of capitalism? | | |
| ▲ | dmix a day ago | parent [-] | | Capitalism is always underpinned by a strong legal system which is why most criticism is about constraining growth in legislation, not killing off interference outright. Copyright law is a good example of a law that made sense in its original form but turned into a monster with scope-creep. Although, if we're being realpolitik, every time government interference grows in scope and corrupts markets, capitalism still gets blamed and people call for more government to fix it (see: housing). So the capitalism vs state capitalism distinction isn't very meaningful in practice. |
| |
| ▲ | satvikpendem 2 days ago | parent | prev [-] | | As opposed to what economic system that doesn't do this? |
|