| ▲ | TFNA 6 hours ago |
| I’m a researcher who for years has been scanning my library’s holdings on my particular discipline for my own use, but also uploading the books to the shadow libraries for everyone else’s benefit. The revelation that LLMs are training on the shadow libraries has made me put a lot more effort into ensuring my scans are well-OCRed. The idea that I could eventually ask ChatGPT or whatever about obscure things in my field, and get useful output (of the "trust but verify" sort), is exciting. |
|
| ▲ | BrenBarn 5 hours ago | parent | next [-] |
| How about the idea that you might have to eventually pay an AI company a large amount of money to ask ChatGPT such a question, while the library itself has lost funding? |
| |
| ▲ | BugsJustFindMe 4 hours ago | parent | next [-] | | Library funding is a political stance that has only imaginary connection to whether people pay to ask things of ChatGPT. People can pay to talk to an AI and also government can fund libraries. | | |
| ▲ | bakugo 2 hours ago | parent [-] | | Do you believe it makes sense for the government to fund libraries that almost nobody uses because they'd rather ask ChatGPT? | | |
| ▲ | indigo945 an hour ago | parent [-] | | People are already not using libraries because they'd rather rot their brains on TikTok than read a book. (Also, for information lookup, the internet and search engines exist, and have for a while now.) This has no actual causal relation. |
|
| |
| ▲ | TFNA 5 hours ago | parent | prev | next [-] | | Some people might have to pay a large amount of money to ask a commercial LLM, but advances in this space mean that if I have the data myself on my own computer, or can download it from a shadow library, I might eventually be able to ask everything locally for free. > while the library itself has lost funding Libraries are inherent parts of universities. While their precise role evolves, do you think that they will just be done away with? Already a substantial amount of scholarship in disciplines other than my own has moved online (legally), and the library is still there. | |
| ▲ | 4 hours ago | parent | prev | next [-] | | [deleted] | |
| ▲ | spoaceman7777 4 hours ago | parent | prev | next [-] | | Free, downloadable AI models have consistently caught up to ChatGPT within 3 months, for almost a year now. I highly encourage you to go and update your priors. | | |
| ▲ | roygbiv2 an hour ago | parent [-] | | And how much does the hardware cost to run said models? | | |
| ▲ | dboreham 31 minutes ago | parent | next [-] | | You can run them slowly on any machine that has enough memory. | |
| ▲ | fragmede 24 minutes ago | parent | prev [-] | | How good do you want it to be? For something close to ChatGPT today (April, 2026), you're still looking at a system with 7xH200+chassis, which will run you $300k, or a GB200 NVL72, which is $2-3 million. OTOH, a Qwen3.6 quantized model can be run on $10,000 (high end Mac) or $1,000 (Mac mini) worth of hardware. Even a Pixel 10 Pro cellphone ($1,000) can run useful models locally. |
|
| |
| ▲ | woctordho 3 hours ago | parent | prev | next [-] | | A digital library needs almost no funding. With today's decentralized networking infrastructure such as BitTorrent and IPFS I bet it just exists forever. | | |
| ▲ | x-complexity 3 hours ago | parent | next [-] | | > A digital library needs almost no funding. Clarification: Maintaining the library still requires resources & effort. It only appears to need no funding because the donors of said (disk space / bandwidth / dev effort) are subsidizing it in aid of a goal they believe in (i.e. the church model). | |
| ▲ | tardedmeme 3 hours ago | parent | prev [-] | | How much of Anna's Archive are you seeding? | | |
| |
| ▲ | protocolture 3 hours ago | parent | prev | next [-] | | How about the idea that one day you might be paying a subscription to use a service while non sequitur. | |
| ▲ | locknitpicker 4 hours ago | parent | prev [-] | | > How about the idea that you might have to eventually pay an AI company a large amount of money to ask ChatGPT such a question, while the library itself has lost funding? There are plenty of free models with RAG support. Why do you believe everything starts and ends with a major corporation charging a subscription? |
|
|
| ▲ | altmanaltman 4 hours ago | parent | prev | next [-] |
| How is any of that legal? Can you just take books from the library and then scan and upload digital copies? How do you deal with the ethics of this personally, stealing to make it easier for AI to steal so AI gets better? Does calling yourself a "researcher" make you feel like it's actually something worthwhile you're doing? |
| |
| ▲ | x-complexity 3 hours ago | parent | next [-] | | > How do you deal with the ethics of this personally, stealing to make it easier for AI to steal so AI gets better? If the obscure book/text is permanently lost forever under your stringent advice of "no stealing under any circumstances", would the "stealing" have saved it? If so, is it ethical to prevent others from accessing the book/text, under your guise of "preventing stealing"? | |
| ▲ | GaryBluto 3 hours ago | parent | prev | next [-] | | > How do you deal with the ethics of this personally, stealing to make it easier for AI to steal so AI gets better? By quoting your comment in my reply, have I "stolen" your comment? | | |
| ▲ | fragmede 2 hours ago | parent [-] | | By reading this comment you have entered into a legal contract, by which you owe me $5. Failure to pay will be reported to the Internet police. |
| |
| ▲ | granabluto an hour ago | parent | prev | next [-] | | First, it's called infringement, not stealing. It's a custom defined term in a custom defined law. Second, it is totally legal to read the book in a public library, for free, right now. Third, laws can change. Current copyright law was pushed by one company (Disney) to +90 years, to their benefit, and can be redesigned/pushed back by AI companies, for their benefit. A 2 year copyright duration sounds like a good compromise. | |
| ▲ | TFNA 4 hours ago | parent | prev | next [-] | | As a researcher, the main worthwhile thing that I am doing is publishing research, but having all this prior scholarship at hand 24/7 definitely makes it easier to produce said publications. And if I have created a scan, why not help out my colleagues, too? "Deal with the ethics", seriously? You might want to learn about how heavily shadow libraries are used across academia now. It’s no longer just disadvantaged scholars in the developing world relying on pirated scans because they don’t have good libraries. It’s increasingly everyone everywhere, because today’s shadow libraries can be faster and more convenient than even one’s own institution’s holdings. At conferences, if the presenter mentions a particularly interesting publication, you can sometimes watch several people in the room immediately open LibGen or Anna’s Archive on their laptop to download it right there and then. | | |
| ▲ | SomaticPirate 4 hours ago | parent [-] | | I know researchers want their work to be as widely viewed as possible and I understand that. But I have friends who used to self publish some small esoteric fiction. This commonplace theft has basically made them stop their work because the investment of their time is better in literally any other area. Thankfully though, tools like Grok and other LLMs will allow us to create slop fiction (/s) | | |
| ▲ | vidarh 3 hours ago | parent | next [-] | | The vast majority of writers do not recoup their investment, not due to piracy but due to a massive glut of works available. I've published a couple of novels. They've sold far better than average, and yet not sold enough to be remotely worth it if I did it for the money. Piracy might have made a tiny dent, but the many millions of competing novels matter far more. Anyone who has self published will have experienced that it is hard to even get people to read (as opposed to just download to hoard) your work even for free. It's more comfortable to blame piracy, though. | | | |
| ▲ | reacweb 3 hours ago | parent | prev | next [-] | | I think the current intellectual property system is flawed. Books are knowledge, and we shouldn't be able to limit the spread of knowledge. I imagine that books could be sold at the cost of printing, and there could be a QR code inside so that readers could freely donate money to the author if they enjoyed the book. Strangely enough, I imagine that with such a system, authors would be better paid. | |
| ▲ | vintagedave an hour ago | parent | prev [-] | | > But I have friends who used to self publish some small esoteric fiction. This commonplace theft has basically made them stop If you're writing for money, maybe. If you're writing for the love of writing, it won't. More, you hear of authors who encourage their books to be made available without DRM, who know or silently encourage their books to end up on torrent / library sites. They want their books to be read. |
|
| |
| ▲ | subscribed an hour ago | parent | prev | next [-] | | It's not stealing, it's uploading without the licence. Laws in many countries allow for the lawful download of such books, regardless of how they were uploaded. Separately, laws aren't always sensible or right - slavery was legal, child marriage was legal, not paying taxes on billions of profits is legal while not paying taxes of £1000 is illegal, reporting Jews to Nazis was mandatory, etc, etc. | |
| ▲ | tardedmeme 3 hours ago | parent | prev | next [-] | | AI training is legal because the supreme court said so. | |
| ▲ | woctordho 3 hours ago | parent | prev | next [-] | | Copyright is a property right, and a property right is what we call a bourgeois legal right. It will cease to exist as productive forces like AI develop. | |
| ▲ | felooboolooomba 3 hours ago | parent | prev | next [-] | | > How is any of that legal? He didn't mention legality. The world is rigged, as you can see by a head of state participating in both the running and the cover-up of history's largest CSE. Watch what people are doing in addition to what they are saying. I for one am tremendously thankful for TFNA's efforts, since I get access to knowledge that I wouldn't have been able to before. | |
| ▲ | __alexs 3 hours ago | parent | prev | next [-] | | You can't steal information, don't be silly. You can just not have permission to copy it. Oh no. | |
| ▲ | redsocksfan45 13 minutes ago | parent | prev [-] | | [dead] |
|
|
| ▲ | Papazsazsa 4 hours ago | parent | prev | next [-] |
| [flagged] |
| |
| ▲ | red75prime 2 hours ago | parent | next [-] | | The ridiculously long "70 years after the author's death" makes it highly problematic in many cases. | |
| ▲ | TFNA 4 hours ago | parent | prev | next [-] | | Of course not, and many authors are already long dead. But if you knew anything about academic publishing, the authors almost invariably are happy to see their work out there freely available. It’s not as if they make any money from it, and the more eyes on their work, the better their chances of getting cited and thereby furthering their careers. It is some publishers who would object on copyright grounds. But I get the sense that some publishers are already becoming resigned to the fact that most of their new ebook releases are ending up on the shadow libraries within only a few weeks, and Anna’s Archive has become the first place to look (even before one looks at whether one’s own institutional library has the book) for researchers around the world. | | | |
| ▲ | ddtaylor 4 hours ago | parent | prev | next [-] | | Why assume people lock knowledge in a box and charge for access? | |
| ▲ | 4 hours ago | parent | prev | next [-] | | [deleted] | |
| ▲ | nullsanity 4 hours ago | parent | prev [-] | | [dead] |
|
|
| ▲ | emsign 3 hours ago | parent | prev [-] |
| That's a slave mentality. You are aware that OpenAI charges money for other people's work and intelligence, right? Your own and that of other volunteer pirates and of the original authors as well. I don't get people like you at all. |
| |
| ▲ | TFNA 3 hours ago | parent | next [-] | | I’ve already posted in this thread about how even if OpenAI charges money for its LLM trained on the literature, that doesn’t change the fact that the literature remains available to everyone through the shadow libraries, and advances in AI mean that one can increasingly work with it locally on one’s own computer. | |
| ▲ | __alexs 3 hours ago | parent | prev [-] | | Open weight models exist and are critical to us avoiding a future where you have to pay sama a slice of every engineer's salary. |
|