thisislife2 2 hours ago
The only solution is regulation. If all content created by anyone is copyrighted, how does an implicit opt-in for scraping (which is what you get if you don't create a robots.txt file for your website) make any sense? Moreover, even if you have a robots.txt, AI (or whatever) bots often don't respect it, or they use workarounds: they outsource scraping of such "restricted" sites to unethical third parties to get the data (Meta has even resorted to piracy, openly!). So clearly, the logic and the "honour system" have failed.

Cloudflare, Google Captcha, HCaptcha etc. are all shitty technical solutions because, as we are all discovering, they come at the cost of our privacy (i.e. these services may monetise our personal data) and / or our computing resources and time.

If current copyright laws aren't sufficient to prevent this, we have to acknowledge the system is broken. The answer could be enhancing it with some kind of Digital Millennium Copyright Act (DMCA)-like laws, but in favour of the creators against BigTech or rogue actors.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law
- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...
oceanplexian 2 hours ago
Or you could let information be free, at least the stuff that's on the public net. As for issues like bots overloading websites or using too many resources, scaling laws will take care of it quickly. It's not like you can't serve thousands of RPS from a Raspberry Pi these days.
ImPostingOnHN 2 hours ago
I don't think regulation will stop web scraping, not least because it can be done from locations outside the jurisdiction of the regulations.

> we have to acknowledge the system is broken

The system is broken. It probably takes, what, 10 seconds or less to use a residential or foreign proxy, and 6+ months to internationally track and prosecute a single offender? So that's like a million times more effort going the regulatory route.