Frieren 6 hours ago

> The LLM companies are not picking on me in particular, they are pounding every site on the net.

Why isn't this a criminal offense? They are hurting businesses for profit (or for a higher valuation, as they probably have no profit at all).

Why are corporations allowed to do with impunity what could land even a teenager years in prison? Is there no rule of law anymore?

The five-year and ten-year penalties kick in only when the government can show the offense caused at least $5,000 in losses across all victims during a one-year period. https://legalclarity.org/what-are-the-punishments-for-a-ddos...

maplethorpe 5 hours ago | parent | next [-]

> Why are corporations allowed to do with impunity what could land even a teenager years in prison? Is there no rule of law anymore?

Those laws are intended to protect corporations. If corporations are the ones doing the scraping, it doesn't make sense for the same laws to affect them.

budududuroiu 6 hours ago | parent | prev | next [-]

Normative vs prerogative state [1]. Compare US v. Swartz with Meta's use of LibGen for Llama.

[1] https://en.wikipedia.org/wiki/Dual_state_(model)

dannyobrien 5 hours ago | parent [-]

So, I knew Aaron and I definitely would not presume to predict what he would have thought, but I’d point out there is a sizeable state space where he should never have been prosecuted, and scraping by others, including large commercial companies, should not be prosecutable on the same grounds.

I repeat what Aaron’s friends and lawyers said at the time: we were going to fight that case, and we were going to win.

janmo 2 hours ago | parent | prev | next [-]

His robots.txt explicitly allows bots including LLM bots to scrape his site

tempest_ 6 hours ago | parent | prev | next [-]

Because might makes right and any entity with the power to legally put up a fight is in on the game (or wants to be)

heavyset_go 6 hours ago | parent | prev | next [-]

We've already established that computer crime and IP laws apply to normies and not tech companies

legohead 5 hours ago | parent | prev | next [-]

adapt or die

waiting on the govt to do something is a path of failure

spiderfarmer 5 hours ago | parent | prev | next [-]

I have added a DB replica server just to keep my website from succumbing to AI bot traffic.

will4274 5 hours ago | parent | prev | next [-]

It's a bit more like a physical business with a "public welcome" policy like a coffee shop going viral and then having tens of thousands of people walking in and taking pictures but not buying coffee. It's disruptive, but not illegal.

Acme.com is welcome to require authentication for all pages but their home page, which would quickly cause the traffic to drop. They don't want to do this: like the coffee shop, they want to be open to the public, and for good reasons.

Sometimes the use profile changes dramatically in a short time. 15 years ago, Netflix created the video streaming market and shared bandwidth capacity that had been excessive before wasn't enough. 15 years before that, Google did the same thing when they created search and started driving tremendous traffic to text based websites which had spread through word of mouth before.

Turns out the micro transaction people probably had the right idea.

qmarchi 2 hours ago | parent [-]

Depends on the country. In Japan, you could be considered a "public nuisance" and be tossed behind bars for a bit.

devmor 4 hours ago | parent | prev | next [-]

Because they have more money.

I've had to deploy a combination of Cloudflare's bot protection and Anubis on over 200 domains across 8 different hosting environments in the last 2 months. I have small business clients that couldn't access their sales and support platforms because their websites, which normally see tens of thousands of unique sessions per day, are suddenly seeing over a million in an hour.

Anthropic and OpenAI were responsible for over 70% of that traffic.

danaris 3 hours ago | parent | prev | next [-]

> Is there no rule of law anymore?

Have you not been paying attention to the news for the past few years?

No, there isn't. If there were, Trump would be in prison, not the Oval Office. And he and the Republican Party have deliberately fostered this environment of corruption and rule-by-wealth so that they can gain more power and even more wealth.

And now they are also backing the AI zealots, and techbros more generally, to ensure that they can do whatever the hell they want, damn the consequences to the rest of the world.

avazhi 5 hours ago | parent | prev | next [-]

Is what an offence lol? Bot scraper traffic?

How do you think search engines work?

mrweasel 3 hours ago | parent | next [-]

Search engines appear to care more about being good "Netizens". It's not like GoogleBot never crashed a site, but it's rare. Search engine bots check whether they need to back off for a bit, they check ETags, and they notice when a page changes infrequently and reduce their crawl frequency.

If you train an LLM, you don't keep a copy of every page around, so there is nothing to check against when deciding whether to re-scrape a page. You just re-scrape it, because you store nothing.

Personally I think people would be pretty indifferent to the new generation of scrapers, AI or other types, if they at least behaved and slowed down if they notice a site struggling. If they had the slightest bit of respect for others on the web, this wouldn't be an issue.
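For illustration, the "check ETags and back off" behavior described above could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the function names, the delay multipliers, and the minimum and maximum delays are all made up for the example and do not describe how any real crawler is implemented.

```python
from typing import Dict, Optional


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers that let the server answer 304 Not Modified.

    The values are whatever the crawler stored from its previous visit.
    """
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def next_delay(status: int,
               current_delay: float,
               retry_after: Optional[str] = None,
               min_delay: float = 1.0,
               max_delay: float = 3600.0) -> float:
    """Pick the wait (in seconds) before revisiting this site.

    All thresholds and multipliers here are illustrative assumptions.
    """
    if status in (429, 503):
        # The site is overloaded or rate limiting us: back off.
        # Only the integer-seconds form of Retry-After is handled here;
        # the header may also be an HTTP date, which this sketch ignores.
        if retry_after is not None and retry_after.isdigit():
            return min(max(float(retry_after), current_delay), max_delay)
        return min(current_delay * 2, max_delay)
    if status == 304:
        # Page unchanged since last visit: come back less often.
        return min(current_delay * 1.5, max_delay)
    if status == 200:
        # Page changed: it is reasonable to check again somewhat sooner.
        return max(current_delay / 2, min_delay)
    # Anything else: keep the current pace.
    return current_delay
```

A crawler that stores the ETag from its last visit sends it back via `conditional_headers`, and the server can then reply with a tiny 304 instead of the full page. A scraper that stores nothing has no ETag to send, which is the point made above.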

spiderfarmer 5 hours ago | parent | prev [-]

They work because they offer ways to opt out: they honor crawl delay, let you set ideal scraping times, support IndexNow, etc.

And they give you real, valuable traffic in return.
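As a rough sketch of what honoring opt-outs and crawl delay looks like from the crawler's side, Python's standard library ships a robots.txt parser. The robots.txt content and the `ExampleBot` user agent below are invented for the example (they are not any real site's rules), and `Crawl-delay` is a non-standard directive that not every crawler respects.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, made up for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow:
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A well-behaved crawler asks before fetching...
allowed = rules.can_fetch("ExampleBot", "https://example.com/index.html")
blocked = rules.can_fetch("ExampleBot", "https://example.com/private/data")

# ...and waits at least this many seconds between requests, if a delay is set.
delay = rules.crawl_delay("ExampleBot")
```

None of this is enforced by the server. It only works when the crawler chooses to run checks like these before each request.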

reddozen 5 hours ago | parent | prev [-]

Because the law deals with intent. The intent of a 12-year-old skiddie with a DDoS box is to harm someone else's internet. The intent of big scrapers is to collect data. If you want to make the latter illegal, then vote for that instead of loading it with the normative baggage of the former.

It's the same problem as why Occupy Wall Street fell apart: a bunch of losers who don't understand the system screech about the system. Because they don't understand it, they can't offer any meaningful dialogue about how to fix it beyond screeching.