sethops1 19 hours ago

I have a site with a complete and accurate sitemap.xml describing when its ~6k pages are last updated (on average, maybe weekly or monthly). What do the bots do? They scrape every page continuously 24/7, because of course they do. The amount of waste going into this AI craze is just obscene. It's not even good content.
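For reference, a sitemap entry of the kind described advertises the last-modified date per URL via the standard `<lastmod>` element (the URL and date below are invented placeholders), so a well-behaved crawler never needs to re-fetch an unchanged page:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/some-page</loc>
    <lastmod>2024-05-01</lastmod>
    <changefreq>monthly</changefreq>
  </url>
</urlset>
```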

thisislife2 12 hours ago | parent | next [-]

If you are in the US, have you considered suing them for robots.txt / copyright violations? AI companies are currently flush with VC cash, and there may be a few big law firms willing to fight a lawsuit against them on your behalf. AI companies have already lost some copyright cases.

happymellon 10 hours ago | parent [-]

Based on traffic patterns you could tell whether an IP or request structure is coming from a bot, but how would you reliably tell which company is DDoSing you?

chrismorgan 9 hours ago | parent [-]

It should be at least theoretically possible: each IP address belongs to a routing prefix assigned to an organisation, and you can look that assignment up easily. The organisation should have some sort of abuse channel, or at the very least a legal system should be able to compel it to cooperate and give up the information it's required to have.

n1xis10t 19 hours ago | parent | prev [-]

It would be interesting if someone made a map depicting the locations of the IP addresses sending so many requests, maybe over the course of a day.

giantrobot 17 hours ago | parent [-]

Maps That Are Just Datacenters