bayindirh 3 days ago

I recently found out that snapshots have a "why?" field. The archiver might not be the Internet Archive itself; it could be Common Crawl, Archive Team, etc. pushing your site to the Internet Archive.

Look at the reason, and get mad at the right people.

It might be the Internet Archive itself, but check to be sure.

muppetman 3 days ago

Thanks - wasn't aware. (why: certificate-transparency, open-research-datasets, webwidecrawl)

I still don't fathom why they just _ignore_ the request not to be scraped with the above headers. It's rude.