Micanthus 2 hours ago

The page specifically says it's okay for bots to scrape Anna's Archive; she just asks that they download in bulk so the servers aren't overloaded:

"""

> We are a non-profit project with two goals:

> 1. Preservation: Backing up all knowledge and culture of humanity.

> 2. Access: Making this knowledge and culture available to anyone in the world (including robots!).

[. . .]

  * Our website has CAPTCHAs to prevent machines from overloading our resources, but all our data can be downloaded in bulk:

    * All our HTML pages (and all our other code) can be found in our [GitLab repository](https://software.annas-archive.gl/).

    * All our metadata and full files can be downloaded from our [Torrents page](/torrents), particularly `aa_derived_mirror_metadata`.

    * All our torrents can be programmatically downloaded from our [Torrents JSON API](https://annas-archive.gl/dyn/torrents.json).
"""
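
For anyone who wants to go the bulk route, here is a rough Python sketch of pulling the torrent list in one request instead of crawling pages. The URL is the one quoted above. Everything else is my assumption: the quote doesn't describe the response schema, so the code assumes a top-level JSON array and only counts which keys show up rather than guessing field names. The function names are made up for illustration.

```python
import json
import urllib.request
from collections import Counter

# URL copied from the quoted page; the response schema is NOT described there.
TORRENTS_JSON_URL = "https://annas-archive.gl/dyn/torrents.json"


def parse_torrent_list(raw: bytes) -> list:
    """Decode a response body. Assumption: the top level is a JSON array."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return data


def key_counts(entries: list) -> Counter:
    """Count how often each key appears, to discover the schema instead of guessing it."""
    counts = Counter()
    for entry in entries:
        if isinstance(entry, dict):
            counts.update(entry.keys())
    return counts


def fetch_torrent_list(url: str = TORRENTS_JSON_URL, timeout: float = 30.0) -> list:
    """Fetch the whole list with a single request (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_torrent_list(resp.read())
```

Calling `key_counts(fetch_torrent_list())` once should be enough to see what fields each entry actually has before writing anything that depends on them.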