jdidrirjrjo 2 hours ago

> We backed up Spotify (metadata and music files) ....(~300TB),

https://annas-archive.gl/blog/backing-up-spotify.html

But it is not ok to scrape our data!

Micanthus 2 hours ago | parent | next [-]

The page specifically says it's okay for bots to scrape from Anna's Archive, she just asks they do it in bulk to not overload the servers:

"""

> We are a non-profit project with two goals:

> 1. Preservation: Backing up all knowledge and culture of humanity.

> 2. Access: Making this knowledge and culture available to anyone in the world (including robots!).

[. . .]

  * Our website has CAPTCHAs to prevent machines from overloading our resources, but all our data can be downloaded in bulk:

  * All our HTML pages (and all our other code) can be found in our [GitLab repository](https://software.annas-archive.gl/).

  * All our metadata and full files can be downloaded from our [Torrents page](/torrents), particularly `aa_derived_mirror_metadata`.
  
  * All our torrents can be programmatically downloaded from our [Torrents JSON API](https://annas-archive.gl/dyn/torrents.json).
"""
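For anyone wanting to go the bulk route, here is a rough sketch in Python of reading that JSON listing with a single request instead of crawling pages. Only the URL and the `aa_derived_mirror_metadata` name come from the quoted page. The function names, the assumption that the response is a JSON list of objects, and the `display_name` field are hypothetical and should be checked against the real response.

```python
import json
from urllib.request import urlopen

# URL taken from the quoted page above.
TORRENTS_JSON_URL = "https://annas-archive.gl/dyn/torrents.json"


def fetch_torrent_listing(url=TORRENTS_JSON_URL, timeout=30):
    """Download and parse the listing with one request (no page crawling)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def select_entries(entries, name_field="display_name",
                   contains="aa_derived_mirror_metadata"):
    """Keep entries whose name field contains the given substring.

    ASSUMPTION: `entries` is a list of dicts and `name_field` exists in them.
    The field name is hypothetical; inspect the real JSON before relying on it.
    """
    return [e for e in entries if contains in str(e.get(name_field, ""))]
```

Calling `select_entries(fetch_torrent_listing())` would then return just the metadata torrents, assuming the field name matches. The torrents themselves are fetched with an ordinary BitTorrent client.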
the_af 2 hours ago | parent | prev | next [-]

> But it is not ok to scrape our data!

They want people and LLMs to download their data, which is why they point to the more efficient ways of doing so. They are not blocking access to the data, they just reroute it.

If you're going to create a last-minute account to criticize something, it pays to at least read what you're criticizing.

_ink_ 2 hours ago | parent | prev [-]

I mean, if Spotify provided a nice way to download their music (which they also pirated back in the day, when they had an idea but no money), Anna's Archive would not need to scrape it.

jdidrirjrjo 2 hours ago | parent [-]

[flagged]

petu 2 hours ago | parent | next [-]

It's digital copies, no real damage is done.

AA asks you not to scrape them because of server load, and provides torrents so you can download everything in a more efficient manner.

the_af 2 hours ago | parent | prev [-]

Unlike Spotify, AA is a nonprofit. It's more urgent for them to prevent costly extraction of data. Spotify can do this too, if they so wanted.

It's not about consent; obviously AA is infringing.

BTW, why did you create a last-minute account just to criticize AA?