Remix.run Logo
mojosam 6 hours ago

> Someone with a subscription logs into the site, then archives it.

That’s not the case. I don’t have a NYT subscription, I just Googled for an old obscure article from 1989 on pork bellies I thought would be unlikely for archive.today to have cached, and sure enough when I asked to retrieve that article, it didn’t have it and began the caching process. A few minutes later, it came up with the webpage, which if you visit on archive.is, you can see it was first cached just a few minutes ago.

https://www.nytimes.com/1989/11/01/business/futures-options-...

My assumption has been that the NYT is letting them around the paywall, much like the unrelated Wayback Machine. How else could this be working? Only way I could think it could work is that either they have access to a NYT account and are caching using that — something I suspect the NYT would notice and shutdown — or there is a documented hole in the paywall they are exploiting (but not the Wayback Machine, since the caching process shows they are pulling direct from the NYT).