WaltPurvis 3 days ago

http://archive.today/SqPCL

jmkni 2 days ago | parent [-]

It is a bit ironic that a paywalled article like this will have a top-level comment with the archive link, which can then be easily scraped by AI (along with the comments).

ec109685 2 days ago | parent | next [-]

Also interesting how sites like this are mainstream whereas a link to a site hosting an mp3 of pirated music wouldn’t be tolerated in discussion forums like this.

I think a big difference is that there are no microtransactions or compulsory licensing for content, so it always feels patently unfair to buy a subscription to read one article.

yencabulator 2 days ago | parent [-]

I'd argue it's more that the RIAA has historically been much more aggressive about suing than newspapers or magazines have.

ec109685 2 days ago | parent [-]

True. I think it has ended up a net good. People make a living on music, and licensed music is everywhere.

orbisvicis 2 days ago | parent | prev | next [-]

Kinda hard to discuss the news when your members can't read the news.

JacobKfromIRC 2 days ago | parent | prev | next [-]

In this case, it also seems like the paywall doesn't show up if you have JavaScript disabled, which I find strange, but I think lots of news sites are like that.

euroderf 2 days ago | parent | prev | next [-]

Related: Has anyone trained an LLM strictly on HN comments and linked-to articles? I for one would get a kick out of interrogating it.

tenuousemphasis 2 days ago | parent | prev [-]

It's not ironic at all. The only reason the anti-paywall sites work is that the news companies in fact want some scrapers reading the full article.

mschuster91 2 days ago | parent | next [-]

Actually, the team behind archive dot today has premium accounts on at least spiegel.de, I presume bought with anonymous credit cards.

You can see artifacts when their servers are under queue load and the URLs are visible: a few resources have the JWT with the account details in the URL. IIRC the real name on the account in the token is Masha Rabinovich, with an email account masha@dns.li, an identity that has cropped up in various investigations [1][2].

[1] https://gyrovague.com/2023/08/05/archive-today-on-the-trail-...

[2] https://webapps.stackexchange.com/questions/145817/who-owns-...
