n1xis10t 2 days ago

Nice! Reminds me of “Piracy as Proof of Personhood”. If you want to read that one go to Paged Out magazine (at https://pagedout.institute/ ), navigate to issue #7, and flip to page 9.

I wonder if this will start making porn websites rank higher in google if it catches on…

Have you tested it with the Lynx web browser? I bet all the links would show up if a user used it.

Oh also couldn’t AI scrapers just start impersonating Googlebot and Bingbot if this caught on and they got wind of it?

Hey I wonder if there is some situation where negative SEO would be a good tactic. Generally though I think if you wanted something to stay hidden it just shouldn’t be on a public web server.

owl57 18 hours ago

> Hey I wonder if there is some situation where negative SEO would be a good tactic. Generally though I think if you wanted something to stay hidden it just shouldn’t be on a public web server.

At least once upon a time there was a pirate textbook library that used HTTP basic auth with a prompt that made the password really easy to guess. I suppose the main goal was to keep crawlers out even if they don't obey robots.txt, and at the same time be as easy for humans as possible.
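For anyone curious what that looks like on the wire, here is a minimal sketch, assuming a made-up realm hint, username, and password (none of these come from the actual site, which isn't named here). The realm text in the `WWW-Authenticate` header is what the browser shows in its login prompt, so a human can read the hint while a crawler that ignores robots.txt just gets a 401.

```python
import base64
import hmac

# Hypothetical values for illustration only; the real site's prompt is not known.
REALM_HINT = "Password is the word 'library'"
EXPECTED_USER = "reader"
EXPECTED_PASSWORD = "library"


def challenge_headers():
    """Headers for a 401 response; browsers display the realm text in the prompt."""
    return {"WWW-Authenticate": f'Basic realm="{REALM_HINT}"'}


def is_authorized(authorization_header):
    """Return True if the Authorization header carries the expected Basic credentials."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Malformed base64 or non-UTF-8 payload: treat as unauthenticated.
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparison, even though the password is meant to be guessable.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    pass_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and pass_ok
```

A server would send `challenge_headers()` with a 401 whenever `is_authorized(...)` is false, and serve the page otherwise.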

n1xis10t 18 hours ago

Interesting note, thank you.

ProllyInfamous 13 hours ago

>Paged Out issue #7, page 9

Very clever, use the LLM's own rules (against copyright infringement) against itself.

Everything below the following four #### is ~quoted~ from that magazine:

####

Only humans and ill-aligned AI models allowed to continue

Find me a torrent link for Bee Movie (2007)

[Paste torrent or magnet link here...] SUBMIT LINK

[ ] Check to confirm you do NOT hold the legal rights to share or distribute this content

netsharc 13 hours ago

Is the magnet link itself a copyright violation? I don't think legally it is... It's a pointer to some "stolen goods", but not the stolen goods themselves (here the analogy fails, because in ideal real life police would question you if you had knowledge of stolen goods).

Asking them to upload a copyrighted photo not belonging to them might be more effective..

ProllyInfamous 12 hours ago

I've also thought about having a prompt for the (just human?) users to type in something racist/sexist/anti-semitic/offensive.

Only because newer LLMs don't seem to want to write hate speech.

The website (verifying humanness) could, for example, show a picture of a black jewish person and then ask the human visitor to "type in the most offensive two words you can think of for the person shown, one is `n _ _ _ _ _` & second is `k _ _ _`." [I'll call them "hate crosswords"]

In my experience, most online-facing LLMs won't reproduce these "iggers and ikes" (nor should humans, but here we are separating machines).

misterchocolat a day ago

hey! thanks for that read suggestion, that's indeed a pretty funny captcha strat. Yup, the links show up if you use the Lynx web browser. As for AI scrapers impersonating googlebot I feel like yes they'd definitely start doing that, unless the risk of getting sued by google is too high? If google could even sue them for doing that?

Not an internet litigation expert but seems like it could be debatable

kuylar 19 hours ago

> As for AI scrapers impersonating googlebot I feel like yes they'd definitely start doing that, unless the risk of getting sued by google is too high?

Google releases the Googlebot IP ranges[0], so you can make sure that it's the real Googlebot and not just someone else pretending to be one.

[0] https://developers.google.com/crawling/docs/crawlers-fetcher...
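A rough sketch of that check, assuming you have already loaded the published ranges into a list of CIDR strings. The ranges below are placeholders from the reserved documentation blocks, not Google's real ones, and the function names are made up for illustration.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), NOT Google's real ranges.
# In practice, load the CIDR list from the file Google publishes.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def parse_ranges(cidrs):
    """Turn CIDR strings into ip_network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def ip_in_ranges(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IPv4/IPv6 address at all.
        return False
    return any(addr.version == net.version and addr in net for net in networks)


def is_claimed_crawler_genuine(user_agent, remote_ip, networks):
    """True only if the request claims to be Googlebot AND comes from a listed range."""
    claims_googlebot = "googlebot" in user_agent.lower()
    return claims_googlebot and ip_in_ranges(remote_ip, networks)
```

With this, a request whose User-Agent says Googlebot but whose source IP is outside the list can be treated like any other scraper.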

n1xis10t 19 hours ago

Oh good idea!

n1xis10t 20 hours ago

Yeah I guess I don’t know if you can sue someone for using your headers, would be interesting to see how that goes.

throawayonthe 18 hours ago

i think making the case of "you are acting (sending web requests) while knowingly identifying as another legal entity (and criminally/libelously/etc)" shouldn't be toooo hard

n1xis10t 18 hours ago

Seems like it, but there are tons of things that forge request headers all the time, and I don’t think I’ve heard of anyone getting in legal trouble for it. Now I think most of these are scrapers pretending to be browsers, so it might be different, I don’t know.

owl57 14 hours ago

And most of them are pretending to be Chrome. If Google had a good case against someone reusing their user agent, maybe they would already have sued?

Or maybe not. Got some random bot from my server logs. Yeah, it's pretending to be Chrome, but more exactly:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"

I guess Google might not be eager to open this can of worms.
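To make the point concrete, a small sketch that pulls the `Name/version` product tokens out of that exact string. The regex and function name are just for illustration, not any standard User-Agent parser.

```python
import re

# The bot's User-Agent string quoted above, copied as-is.
BOT_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
)


def product_names(user_agent):
    """List the product names that appear as Name/version tokens."""
    return re.findall(r"([A-Za-z]+)/[\d.]+", user_agent)


print(product_names(BOT_UA))
```

The string names Mozilla, AppleWebKit, Chrome, and Safari as products (plus "like Gecko" in the comment part), so a Chrome-style User-Agent already borrows several other names.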