marticode 7 hours ago

As a user I do care, because I waste so much time on Cloudflare's "prove you are human" blocking page (why do I have to prove it over and over again?), and frequently run into websites blocking me entirely based on some bad IP blacklist used along with Cloudflare.

tempest_ 6 hours ago | parent | next [-]

Unfortunately the internet sucks in 2025.

If you have a site with valuable content the LLM crawlers hound you to no end. CF is basically a protection racket at this point for many sites. It doesn't even stop the more determined ones but it keeps some away.

seniorThrowaway 6 hours ago | parent | next [-]

Yep for anyone unaware of how awful things truly are, look up what a "residential proxy" is. Back in my day we called that a botnet.

nananana9 4 hours ago | parent | next [-]

Oh, they're still botnets. We just look the other way because they're useful.

And they're pretty tame as far as computer fraud goes - if my device gets compromised I'd much rather deal with it being used for fake YouTube views than ransomware or a banking trojan.

deadbabe 4 hours ago | parent | prev [-]

You can make a little bit of cash on the side letting companies use your bandwidth a bit for proxying. You won’t even notice. $50/month. Times are tough!

jamwil 2 hours ago | parent [-]

Of course the risk here being whatever nefarious or illegal shit is flowing through your pipes, which you consented to and even received consideration for.

deadbabe an hour ago | parent [-]

No worries it’s encrypted traffic

hollerith 3 hours ago | parent | prev | next [-]

CF would be a protection racket only if CF is the cause of the problem CF is charging money to solve.

j2kun 5 hours ago | parent | prev [-]

And yet half the HN front page every day is promoting LLM stuff.

"The internet sucks", yes, but we're doing it to ourselves.

kadushka 5 hours ago | parent | next [-]

Would you rather not have LLMs?

foobarchu 5 hours ago | parent | next [-]

Absolutely. They have dramatically worsened the world, with little to no net positive impact. Nearly every positive impact (if not all of them) has an associated negative that dwarfs it.

LLMs aren't going anywhere, but the world would be a better place if they hadn't been developed. Even if they had more positive impacts, those would not outweigh the massive environmental degradation they are causing or the massive disincentive they created against researching other, more useful forms of AI.

j2kun 5 hours ago | parent | prev | next [-]

IMO LLMs have been a net negative on society, including my life. But I'm merely pointing out the stark contrast on this website, and the fact that we can choose to live differently.

kadushka 2 hours ago | parent [-]

Are you anti-AI in general, or are you unhappy about the current LLMs?

j2kun 2 hours ago | parent [-]

I am not anti-AI, nor unhappy about how any current LLM works. I'm unhappy about how AI is used and abused to collective detriment. LLM scraper spam leading to increased centralization and wider impacting failures is just one example.

kadushka 42 minutes ago | parent [-]

Your position is similar to saying that medical drugs have been a net negative on society, because some drugs have been used and abused to collective detriment (and other negative effects, such as doctors prescribing pills instead of suggesting lifestyle changes). Does it mean that we would be better off without any medical drugs?

captainkrtek 4 hours ago | parent | prev | next [-]

hard yes, all of the technical discussion aside, the constant advertising deluge of every company touting AI is mind numbing.

seanw444 3 hours ago | parent | prev | next [-]

It's helped me learn some things quicker, but I definitely prefer the old days.

BrenBarn an hour ago | parent | prev | next [-]

Good lord yes. No question.

davidhaymond 4 hours ago | parent | prev | next [-]

Absolutely. And while we're at it, let's do away with social media.

ToucanLoucan 4 hours ago | parent | prev | next [-]

Yes.

A solid secondary option is making LLM scraping for training opt-in, and/or compensating sites that were/are scraped for training data. Hell, maybe then you could stop knocking websites over, which is what incentivizes them to use Cloudflare in the first place.

But that means LLM researchers have to respect other people's IP which hasn't been high on their todo lists as yet.

bUt ThAT dOeSn'T sCaLe - not my fuckin problem chief. If you as an LLM developer are finding your IP banned or you as a web user are sick of doing "prove you're human" challenges, it isn't the website's fault. They're trying to control costs being arbitrarily put onto them by a disinterested 3rd party who feels entitled to their content, which it costs them money to deliver. Blame the asshole scraping sites left and right.

Edit: and you wouldn't even need to go THAT far. I scrape a whole bunch of sites for some tools I built and a homemade news aggregator. My IP has never been flagged because I keep the number of requests down wherever possible, and rate-limit them so it's more in line with human-like browsing. Like so much of this could be solved with basic fucking courtesy.
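
For anyone wondering what "rate-limit them" might look like in practice, here is a minimal sketch in Python. It is not the setup described above, just one plausible way to enforce a minimum gap between requests to the same host. The class name `PoliteThrottle` and the 2-second interval are assumptions made for illustration; the clock and sleep functions are injectable only so the behavior can be checked without actually waiting.

```python
import time
from typing import Callable, Optional


class PoliteThrottle:
    """Enforce a minimum delay between consecutive requests to one host.

    Hypothetical helper for illustration; not taken from any library.
    """

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last_request_at: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        slept = 0.0
        if self._last_request_at is not None:
            # Time still owed before the minimum interval has elapsed.
            remaining = self.min_interval_s - (now - self._last_request_at)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        # Record when this request was released.
        self._last_request_at = self._clock()
        return slept


if __name__ == "__main__":
    throttle = PoliteThrottle(min_interval_s=2.0)  # assumed interval
    for page in ["/a", "/b", "/c"]:  # placeholder paths
        throttle.wait()
        print("would fetch", page)
```

Calling `throttle.wait()` before every request is the whole trick. One throttle per host keeps a slow site from holding up requests to the others.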

salawat 4 hours ago | parent | prev | next [-]

Can I raise that to no LLMs or SEO?

worik 3 hours ago | parent [-]

Yes

LLMs have become a crucial compendium of knowledge, that had become hidden behind SEO

stalfosknight 2 hours ago | parent | prev | next [-]

Yes

lenerdenator 4 hours ago | parent | prev | next [-]

Not to speak for the other poster, but... That's not a good-faith question.

Most of the problems on the internet in 2025 aren't because of one particular technology. They're because the modern web was based on gentleman's agreements and handshakes, and since those things have now gotten in the way of exponential profit increases on behalf of a few Stanford dropouts, they're being ignored writ large.

CF being down wouldn't be nearly as big of a deal if their service wasn't one of the main ways to protect against LLM crawlers that blatantly ignore robots.txt and other long-established means to control automated extraction of web content. But, well, it is one of the main ways.

Would it be one of the main ways to protect against LLM web scraping if we investigated one of the LLM startups for what is arguably a violation of the Computer Fraud and Abuse Act, arrested their C-suite, and sent each member to a medium-security federal prison (I don't know, maybe Leavenworth?) for multiple years after a fair trial?

Probably not.
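
For reference, honoring robots.txt is not hard for a crawler that wants to. The sketch below uses `urllib.robotparser` from the Python standard library to check whether a user agent may fetch a URL. The helper `is_allowed`, the bot name `ExampleLLMBot`, and the example rules are all made up for illustration. robots.txt is advisory, so this only helps when the crawler chooses to run the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler entirely,
# and keep everyone else out of /private/.
EXAMPLE_ROBOTS = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    for agent in ("ExampleLLMBot", "SomeBrowser"):
        print(agent, is_allowed(EXAMPLE_ROBOTS, agent, url))
```

A real crawler would fetch the site's own `/robots.txt` (for example with `RobotFileParser.set_url()` and `read()`) and cache the result instead of parsing a hard-coded string.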

j2kun 2 hours ago | parent | next [-]

Well said.

chasing0entropy 4 hours ago | parent | prev [-]

I'm sure there will be an investigation... by the SEC when the bubble pops and takes the S&P with it. No prison though, probably jobs at the next Ponzi scheme.

nhhvhy 3 hours ago | parent | prev | next [-]

Yes.

LtWorf 3 hours ago | parent | prev | next [-]

Yes.

inferiorhuman 5 hours ago | parent | prev | next [-]

Yes

therein 3 hours ago | parent | prev [-]

Yes, they are terrible and more a negative force than a positive one in every way imaginable. I would take no LLMs all day every day.

roflyear 5 hours ago | parent | prev [-]

Unfortunately the problem isn't just "the internet sucks" it's "the internet sucks, and everyone uses it" - meaning people are not doing stuff offline, and a lot of our lives require us to be online.

worik 3 hours ago | parent [-]

The Internet is humming along beautifully

It is the Web that is being degraded

crazygringo 3 hours ago | parent | prev | next [-]

But that's not a problem caused by Cloudflare.

That's a problem caused by bots and spammers and DDoSers, that Cloudflare is trying to alleviate.

And you generally don't have to prove it over and over again unless there's a high-risk signal associated with you, like you're using a VPN or have cookies disabled, etc. Which are great for protecting your privacy, but then obviously privacy means you do have to keep demonstrating you're not a bot.

BarryMilo 3 hours ago | parent [-]

You might say the problem CloudFlare is causing is lesser than the ones it's solving, but you can't say they're not causing a new, separate problem.

That they're trying counts for brownie points, it's not an excuse to be satisfied with something that still bothers a lot of people. Do better, CloudFlare.

crazygringo 2 hours ago | parent [-]

Do better, how?

If you have any ideas on how to protect against bad actors in a way that is just as effective but easier for users, please share it.

Because as far as I can tell, this isn't a question of effort. It's a question of fundamental technological limitations.

woooooo 7 hours ago | parent | prev | next [-]

I just realized, why don't they have some "definitely human" third party cookie that caches your humanness for 24h or so? I'm sure there's a reason, I've heard third party cookies were less respected now, but can someone chime in on why this doesn't work and save a ton of compute?

acureau 7 hours ago | parent | next [-]

Because people will solve the challenge once, and then use the cookie in automation tools. It already happens with shorter expiration cookies.

woooooo 5 hours ago | parent [-]

Thanks, I'm now shaking my head at my naivete :)

basilikum 7 hours ago | parent | prev | next [-]

https://developers.cloudflare.com/waf/tools/privacy-pass/

arbol 5 hours ago | parent [-]

Are you really posting this today?

octoberfranklin 5 hours ago | parent | prev | next [-]

Yes, there are several, and the good one (linked below) lets you use the "humanness" token across different websites without them being able to use it as a tracking signal / supercookie. It's very clever.

https://github.com/ietf-wg-privacypass/base-drafts

https://privacypass.github.io/

lotsofpulp 7 hours ago | parent | prev [-]

I assume that will be for Apple (and eventually Alphabet) to implement via digital IDs linked to real world IDs.

https://www.apple.com/newsroom/2025/11/apple-introduces-digi...

philipwhiuk 6 hours ago | parent [-]

Don't worry, Sam Altman is selling the protection too -- https://en.wikipedia.org/wiki/World_(blockchain)

edm0nd 6 hours ago | parent | prev | next [-]

Congrats, you now know what it's like to be a daily Tor user trying to hit normie sites from exit node IPs xD

replwoacause 5 hours ago | parent [-]

Why would anyone be a daily Tor user and try to hit clear-net sites on top of that? This sounds like a bizarre use case.

pinko 4 hours ago | parent | next [-]

Privacy through uniformity, operational security by routine, herd immunity for privacy, traffic normalization, "anonymity set expansion", "nothing to hide" paradox, etc.

I.e., if you use Tor for "normie sites", then the fact that someone can be seen using Tor is no longer a reliable proxy for detecting them trying to see/do something confidential and it becomes harder to identify & target journalists, etc. just because they're using Tor.

replwoacause 4 hours ago | parent [-]

Huh never thought about that. I wonder how many people do that? Seems like a public service.

milderworkacc 4 hours ago | parent [-]

It certainly feels like one at times!

dooglius 3 hours ago | parent | prev | next [-]

In addition to the reasons in sibling comment, this also acts as a filter for low-quality ad-based sites; same reason I close just about any website that gives me a popup about a ToS agreement.

4 hours ago | parent | prev [-]
[deleted]

jakub_g 7 hours ago | parent | prev [-]

I hate it as much (and the challenge time seems to be getting longer, 10s lately for me, what the hell?)

But we can all say thank you to all the AI crawlers who hammer websites with impossible traffic.

pixl97 6 hours ago | parent [-]

I mean, it was a problem before AI crawlers with just bots and attacks in general.

olyjohn 4 hours ago | parent [-]

It wasn't nearly as bad.