denysvitali 3 hours ago

Cloudflare is known to use fingerprinting to detect scrapers. For example, they use JA3 fingerprints and match them against the UA to block stuff like cURL while allowing OkHttp (Android clients), but this can easily be spoofed with packages such as CycleTLS [1].
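For context, a minimal sketch of how a JA3 fingerprint is derived, assuming the publicly documented JA3 format (an MD5 over five comma-separated ClientHello fields, with each list joined by "-"). The function name `ja3_hash` and the field values are illustrative, not captured from a real client, and this says nothing about how Cloudflare applies the result. It assumes `md5sum` is available.

```shell
# JA3 string: "TLSVersion,Ciphers,Extensions,EllipticCurves,ECPointFormats"
# All values are decimal; the fingerprint is the MD5 of that string.
ja3_hash() { # args: version ciphers extensions curves point_formats
  printf '%s,%s,%s,%s,%s' "$1" "$2" "$3" "$4" "$5" | md5sum | cut -d' ' -f1
}

# Illustrative values only (hypothetical client):
ja3_hash 769 "47-53-5-10" "0-10-11" "23-24-25" 0
# The same ciphers offered in a different order hash differently:
ja3_hash 769 "53-47-5-10" "0-10-11" "23-24-25" 0
```

A server could then compare the hash with the ones it expects for the claimed User-Agent, which is also why a client that reproduces another client's ClientHello gets the same fingerprint.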

I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Cromite, a privacy-conscious fork of Chromium for Android, constantly has issues with Cloudflare Turnstile [2] because they (Cloudflare) try to fingerprint the browser in multiple ways before letting it pass the challenge. The only way to get it to work would be to join the Cloudflare Browser Developer program, which requires signing an NDA. Rightfully so, the project maintainer didn't want to do it.

If you want to see the extent of what Cloudflare does to fingerprint browsers, just have a look at the issue [2] and see which flags need to be disabled for the browser to pass the Cloudflare challenge.

I understand both sides, but at least CloudFlare could be flexible enough to fall back to PoW instead of just blocking people from sending forms or accessing websites...
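To make the PoW idea concrete, here is a minimal hashcash-style sketch. It is an assumption about what such a fallback could look like, not Cloudflare's actual mechanism: the client searches for a nonce so that sha256(challenge:nonce) starts with a given run of zero hex digits, and the server checks it with a single hash. The names `pow_solve` and `pow_verify` and the challenge string are hypothetical, and `sha256sum` is assumed to be available.

```shell
# Digest of "challenge:nonce" as lowercase hex.
pow_digest() { printf '%s:%s' "$1" "$2" | sha256sum | cut -d' ' -f1; }

# Client side: brute-force the first nonce whose digest starts with $2 (e.g. "00").
pow_solve() { # args: challenge zero_prefix
  nonce=0
  while :; do
    case $(pow_digest "$1" "$nonce") in
      "$2"*) printf '%s\n' "$nonce"; return 0 ;;
    esac
    nonce=$((nonce + 1))
  done
}

# Server side: one hash to verify.
pow_verify() { # args: challenge nonce zero_prefix
  case $(pow_digest "$1" "$2") in
    "$3"*) return 0 ;;
    *) return 1 ;;
  esac
}

nonce=$(pow_solve "example-challenge" 00)
pow_verify "example-challenge" "$nonce" 00 && echo "nonce $nonce accepted"
```

Each extra zero hex digit in the prefix multiplies the expected client work by 16, while verification stays at one hash.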

[1]: https://github.com/Danny-Dasilva/CycleTLS

[2]: https://github.com/uazo/cromite/issues/2365

sandeepkd 9 minutes ago | parent | next [-]

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person. Fingerprinting is just a way to consolidate the market for the advertising business. Assigning reputation to residential IP addresses and commercial blocks is another approach to achieve the desired result. Providers would be a lot more careful about letting their IP addresses be misused; however, it turns out that this would bring down the DDoS business on both sides, attackers and protectors.

Ironically, more often than not it's the same companies that invest in building their own bots and in finding ways to stop bots from other companies.

jwr an hour ago | parent | prev | next [-]

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection"

They also gate away a good many people with their "bot protection". I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

denysvitali 39 minutes ago | parent | next [-]

They sometimes have to comply with legal requests (which I understand), but at the same time they have a huge market share - which means that the internet is becoming less and less decentralized and more in their control. We've seen the effects of that in previous outages...

stackghost 16 minutes ago | parent | prev [-]

>I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

I think the Web is on its last legs, anyway. Generative AI and LLM-instead-of-search have destroyed what little value remained.

b65e8bee43c2ed0 2 hours ago | parent | prev | next [-]

it's all for nothing, because Cloudflare's scraping protection works about as well as a $5 padlock - good enough to dissuade bored teens, not good enough to dissuade even an amateur burglar. if someone wants to scrape your publicly visible data, they will. there's nothing you can do.

mootothemax 9 minutes ago | parent | next [-]

Exactly. I’m constantly amazed at how little you actually need to bypass CF, Amazon, Azure WAFs and so on (Incapsula springs to mind too). When you look at the code you’ve come up with, it’s actually quite small and compact.

More to the point, these systems actually help scraping because proof of work unlocks essentially unlimited scraping, in my experience.

That said - from my experience on the other side, sure you can’t stop people like me or you, but you can stop 99% of the others. That’s more than worth it operationally.

ACCount37 2 hours ago | parent | prev [-]

At the same time: it sure works well enough to annoy anyone with a "bad ASN" IP with 80 captchas a day.

shideneyu an hour ago | parent [-]

exactly that's what I was thinking... like the day they provided a solution to the issue they posed

petu an hour ago | parent | prev | next [-]

> but unless you do PoW (which is also ecologically a nightmare)

Can you expand? I don't see a problem with some napkin math. A 5 W load for 2 seconds is about 0.0028 Wh (we have to let smartphones pass, and not by making them do PoW for tens of seconds). 8 billion checks a day for a year is roughly 8 GWh.
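The napkin math can be reproduced with a couple of awk one-liners. This is only a sketch of the arithmetic in the comment; the function names are hypothetical, and the inputs (5 W, 2 s, 8 billion checks a day) are the comment's own assumptions.

```shell
# Energy of a single check in Wh: watts * seconds / 3600.
pow_energy_wh() { # args: watts seconds
  awk -v w="$1" -v s="$2" 'BEGIN { printf "%.4f\n", w * s / 3600 }'
}

# Yearly total in GWh for a given number of checks per day.
pow_yearly_gwh() { # args: watts seconds checks_per_day
  awk -v w="$1" -v s="$2" -v n="$3" \
    'BEGIN { printf "%.1f\n", w * s / 3600 * n * 365 / 1000000000 }'
}

pow_energy_wh 5 2               # prints 0.0028
pow_yearly_gwh 5 2 8000000000   # prints 8.1
```

So the per-check figure is closer to 0.0028 Wh than 0.002 Wh, and the yearly total of about 8 GWh follows from it.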

denysvitali 42 minutes ago | parent [-]

I stand corrected. It's not a nightmare scenario (as with Bitcoin), but I still think that "useless" computations should be avoided (just as we should avoid having 10MB websites).

In any case, according to some napkin math done by Kimi 2.6 (which by itself is probably already consuming more than all of my PoW challenges for the upcoming 5 years) - the situation looks incredibly in favor of PoW: https://www.kimi.com/share/19e7ef40-a432-8912-8000-0000b4a71...

Which makes me wonder why CloudFlare isn't switching to this already

dcrazy 34 minutes ago | parent [-]

Because it doesn’t solve the problem of residential botnets.

PearlRiver 2 hours ago | parent | prev [-]

This is why I have two separate browsers. If you want to do official stuff like paying for things you need to get through cloudflare.

notafox an hour ago | parent | next [-]

You can use Firefox with different profiles and configure it to launch a particular profile directly, without launching the default profile and using about:profiles.

A non-default Firefox profile can be created like this:

  ./firefox -CreateProfile "profile-name /home/user/.mozilla/firefox/profile-dir/"
  # For, say, cloudflare that would be:
  ./firefox -CreateProfile "cloudflare /home/user/.mozilla/firefox/cloudflare/"
And you can launch it like this:

  ./firefox -profile "/home/user/.mozilla/firefox/profile-dir/"
  # For cloudflare that would be:
  ./firefox -profile "/home/user/.mozilla/firefox/cloudflare/"
So, given that /usr/bin/firefox is just a shell script, you can

    - create a copy of it, say, /usr/bin/firefox-cloudflare
    - adjust the relevant line, adding the -profile argument
If you use an icon to run Firefox (say, /usr/share/applications/firefox.desktop), you'll need to do the same copy-and-adjust for the icon's file.

Of course, "./firefox" in the examples above should be replaced with the actual path to the executable. For a default installation of Firefox, the path can be found in the /usr/bin/firefox script.
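As a variation on the copy-the-script approach above, the same effect can be sketched as a small shell function. The name `launch_profile` and the `FIREFOX_BIN` variable are hypothetical; only the `-profile` argument comes from the examples above, and the default binary path is an assumption that may differ per distribution.

```shell
# Path to the real Firefox launcher; override if yours lives elsewhere.
FIREFOX_BIN="${FIREFOX_BIN:-/usr/bin/firefox}"

# Launch Firefox with the given profile directory, passing any extra
# arguments (URLs, flags) straight through.
launch_profile() { # args: profile_dir [firefox args...]
  profile_dir=$1
  shift
  "$FIREFOX_BIN" -profile "$profile_dir" "$@"
}

# Example use (not executed here):
#   launch_profile "$HOME/.mozilla/firefox/cloudflare/" https://example.com
```

Put in a shell startup file, or wrapped in a one-line script, this avoids editing anything under /usr/bin, which a package update could overwrite.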

So, you can have separate profiles for something sensitive/invasive (linkedin, cloudflare, shops, banks, etc.) and then a separate profile for everything else.

And each profile can have its own set of extensions.

t_mahmood 8 minutes ago | parent | next [-]

You can now do this from the `Profiles` menu too, without going down the CLI path. It's extremely simple now.

ferfumarma an hour ago | parent | prev [-]

Except that fingerprinting means that both profiles are actually tied together by cloudflare (and other tech companies)

helterskelter 2 hours ago | parent | prev [-]

Firefox added profile switching recently. Works good.

(That said, I still keep separate machines. One for doing "official" things, the other for everything else)

notafox an hour ago | parent | next [-]

> Firefox added profile switching recently.

I think this was as recent as 25 years ago?

Recently they added some new UI. There was and still is (I think) the classic Profile Manager UI, which you can launch with

  ./firefox -ProfileManager
or access the UI in about:profiles.

But you don't have to use any of those anyway - see my comment above (a response to parent).

thayne 2 minutes ago | parent | next [-]

The old UI was pretty difficult to use, and hard to discover unless you knew where to look though.

opem 37 minutes ago | parent | prev [-]

They actually have at least 3 kinds of profile:
1. Containers: as they say, it's some kind of sandbox, technically a profile.
2. Profiles that are accessible through about:profiles, which they've had for years, and probably the one you are talking about...
3. New profiles that come with a pop-up, much like how Chromium browsers show it.

ajb 2 hours ago | parent | prev | next [-]

Odd - they've had that for years, but only on the command line. Wonder if it's different under the hood? They also have firefox containers which also never quite became a first-class feature (you have to install a plugin).

an hour ago | parent [-]
[deleted]
b65e8bee43c2ed0 2 hours ago | parent | prev [-]

>Works good.

does it? same binary, same machine, same display, same 781 other heuristics.