Remix clone Hacker News

	▲	jstanley a day ago
		I'm working on headless browser fingerprinting. We're focusing on "anti-cloaking" for anti-phishing and other Internet security applications at the moment. Phishing sites can "cloak" themselves so that they present malicious content to ordinary users and benign content to bots, and thereby evade detection. Anti-cloaking means defeating that cloaking.

The methodology is to operate a site that logs all requests, collects information from the JavaScript environment, and looks for signals that a session is being operated by a bot instead of a human. We have 183 unique signals so far.

We've seen fake mobile phone APIs being injected into the DOM, and have been able to read out the source code implementing them. We've seen lots of people running the browser with TLS validation and same-origin policy disabled, both of which are easy to probe for. And we've even seen people running services on localhost with CORS headers that allow cross-origin requests, which lets us read out their server headers and page contents and would let us send arbitrary requests to their local servers. We've seen people using proxies that don't support websockets. We've even seen surprisingly big companies scanning us from netblocks that just straightforwardly name the company, which would be trivial to block by IP address alone.

It turns out that every security vendor that scans VirusTotal submissions or domains from CT logs has major flaws in their headless browser setup, which means it's worryingly easy to cloak from them.

I don't know the best angle for monetisation. Currently we are selling "quick overviews" of what people are doing wrong, but it kind of feels like we're giving away too much value too cheaply. However, it's difficult to convince people that there is value worth paying for without telling them what they're doing wrong before they pay.
Ideas include:

* automated quick overviews, where we give you a URL to point your bot at, find out all the signals you hit, and give you an automatically-generated report of what you are doing wrong
* or a manual "pentest" of your headless browser, where we do the same thing but spend a few days manually looking harder to see if there are new signals we're not yet spotting automatically
* or we could sell a report of the state of the industry as a whole
* or access to our tooling
* or something else

I have been told that if I say it's for anti-phishing then I have 12 customers max, but if I say it's for AI browser agents then someone will give me a billion dollars. So possibly we need to explore other applications, like either telling AI scrapers why they are getting blocked, or else helping sites block AI scrapers (though I am personally opposed to building the apartheid web).

Open problems are:

* what's the best form to sell it?
* how do we satisfy people that if they pay for a test then they will get value from it?
* should we pivot away from anti-phishing?
* for bots that we notice have found us from VirusTotal or CT logs, how do we work out who is operating them so that we can sell to them? Sometimes attribution is easy but in the majority of cases it is not.
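
For anyone wondering what "looking for signals" might look like on the server side, here is a rough sketch in Python. It is not our actual tooling: the `SessionReport` fields, the signal names, and the `ExampleSecurityVendor` netblock are all made up for illustration (the IP ranges used are reserved documentation ranges). It assumes the page has already run its probes in the browser and reported back the results, including the `Function.prototype.toString` output for a few APIs, which for genuine built-ins contains `[native code]`.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical table of netblocks whose registration plainly names the operator.
# 192.0.2.0/24 is a reserved documentation range, not a real vendor.
KNOWN_SCANNER_NETBLOCKS = {
    "ExampleSecurityVendor": ipaddress.ip_network("192.0.2.0/24"),
}


@dataclass
class SessionReport:
    """Hypothetical record of what the logging site learned about one session."""
    client_ip: str
    websocket_connected: bool          # did the page's test websocket connect?
    loaded_invalid_tls_resource: bool  # did a fetch from a bad-certificate host succeed?
    read_cross_origin_frame: bool      # did a read that same-origin policy should block succeed?
    # API name -> source text reported by Function.prototype.toString in the browser
    api_sources: Dict[str, str] = field(default_factory=dict)


def bot_signals(report: SessionReport) -> List[str]:
    """Return the names of the (illustrative) signals this session tripped."""
    signals = []
    if not report.websocket_connected:
        # Weak on its own: some human users also sit behind such proxies.
        signals.append("proxy-without-websockets")
    if report.loaded_invalid_tls_resource:
        signals.append("tls-validation-disabled")
    if report.read_cross_origin_frame:
        signals.append("same-origin-policy-disabled")
    for name, source in report.api_sources.items():
        # A built-in stringifies with "[native code]"; a naive injected fake
        # exposes its real source instead.
        if "[native code]" not in source:
            signals.append(f"injected-api:{name}")
    ip = ipaddress.ip_address(report.client_ip)
    for owner, network in KNOWN_SCANNER_NETBLOCKS.items():
        if ip in network:
            signals.append(f"named-netblock:{owner}")
    return signals
```

Each check is a hint rather than proof, and a careful operator can spoof `toString`, so in practice the value comes from combining many signals rather than trusting any one of them.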