afandian 2 days ago

Just very high usage all of a sudden, after years of reasonable usage. Google has indexed it (respectfully) since 2008 just fine.

The new traffic isn't human. I blocked some AI scraper user agents, which helped a bit. But most of the new user agents identify as vanilla browsers, not scrapers.
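
For illustration, user-agent blocking of this kind can be sketched as below. This is a minimal sketch and not the configuration actually used on the site: the blocklist entries are examples of crawler tokens that self-identify, and the function name `is_blocked_agent` is hypothetical. It also shows why the approach only helps a bit: a scraper sending a stock browser user agent passes straight through.

```python
import re

# Example blocklist (an assumption, not the site's real list): substrings
# found in the user agents of crawlers that identify themselves.
BLOCKED_AGENT_PATTERNS = ["GPTBot", "CCBot", "ClaudeBot", "Bytespider"]

# One case-insensitive regex built from the escaped substrings.
_BLOCKED_RE = re.compile(
    "|".join(re.escape(p) for p in BLOCKED_AGENT_PATTERNS),
    re.IGNORECASE,
)


def is_blocked_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header matches a blocklisted crawler."""
    return bool(_BLOCKED_RE.search(user_agent or ""))


# A self-identifying crawler is caught...
crawler = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"
# ...but a scraper presenting a stock browser string is not.
browser = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
print(is_blocked_agent(crawler), is_blocked_agent(browser))
```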

I don't have numbers. It was enough to consume all nginx worker_connections. Raising the number doesn't help, as nginx is just reverse proxying to a JVM.

After the switch, Cloudflare showed USA and Singapore as heavy traffic sources.

I don't mind scrapers on the site, but the app is a search engine (of sorts), so every page view consumes some CPU, including the 'facet this search' buttons. My (WIP) solution is to rewrite it to run entirely client-side and put it all on a CDN.
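
As a rough sketch of why faceting can move off the server: once the records are shipped as static data, counting and filtering by a facet is a small pure function. The record shape and the names `facet_counts` and `filter_by_facet` are hypothetical, and Python is used here only for illustration; in the browser the same logic would be written in JavaScript against a JSON file served from the CDN.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of the given facet field."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is not None:
            counts[value] += 1
    return dict(counts)


def filter_by_facet(records, field, value):
    """Keep only the records whose facet field equals the chosen value."""
    return [record for record in records if record.get(field) == value]


# Hypothetical data standing in for a static search index.
records = [
    {"title": "a", "lang": "en"},
    {"title": "b", "lang": "fr"},
    {"title": "c", "lang": "en"},
]
print(facet_counts(records, "lang"))
print([r["title"] for r in filter_by_facet(records, "lang", "en")])
```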

bayindirh 2 days ago | parent [-]

> The user agents are vanilla browsers, not identifying as scrapers.

This is how they get you, along with the "residential proxy" services they use. The requests appear to come from benign browsers in various homes.