ErroneousBosh 4 hours ago

So wait a second then, it connects out using a websocket to its bot C&C server, right?

Which presumably passes it a URL to scrape and waits for it to return the data.

What happens if I write my own tool that connects to that C&C server, waits for a URL to scrape, and returns gigabytes of freshly brewed hot horseshit?

woffoor 4 hours ago | parent [-]

Most scraped websites use HTTPS, so you would need to perform a MITM attack. Scrapers will probably notice that.

voakbasda 3 hours ago | parent | next [-]

No, you just need to stand up your own website and feed the scraper a URL to it.

ErroneousBosh an hour ago | parent [-]

I would just generate scads of Markov chain output and make it look like a plausible web page.
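
Roughly what I have in mind, as a minimal sketch rather than a finished tool. The names `build_chain`, `generate`, and `fake_page` are made up for illustration, and it assumes you already have some corpus of ordinary text to train on:

```python
import html
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    if len(words) <= order:
        raise ValueError("corpus must contain more than `order` words")
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return dict(chain)


def generate(chain, n_words, rng):
    """Walk the chain to produce n_words of plausible-looking nonsense."""
    keys = sorted(chain)
    order = len(keys[0])
    out = list(rng.choice(keys))
    while len(out) < n_words:
        followers = chain.get(tuple(out[-order:]))
        if followers:
            out.append(rng.choice(followers))
        else:
            # Dead end (the last words only occur at the end of the corpus):
            # restart from a random known prefix.
            out.extend(rng.choice(keys))
    return " ".join(out[:n_words])


def fake_page(chain, paragraphs=3, words_per_paragraph=40, seed=None):
    """Wrap the generated text in minimal HTML so it resembles a web page."""
    rng = random.Random(seed)
    body = "\n".join(
        "<p>" + html.escape(generate(chain, words_per_paragraph, rng)) + "</p>"
        for _ in range(paragraphs)
    )
    return "<!DOCTYPE html>\n<html><body>\n" + body + "\n</body></html>"
```

Feed `build_chain` a few megabytes of real prose and `fake_page` gives you an endless supply of pages that look like text at a glance.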

ErroneousBosh 2 hours ago | parent | prev [-]

How would https affect it?

If they're making a request to my machine to go and curl a page, how do they even know whether or not it was https?

trumpdong 10 minutes ago | parent [-]

Not sure about Bright Data, but these are usually SOCKS or HTTP CONNECT proxies because that's the most flexible option. The customer might be paying by the gigabyte, though, so you can still feed them nonsense, maybe a 4 gigabyte TLS certificate.
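
To make the proxy point concrete, here is a small sketch of what the exit node sees in the HTTP CONNECT case. This is only an illustration of the CONNECT request line, not anything confirmed about how Bright Data works, and `parse_connect` is a hypothetical helper name:

```python
def parse_connect(request_line):
    """Parse 'CONNECT host:port HTTP/1.1' and return (host, port).

    With CONNECT the proxy only learns the target host and port. Everything
    after the tunnel is established is opaque bytes, so port 443 merely
    suggests TLS; the proxy never sees the URL path or the page content.
    """
    parts = request_line.strip().split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        raise ValueError("not a CONNECT request line")
    authority = parts[1]
    host, sep, port = authority.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("CONNECT target must be host:port")
    # IPv6 literals arrive bracketed, e.g. [::1]:8443
    return host.strip("[]"), int(port)
```

So the node is handed a destination and asked to shovel bytes in both directions, which is why it can't tamper with page content without breaking TLS but can still pad the byte count.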