▲ | dylan604 3 days ago
> For the web it requires that you run a snippet of javascript code (the challenge) in the browser to prove that you are not a bot.

How does this prove you are not a bot? How does this code not work in headless Chromium if it's just client-side JS?
▲ | Andrews54757 3 days ago
Good question! Indeed, you can run the challenge code in headless Chromium and it will work [1]. However, they are constantly updating the challenge and may add additional checks in the future. I suppose Google wants to make it more expensive overall to scrape YouTube, to deter the most egregious bots.
▲ | Beretta_Vexee 3 days ago
Once JavaScript is running, it can perform complex fingerprinting operations that are difficult to circumvent effectively. I have a little experience running headless Selenium against Facebook. Facebook tests fonts, SVG rendering, CSS support, screen resolution, clock and geographical settings, and hundreds of other things that give it a very good idea of whether it's dealing with a normal client or headless Selenium. Since it picks a certain number of checks more or less at random, and the JS can be modified each time it loads, this is very, very complicated to simulate. Facebook and Instagram know this and tolerate it below a certain limit, because for them it is more about bot protection than content protection. And that is the case when you have a real web browser running in the background. Here we are talking about standalone software written in Python.
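To make the "random subset of checks" point concrete, here is a toy model in Python. It is only a sketch: the signal names in `EXPECTED` and the helper functions are hypothetical, and it does not reflect how Facebook or any real site implements fingerprinting. It just shows why a client that gets even one signal wrong is likely to be caught when the checks are sampled at random on each load.

```python
import random
from math import comb

# Hypothetical signal names, loosely following the categories listed above.
# Real fingerprinting scripts are not public and check far more than this.
EXPECTED = {
    "fonts": True,
    "svg_rendering": True,
    "css_support": True,
    "screen_resolution": True,
    "clock_and_locale": True,
}


def pick_checks(rng: random.Random, k: int) -> list[str]:
    """Sample k distinct checks so the client cannot predict which ones run."""
    return rng.sample(sorted(EXPECTED), k)


def passes(client: dict[str, bool], checks: list[str]) -> bool:
    """True only if every sampled signal matches what a normal browser reports."""
    return all(client.get(name) == EXPECTED[name] for name in checks)


def detection_probability(n_total: int, n_wrong: int, k: int) -> float:
    """Chance that k checks sampled from n_total hit at least one of the
    n_wrong signals the client fails to reproduce."""
    return 1 - comb(n_total - n_wrong, k) / comb(n_total, k)


if __name__ == "__main__":
    # A simulated headless client that reproduces everything except fonts.
    headless = dict(EXPECTED, fonts=False)
    checks = pick_checks(random.Random(), 3)
    print(checks, passes(headless, checks))
    print(detection_probability(len(EXPECTED), 1, 3))
```

In this toy setup, with five checks, one wrong signal, and three checks sampled per load, a single load catches the client 60% of the time, and repeated loads compound that. With hundreds of signals, the simulating client has to get nearly all of them right to stay undetected for long.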