This is a fundamental misunderstanding of what those bots are requesting. They aren't parsing those PHP files; they are using their existence for fingerprinting, trying to determine whether known vulnerabilities are present. They probably stop reading as soon as they receive an HTTP response code and discard the rest of the response.
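To make that concrete, here is a rough sketch of what such a probe might look like. It is only a guess at the behavior, not any real scanner's code; `probe_status` and `fingerprint` are hypothetical names. The point is that only the status line and headers are parsed, and the connection is closed without the body ever being read.

```python
import http.client


def probe_status(host: str, port: int, path: str, timeout: float = 5.0) -> int:
    """Return only the HTTP status code for `path`; the body is never read."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        # getresponse() parses the status line and headers, not the body.
        response = conn.getresponse()
        return response.status
    finally:
        # Close the connection without ever calling response.read().
        conn.close()


def fingerprint(host: str, port: int, paths: list[str]) -> dict[str, bool]:
    """Map each candidate path to whether the server answered 200 for it."""
    return {path: probe_status(host, port, path) == 200 for path in paths}
```

Whatever is served in the body of those PHP files makes no difference to a client written this way.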

holysoles an hour ago | parent | next [-]

You're right, something like fail2ban or CrowdSec would probably be more effective here. CrowdSec has made it apparent to me how much vulnerability probing goes on; it's a bit shocking for a low-traffic host.

ajsnigrutin an hour ago | parent [-]

And you'd ban the IP, their one-day lease on the VM+IP would expire, and then someone else would get the same IP on a new VM and be blocked from everywhere.

It would be useful to ban the IP for just a few hours, so the bot cools down for a bit and moves on to the next domain.

holysoles 26 minutes ago | parent [-]

I was referring to the rules/patterns provided by CrowdSec rather than the distribution of known "bad" IPs through their Central API. The default ban for traffic detected by your own CrowdSec instance is 4 hours, so that concern isn't very relevant in that case. Decisions that come from other users via the Central API can last quite a bit longer (I see some at ~6 days), but you also don't have to use those if you're worried about that scenario.
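For illustration, a time-limited ban can be sketched like this. This is not how CrowdSec implements it; `TemporaryBanList` is a hypothetical name, and the 4-hour default is just the figure mentioned above. The ban simply expires, so a recycled IP is not blocked indefinitely.

```python
import time

# Assumed default, taken from the 4-hour figure mentioned in the comment.
DEFAULT_BAN_SECONDS = 4 * 60 * 60


class TemporaryBanList:
    """In-memory ban list where every ban expires after a fixed duration."""

    def __init__(self, clock=time.monotonic):
        self._expires_at = {}  # ip -> clock value at which the ban ends
        self._clock = clock    # injectable so expiry can be tested

    def ban(self, ip, duration=DEFAULT_BAN_SECONDS):
        """Ban `ip` for `duration` seconds, starting now."""
        self._expires_at[ip] = self._clock() + duration

    def is_banned(self, ip):
        """Return True while the ban is active; drop the entry once expired."""
        expiry = self._expires_at.get(ip)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires_at[ip]
            return False
        return True
```

Once the duration has passed, whoever holds that IP next is treated as a fresh client.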

mattgreenrocks 2 hours ago | parent | prev [-]

It would be such a terrible thing if some LLM scrapers were using those responses to learn more about PHP, especially because of that recent paper pointing out it doesn't take that many data points to poison LLMs.