ok123456 6 days ago

Why is kernel.org doing this for essentially static content? Cache-Control headers and ETags should solve this. Also, the Linux kernel has solved the C10K problem.
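
For readers unfamiliar with the mechanism mentioned here, this is a minimal sketch of how an origin could answer conditional requests using an ETag and a Cache-Control header. The function names, the hash-based ETag, and the one-hour `max_age` are assumptions made for illustration; nothing here describes how kernel.org is actually configured.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Hypothetical validator: a quoted, truncated content hash.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def etag_matches(if_none_match: str, etag: str) -> bool:
    # "*" matches any current representation.
    if if_none_match.strip() == "*":
        return True
    # Simplification: split on commas, which is safe here because the
    # hex ETags produced by make_etag never contain a comma.
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        # If-None-Match uses weak comparison, so ignore a W/ prefix.
        if candidate.startswith("W/"):
            candidate = candidate[2:]
        if candidate == etag:
            return True
    return False


def respond(body: bytes, if_none_match=None, max_age=3600):
    """Return (status, headers, payload) for a GET of a static page."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        # Assumed policy: any cache may reuse the page for max_age seconds.
        "Cache-Control": f"public, max-age={max_age}",
    }
    if if_none_match is not None and etag_matches(if_none_match, etag):
        # The client's copy is still valid: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

A first request gets a 200 with the full page; a repeat request that sends the stored ETag back in If-None-Match gets a 304 with an empty body. Note that this only helps with clients that actually send conditional requests.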

mixologic 6 days ago | parent | next [-]

Because it's static content that is almost never cached, since it's infrequently accessed. Thus, almost every hit goes to the origin.

ok123456 6 days ago | parent [-]

The contents in question are statically generated, 1-3 KB HTML files. Hosting a single image would be the equivalent of cold-serving hundreds of requests.

Putting up a scraper shield seems like it's more of a political statement than a solution to a real technical problem. It's also antithetical to open collaboration and an open internet of which Linux is a product.
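
To put rough numbers on the size comparison in this comment, here is a back-of-the-envelope sketch. The 500 KB image size is a hypothetical figure chosen for illustration (the comment does not give one), and the calculation counts response bytes only, not per-request CPU or connection overhead.

```python
def equivalent_page_requests(image_bytes: int, page_bytes: int) -> int:
    # How many whole pages of page_bytes fit in one image of image_bytes.
    return image_bytes // page_bytes


IMAGE_BYTES = 500 * 1024  # hypothetical image size, not from the thread

for page_kb in (1, 2, 3):  # the 1-3 KB range quoted in the comment
    count = equivalent_page_requests(IMAGE_BYTES, page_kb * 1024)
    print(f"{page_kb} KB page: one image ~ {count} page responses")
```

Under that assumption, one image transfer costs about as much bandwidth as 166 to 500 uncached page responses.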

whatevaa 6 days ago | parent | prev [-]

Bots don't respect that.

1gn15 6 days ago | parent [-]

Use a CDN.

trenchpilgrim 6 days ago | parent | next [-]

A great option for most people, and indeed Anubis' README recommends using Cloudflare if possible. However, not everyone can use a paid CDN. Some people can't pay because their payment methods aren't accepted. Some people need to serve content, or serve it to countries, that a major CDN can't handle for legal and compliance reasons. Some organizations need their own independent infrastructure to serve their organizational mission.

Aachen 5 days ago | parent | prev [-]

So that someone else pays for your bandwidth while seeing who is interested in this content? Idk about that solution

ok123456 5 days ago | parent [-]

Maybe the Linux Foundation should cover kernel.org's hosting costs?

Aachen 4 days ago | parent [-]

Ah true, I think I might have forgotten the context. They're big enough to do that. Most people I see recommending a CDN are freeloading on some big corp's systems