l1k 9 hours ago

Fun fact (or not so fun if you're a subscriber):

Somebody is spamming kernel mailing lists under the name Marian Corcodel with a 26 MByte message multiple times per day containing a collection of nonsensical patches. Looks AI-generated, perhaps with the intention to poison LLMs. This has been going on for a few days now.

https://lore.kernel.org/all/CAGg4U=GNtCObd_Nbm_1Rr5FEvPb69Yz...

probably_wrong 9 hours ago | parent | next [-]

I'd warn HN users not to click on that link, simply because it will load a 26 MB message, which will likely put quite a strain on kernel.org's servers if everyone here does it.

sillysaurusx 7 hours ago | parent | next [-]

I was curious how much of an impact HN could have. Napkin math:

HN gets 24M views a day. Assume those views are evenly distributed across the 30 front-page posts (they aren’t), and that’s about 1M views for each front-page post, assuming each user clicks on one post.

By the rule of 10s (also not exact), there are 10x fewer views on comment threads. So assume around 100k views on a comment thread as a theoretical average.

If everyone viewing this thread clicked on the link, that would be 2.6 TB of transfer across the day. But by the rule of 10s we have to assume 10x fewer people will interact (upvote, click, anything) than view. So we’re down to 260GB of transfer over the course of a day.

I wonder how close that is. It seems plausible that a link in the top comment of a thread could garner 10,000 clicks.

That’s still about one click every 8 seconds. At 10Mbit/s a 26 MB download takes about 21 seconds, so that would indeed overwhelm the server by a factor of about 2.5x. But I clicked through and it loaded in just a few seconds, so presumably the pipe is faster than 10Mbit/s.

Another caveat is that many websites are already several megabytes, so it seems strange that 26 MB would be the breaking point for a reasonable web host.
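
For anyone who wants to redo the napkin math with different guesses, here is a minimal sketch in Python. The 100k views, the 10x drop-off, the 26 MB size, and the 10Mbit/s pipe are the figures from this comment; the function and parameter names are made up for illustration, and decimal megabytes are assumed.

```python
SECONDS_PER_DAY = 86_400

def napkin(message_bytes=26 * 10**6,    # 26 MB message (decimal MB assumed)
           thread_views=100_000,        # guessed daily views of the thread
           interaction_divisor=10,      # "rule of 10s": 1 in 10 viewers clicks
           link_bits_per_second=10 * 10**6):  # hypothetical 10 Mbit/s pipe
    """Return the rough load figures for one day of clicks on the link."""
    clicks = thread_views // interaction_divisor
    total_bytes = clicks * message_bytes
    seconds_between_clicks = SECONDS_PER_DAY / clicks
    seconds_per_download = message_bytes * 8 / link_bits_per_second
    # Values above 1 mean downloads arrive faster than the pipe can drain them.
    overload_factor = seconds_per_download / seconds_between_clicks
    return {
        "clicks": clicks,
        "total_bytes": total_bytes,
        "seconds_between_clicks": seconds_between_clicks,
        "seconds_per_download": seconds_per_download,
        "overload_factor": overload_factor,
    }

result = napkin()
for name, value in result.items():
    print(f"{name}: {value}")
```

With the defaults this gives 10,000 clicks, 260 GB per day, one click every 8.64 seconds, and an overload factor of roughly 2.4, which is in line with the "about 2.5x" estimate.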

devsda 4 hours ago | parent | next [-]

Don't forget scrapers. Scrapers can be biased towards top posts and comments.

perching_aix 6 hours ago | parent | prev | next [-]

> HN gets 24M views a day

This is available info?

shagie 4 hours ago | parent [-]

https://news.ycombinator.com/item?id=33450094

2022 from dang:

> There's no stats page but last I checked it was around 5M monthly unique users (depending on how you count them), perhaps 10M page views a day (including a guess at API traffic), and something like 1300 submissions (stories) and 13k comments a day.

> The most interesting number is the 1300 submissions because that hasn't grown since 2011 - it just fluctuates. Everything else has been growing more or less linearly for a long time, which is how we like it.

kraftman 6 hours ago | parent | prev [-]

Plenty of people deliberately posting to HN have their servers overwhelmed.

jedberg 2 hours ago | parent | prev | next [-]

It's mirrored by Akamai, which is designed to serve the same thing over and over. It won't really hurt anyone.

jmalicki 8 hours ago | parent | prev | next [-]

Does a 26MB message actually cause noticeable strain on the server much beyond loading the page? I would think serving a contiguous 26MB chunk would be roughly comparable to serving, say, 20 normal-sized messages.

mort96 5 hours ago | parent | next [-]

Way off. I went to an arbitrary message on lore.kernel.org. Firefox's network inspector says 7.37kB was transferred, including stylesheets. 26MB is roughly 3500x 7.37kB.

jmalicki 4 hours ago | parent [-]

Data transferred is not what generates load. sendfile() is about the lowest-overhead thing a web server does.
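
As a rough illustration of what that looks like from user space, here is a minimal Python sketch that pushes a file into a socket with `socket.sendfile()`, which uses `os.sendfile()` where the platform supports it and falls back to plain `send()` otherwise. It says nothing about how lore.kernel.org actually serves messages; the helper name, the 16 KiB payload, and the local socket pair are assumptions made only for the demo.

```python
import os
import socket
import tempfile

def send_file_over_socket(path: str, sock: socket.socket) -> int:
    """Send the whole file at `path` over `sock` and return the bytes sent."""
    with open(path, "rb") as f:
        # With os.sendfile() underneath, the kernel copies file data to the
        # socket directly instead of round-tripping through user-space buffers.
        return sock.sendfile(f)

# Demo: a small temporary file sent across a local socket pair.
payload = os.urandom(16 * 1024)  # small enough to fit in the socket buffer
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)

server_side, client_side = socket.socketpair()
try:
    sent = send_file_over_socket(tmp.name, server_side)
    server_side.close()  # closing the sender signals EOF to the reader
    received = b""
    while chunk := client_side.recv(65536):
        received += chunk
finally:
    server_side.close()
    client_side.close()
    os.unlink(tmp.name)

print(sent, received == payload)
```

The sketch only shows the call pattern. It does not measure load, so it neither confirms nor refutes the claim about overhead.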

leonidasrup 8 hours ago | parent | prev | next [-]

https://web.archive.org/web/20260518134447/https://lore.kern...

OuterVale 8 hours ago | parent | next [-]

I don't think needlessly straining the Internet Archive's servers is any better.

embedding-shape 7 hours ago | parent [-]

IA's infra is slightly better for big loads though. They tend to just have higher latency rather than aborted or timed-out requests, for better or worse. It can be a bit slow, but as long as you're ready to wait, you'll eventually get the response. Usually hosts just cut you off with a hardcoded timeout instead, which for people on high-latency/low-bandwidth connections can be super fun.

grosswait 8 hours ago | parent | prev [-]

Will clicking on this link download a 26MB message, putting extra load on archive.org's servers?

neksn 5 hours ago | parent | prev | next [-]

The page is gzipped in transit, so only 5 MB of traffic is generated.
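
Going from 26 MB to 5 MB is about a 5x reduction, which is plausible for repetitive patch text. A minimal Python sketch of how to measure the ratio yourself, using the standard `gzip` module on synthetic patch-like lines (the sample data and the `gzip_ratio` helper are invented for illustration and are not taken from the actual message):

```python
import gzip

def gzip_ratio(raw: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(raw)) / len(raw)

# Synthetic stand-in for patch text: many near-identical added lines.
sample = b"".join(
    b"+\tdev->stats.rx_packets += %d;\n" % i for i in range(20_000)
)

ratio = gzip_ratio(sample)
print(f"original: {len(sample)} bytes, ratio: {ratio:.3f}")
```

Real mail archives will compress differently from this toy input, so the ratio printed here should not be read as a prediction for the message in question.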

shevy-java 7 hours ago | parent | prev [-]

Thank you for the warning. I rarely click on links these days though; the only exception I make is for HN links to main articles.

embedding-shape 7 hours ago | parent [-]

How do you navigate the web? Is everything CTRL+L and then manually typing the address, or do you have some fancier solution?

kelsey98765431 6 hours ago | parent [-]

the web is useless outside of hn

embedding-shape 5 hours ago | parent [-]

90% of it yeah, but the 10% is still worth it, like HN.

Phelinofist 6 hours ago | parent | prev [-]

> perhaps with the intention to poison LLMs

How does that work?

stefan_ 6 hours ago | parent [-]

This is just nonsensical changes and slurs, but particularly degenerate input data can cause big issues in training:

https://x.com/gabriberton/status/2051873677998956851