rootsudo 6 days ago

The instant I saw it, I knew it was Anubis. I hope the anime catgirls never disappear from that project :)

hdndiebf 6 days ago | parent | next [-]

This anime thing is the one thing about computer culture that I just don't seem to get. I didn't get it as a child, when suddenly half of children's cartoons became anime and I just disliked the aesthetic. I didn't get it in school, when people started reading manga. I'll probably never get it. Therefore I sincerely hope they do go away from Anubis, so I can dwell further in my ignorance.

timcambrant 6 days ago | parent | next [-]

I feel the same. It's a distinct part of nerd culture.

In the '70s, if you were into computers you were most likely also a fan of Star Trek. I remember an anecdote from the 1990s when an entire dial-up ISP was troubleshooting its modem pools because zero people were connected and they assumed there was an outage. The apparent outage turned out to coincide exactly with that week's episode of The X-Files airing in their time zone. Just as the credits rolled, all the modems suddenly lit up as people connected to IRC and Usenet to chat about the episode. In ~1994, close to 100% of residential internet users also happened to follow The X-Files on linear television. There was essentially a 1:1 overlap between computer nerds and sci-fi nerds.

Today's analog seems to be that almost all nerds love anime and Andy Weir books and some of us feel a bit alienated by that.

SnuffBox 5 days ago | parent | next [-]

> Today's analog seems to be that almost all nerds love anime and Andy Weir books and some of us feel a bit alienated by that.

Especially because (from my observation) modern "nerds" who enjoy anime seem to relish bringing it (and various sex-related things) up at inappropriate times and are generally emotionally immature.

It's quite refreshing seeing that other people have similar lines of thinking and that I'm not alone in feeling somewhat alienated.

cdrini 6 days ago | parent | prev [-]

I think I'd push back and say that nerd culture is no longer really a single thing. Back in the Star Trek days, the nerd "community" was small enough that Star Trek could be a defining quality shared by the majority. Now the nerd community has grown, and there are too many people to have defining parts of the culture that are loved by the majority.

E.g. if the nerd community had x people in the Star Trek days, now there are more than x nerds who like anime and more than x nerds who dislike it, and the total size is much bigger than both.

armada651 6 days ago | parent | prev | next [-]

But what if they choose a different image that you don't get? What if they used an abstract modern art piece that no one gets? Oh the horror!

Aachen 6 days ago | parent | prev | next [-]

You don't have to get it to be able to accept that others like it. Why not let them have their fun?

This sounds more as though you actively dislike anime than merely not seeing the appeal or being "ignorant". If you were to ignore it, there wouldn't be an issue...

account42 6 days ago | parent [-]

They can have their fun on their personal websites. Subjecting others to your "fun" when you know it annoys them is not cool.

Aachen 6 days ago | parent | next [-]

Well, this is their personal project. You're welcome to make your own, or you can remove the branding if you want: it's openly licensed. Or if you're not a coder, they even offer to remove the branding for you if you support the project.

I don't get the impression that it's meant to be annoying, but rather that it's a personal preference. I can't know that, though white-labeling is a common thing people pay for without the original brand having made their logo extra ugly on purpose.

balamatom 6 days ago | parent | prev [-]

While subjecting the entire Internet to industrial-scale abuse by inconsiderate and poorly written crawlers for the sake of building an overhyped "whatever" is of course perfectly acceptable.

Aachen 5 days ago | parent [-]

Well, to be fair, that's not our doing, so it's not really an argument for why one should accept something one apparently dislikes. (I find the character funny myself and it brings a fun moment when it flashes by, but I can understand/accept that others see it differently, of course.)

balamatom 5 days ago | parent [-]

Yes, an argument for why one should accept something one apparently dislikes usually only works when it's from authority.

balamatom 6 days ago | parent | prev [-]

Might've caught on because anime had plots, instead of assuming viewers have the attention spans of idiots, like Western kids' shows (and, in the 21st century, software) tend to do.

timcambrant 6 days ago | parent [-]

I don't think it's relevant to debate whether anime or other forms of media are objectively better. But as someone who has never understood anime, I view mainstream Western TV series as filled with hours of cleverly written dialogue and long story arcs, whereas the little anime I've watched seems to mostly be overly dramatic colorful action scenes with intense screamed dialogue and strange bodily noises. Should we maybe assume that we are both a bit ignorant of the preferences of others?

balamatom 6 days ago | parent [-]

Let's rather assume that you're the kind of person who debates a thing by first saying that it's not relevant to debate, then putting forward a pretty out-of-context comparison, and finally concluding that I should feel bad about myself. That kind of story arc does seem to correlate with finding mainstream Western TV worthwhile; there's something structurally similar to the funny way your thought went.

bawolff 6 days ago | parent | prev | next [-]

It's nice to see there is still some whimsy on the internet.

Everything got so corporate and sterile.

account42 6 days ago | parent [-]

Everyone copying the same Japanese cartoon style isn't any better than everyone copying Corporate Memphis.

lordhumphrey 6 days ago | parent | prev [-]

I think it definitely would be. Perhaps a small one, but still.

ghssds 6 days ago | parent | prev | next [-]

As Anubis the Egyptian god is represented as a dog-headed human, I thought the drawing was of a dog-girl.

nemomarx 6 days ago | parent [-]

Perhaps a jackal girl? I guess "cat girl" gets used very broadly to mean kemomimi (pardon the spelling), though.

m4rtink 6 days ago | parent [-]

kemono == animal

mimi == ears

Der_Einzige 6 days ago | parent | prev | next [-]

It's not the only project with an anime girl as its mascot.

ComfyUI has what I think is a foxgirl as its official mascot, and that's the de facto primary UI for generating Stable Diffusion or related content.

SnuffBox 5 days ago | parent [-]

I've noticed the word "comfy" used more than usual recently, often by the anime-obsessed. Is there some cultural relevance I'm not understanding?

AlexeyBelov 4 days ago | parent [-]

OK, you've been all over this thread being negative and angry. On a new account, which makes it even more sus. Take a break from social media.

bakugo 6 days ago | parent | prev | next [-]

It's more likely that the project itself will disappear into irrelevance as soon as AI scrapers bother implementing the PoW (which is trivial for them, as the post explains) or figure out that they can simply remove "Mozilla" from their user-agent to bypass it entirely.
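
For illustration, a minimal sketch of that second bypass, assuming (as described in this thread) that Anubis's default policy only challenges requests whose User-Agent contains "Mozilla"; the URL and UA string here are hypothetical:

```python
import urllib.request

# Hypothetical Anubis-protected site; a UA without "Mozilla" in it
# would, per the default policy described above, skip the challenge.
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": "MyCrawler/1.0"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # page served directly, no interstitial
```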

debugnik 6 days ago | parent | next [-]

> as AI scrapers bother implementing the PoW

That's what it's for, isn't it? Make crawling slower and more expensive. Shitty crawlers not being able to run the PoW efficiently or at all is just a plus. Although:

> which is trivial for them, as the post explains

Sadly the site's being hugged to death right now so I can't really tell if I'm missing part of your argument here.

> figure out that they can simply remove "Mozilla" from their user-agent

And flag themselves in the logs to get separately blocked or rate limited. Servers win if malicious bots identify themselves again, and forcing them to change the user agent does that.

throwawayffffas 6 days ago | parent | next [-]

> That's what it's for, isn't it? Make crawling slower and more expensive.

The default settings produce a computational cost of milliseconds for a week of access. For this to be relevant it would have to be significantly more expensive, to the point where it would interfere with human access.
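
A minimal sketch of that kind of SHA-256 proof of work, assuming the default difficulty of four leading zero hex digits (~2^16 expected hashes; the challenge string is made up):

```python
import hashlib
import itertools
import time

# Sketch of an Anubis-style proof of work: find a nonce such that
# sha256(challenge + nonce) starts with `difficulty` zero hex digits.
def solve(challenge: str, difficulty: int = 4) -> int:
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

start = time.perf_counter()
print(solve("example-challenge"), f"{time.perf_counter() - start:.3f}s")
# Even in interpreted Python this finishes in a fraction of a second,
# which is the "milliseconds for a week of access" point above.
```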

mfost 6 days ago | parent | next [-]

I thought the point (which the article misses) is that a token gives you an identity, and an identity can be tracked and rate limited.

So a crawler that behaves ethically and puts very little strain on the server should indeed be able to crawl for a whole week on cheap compute; one that hammers the server hard will not.
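
A sketch of what that could look like server-side: a sliding-window rate limit keyed by the PoW token. The limits are hypothetical and this is not Anubis's actual code:

```python
import time
from collections import defaultdict

WINDOW_S = 60.0   # hypothetical: look at the last minute
LIMIT = 120       # hypothetical: max 120 requests/minute per token
_hits: dict[str, list[float]] = defaultdict(list)

def allow(token: str) -> bool:
    """Admit a request iff this PoW token is under its rate limit."""
    now = time.monotonic()
    _hits[token] = [t for t in _hits[token] if now - t < WINDOW_S]
    if len(_hits[token]) >= LIMIT:
        return False  # polite crawlers never get here; hammering ones do
    _hits[token].append(now)
    return True
```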

throwawayffffas 5 days ago | parent [-]

Sure, but it's really cheap to mint new identities; each node in their scraping cluster can mint hundreds of thousands of tokens per second.

Provisioning new IPs is probably more costly than calculating the tokens, at least with the default difficulty setting.

seba_dos1 6 days ago | parent | prev [-]

...unless you're sus, in which case the difficulty increases. And if you unleash a single scraping bot, you're not a problem anyway. It's for botnets of thousands, mimicking browsers on residential connections to make them hard to filter out or rate limit, effectively DDoSing the server.

Perhaps you just don't realize how much the scraping load has increased in the last two years or so. If your server can stay up after deploying Anubis, you've already won.

dale_glass 6 days ago | parent [-]

How is it going to hurt those?

If it's an actual botnet, then it's hijacked computers belonging to other people, who are the ones paying the power bills. The attacker doesn't care that each computer takes a long time to calculate. If you have 1000 computers each spending 5s/page, then your botnet can retrieve 200 pages/s.

If it's just a cloud deployment, still it has resources that vastly outstrip a normal person's.

The fundamental issue is that you can't serve example.com slower than a legitimate user on a crappy 10-year-old laptop could tolerate, because that starts losing you real human users. So if, let's say, a user is happy to wait 5 seconds per page at most, then this is absolutely no obstacle to a modern 128-core Epyc. If you make it troublesome for the 128-core monster, then no normal person will find the site usable.

throwawayffffas 6 days ago | parent | next [-]

It's not really hijacked computers; there is a whole market for VPNs with residential exit nodes.

The way I think it works is they provide a free VPN to the users, or even pay their internet bill, and then sell access to their IP.

The client just connects to a vpn and has a residential exit IP.

The cost of the VPN is probably higher than the cost for the proof of work though.

seba_dos1 6 days ago | parent | prev [-]

> How is it going to hurt those?

In an endless cat-and-mouse game, it won't.

But right now, it does, as these bots tend to be really dumb (presumably, a more competent botnet user wouldn't have it do the equivalent of copying Wikipedia by crawling through every single page in the first place). With a bit of luck, it will be enough until the bubble bursts and the problem is gone, and you won't need to deploy Anubis just to keep your server running anymore.

shkkmo 6 days ago | parent | prev | next [-]

The explanation of how the estimate is made is more detailed, but here is the referenced conclusion:

>> So (11508 websites * 2^16 sha256 operations) / 2^21, that’s about 6 minutes to mine enough tokens for every single Anubis deployment in the world. That means the cost of unrestricted crawler access to the internet for a week is approximately $0.

>> In fact, I don’t think we reach a single cent per month in compute costs until several million sites have deployed Anubis.
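
The arithmetic in the quote checks out; reproducing it, with the hash rate assumed at ~2^21 SHA-256/s for a free-tier VPS as in the article:

```python
sites = 11508             # Anubis deployments counted in the article
hashes_per_token = 2**16  # expected work at the default difficulty
hash_rate = 2**21         # assumed: ~2M SHA-256/s on a free-tier VPS

seconds = sites * hashes_per_token / hash_rate
print(f"{seconds / 60:.1f} minutes")  # ≈ 6.0 minutes for every deployment
```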

kbelder 6 days ago | parent | next [-]

If you use one solution to browse the entire site, you're linking every pageload to the same session and can then be easily singled out and blocked. The idea that you can scan a site for a week by solving the riddle once is incorrect. That only works for non-abusers.

shkkmo 5 days ago | parent [-]

Well, since they can get a unique token for every site every 6 minutes using only a free GCP VPS, that doesn't really matter: scraping can easily be spread out across tokens, or they can cheaply and quickly get a new one whenever the old one gets blocked.

hiccuphippo 6 days ago | parent | prev | next [-]

Wasn't SHA-256 designed to be very fast to compute? They should be using bcrypt or something similar.

throwawayffffas 6 days ago | parent [-]

Unless they require a new token for each request, or every x minutes, or something, it won't matter.

And as the poster mentioned, if you are running an AI model you probably have GPUs to spare. Unlike the dev working from a 5-year-old ThinkPad or their phone.

_flux 6 days ago | parent [-]

Apparently bcrypt has a design that makes it difficult to accelerate effectively on a GPU.

Indeed, a new token should be requested per request; the tokens could also be pre-calculated, so that while the user is browsing a page, the browser could compute tokens suitable for the next likely browsing targets (e.g. the "next" button).

The biggest downside I see is that mobile devices would likely suffer. Possibly the difficulty of the challenge is/should be varied by other metrics, such as the number of requests arriving per time unit from a class C network, etc.
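
A rough illustration of both ideas: bcrypt's cost factor doubles the work per increment, so a server could hand harder challenges to suspicious clients. Hypothetical parameters, not Anubis's actual scheme, and it needs the third-party `bcrypt` package:

```python
import time
import bcrypt  # pip install bcrypt

# Each +1 to the cost factor doubles the work, and bcrypt is hard to
# accelerate on GPUs, unlike SHA-256.
for rounds in (8, 10, 12):
    start = time.perf_counter()
    bcrypt.hashpw(b"challenge-nonce", bcrypt.gensalt(rounds=rounds))
    print(f"cost={rounds}: {time.perf_counter() - start:.3f}s")
```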

debugnik 6 days ago | parent | prev [-]

That's a matter of increasing the difficulty, isn't it? And if the added cost is really negligible, we can just switch to a "refresh" challenge for the same added latency, without burning energy for no reason.

Retr0id 6 days ago | parent | next [-]

If you increase the difficulty much beyond what it currently is, legitimate users end up having to wait for ages.

debugnik 6 days ago | parent [-]

And if you don't increase it, crawlers will DoS the sites again and legitimate users will have to wait until the next tech hype bubble for the site to load, which is the reason why software like Anubis is being installed in the first place.

shkkmo 6 days ago | parent [-]

If you triple the difficulty, the cost of solving the PoW is still negligible to the crawlers, but you've harmed real users even more.

The reason Anubis works is not the PoW; it's that the dev time needed to implement a bypass takes out the lowest-effort bots. Thus the correct response is to keep the PoW difficulty low so you minimize harm to real users. Or better yet, implement your own custom check that doesn't use any PoW at all and relies on even greater obscurity to block the low-effort bots.

The more Anubis is used, the less effective it is and the more it harms real users.

therein 6 days ago | parent | prev | next [-]

I am guessing you don't realize that this means people not using the latest generation of phones will suffer.

debugnik 6 days ago | parent [-]

I'm not using the latest generation of phones, not in the slightest, and I don't really care, because the alternative to Anubis-like interstitials is sites not loading at all when they're mass-crawled to death.

dcminter 6 days ago | parent | prev [-]

> Sadly the site's being hugged to death right now

Luckily someone had already captured an archive snapshot: https://archive.ph/BSh1l

skydhash 6 days ago | parent | prev | next [-]

It's more about the (intentional?) DDoS from AI scrapers than about preventing them from accessing the content. Bandwidth is not cheap.

unclad5968 6 days ago | parent | prev | next [-]

I'm not on Firefox or any Firefox derivative and I still get anime cat girls making sure I'm not a bot.

nemomarx 6 days ago | parent [-]

"Mozilla" is used in the user agent string of all major browsers for historical reasons, but not necessarily in headless clients and the like.
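
For example, here are representative user agent strings (exact versions vary); note the shared "Mozilla/5.0" prefix, kept purely for compatibility:

```
Chrome:  Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
Firefox: Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0
Safari:  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15
curl:    curl/8.5.0   (no "Mozilla", so it would not be challenged)
```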

unclad5968 6 days ago | parent [-]

Oh that's interesting, I had no idea.

seabrookmx 6 days ago | parent [-]

There are some sites[1] that can print your user agent for you. Try it in a few different browsers and you'll be surprised. They're honestly unhinged. I have no idea why we still use this header in 2025!

[1]: https://dnschecker.org/user-agent-info.php

dingnuts 6 days ago | parent | prev [-]

[flagged]

verteu 6 days ago | parent | next [-]

> PoW increases the cost for the bots which is great. Trivial to implement, sure, but that added cost will add up quickly.

No, the article estimates it would cost less than a single penny to scrape all pages of 1,000,000 distinct Anubis-guarded websites for an entire month.

thunderfork 6 days ago | parent [-]

Once you've built the system that lets you do that, maybe. You still have to build it, though, so it's still raising the cost floor.

vmttmv 6 days ago | parent [-]

But... how? When the author ran the numbers, the rough estimate was solving challenges at a rate of 10,000 per 5 minutes on a single instance of the free tier of Google Compute. That is an insignificant load at an even more insignificant cost.

guappa 6 days ago | parent [-]

That's insanely slow compared to how fast they normally scrape.

verteu 5 days ago | parent [-]

As mentioned in the article, mining one token gets you unfettered access for 7 days.

userbinator 6 days ago | parent | prev | next [-]

I thought HN was anti-copyright and anti-imaginary-property, or at least that the bulk of its users were. Yet all of a sudden, "but AI!!!!1"?

> a federal crime

The rest of the world doesn't care.

klabb3 6 days ago | parent [-]

> I thought HN was anti-copyright

Maybe. But what’s happening is ”copyright for thee, not for me”, not a universal relaxation of copyright. This loophole exploitation by behemoths doesn’t advance any ideological goals; it only inflames the situation, because now you have an adversarial topology. You can see this clearly in practice: more and more resources are going into defense and protection of data than ever before. Fingerprinting, captchas, paywalls, login walls, etc.

altairprime 6 days ago | parent | prev | next [-]

Don’t forget signed attestations from “user probably has skin in the game” cloud providers like iCloud (already live in Safari and accepted by Cloudflare, IIRC) — not because they identify you, but because abusive behavior will trigger attestation-provider rate limiting and termination of service (which, in Apple’s case, potentially includes a console kill for the associated hardware). It’s not very popular to discuss on HN, but I bet Anubis could add support for it regardless :)

https://datatracker.ietf.org/wg/privacypass/about/

https://www.w3.org/TR/vc-overview/

shkkmo 6 days ago | parent | prev | next [-]

> PoW increases the cost for the bots which is great.

But not by any meaningful amount, as explained in the article. All it actually does is rely on its obscurity while interfering with legitimate use.

nialv7 6 days ago | parent | prev [-]

> Fuck AI scrapers, and fuck all this copyright infringement at scale.

Yes, fuck them. The problem is that Anubis is not doing the job here. As the article explains, Anubis currently adds not a single cent to the AI scrapers' costs. For Anubis to become effective against scrapers, it will necessarily have to become quite annoying for legitimate users.

Gibbon1 6 days ago | parent [-]

Best response to AI scrapers is to poison their models.

nemomarx 6 days ago | parent [-]

how well is modern poisoning holding up?

CursedSilicon 6 days ago | parent | next [-]

I'll tell you in a second. First I wanna try adding gasoline to my spaghetti as suggested by Google's search

snerbles 6 days ago | parent [-]

A balanced diet of hydrocarbons in your carbohydrates!

dale_glass 6 days ago | parent | prev | next [-]

To the best of my knowledge, it never really worked.

Yes, it probably works in the lab, under carefully picked conditions, but in the wild I've yet to see any effect whatsoever. Nobody in the AI communities seems to be complaining about it, models keep getting better, and people have even intentionally trained on poisoned images just to show it can be done.

IMO, in the long run it's a complete dead end as a strategy. There are many models, and poisoning can't target all of them at once. Even effective poisoning can simply be dealt with by finding an algorithm that doesn't care about it.

codedokode 6 days ago | parent | prev [-]

What about appealing to ethics, i.e. posting messages about how a poor catgirl ended up on the street because AI took her job? To make AI refuse to reply due to ethical concerns?

guappa 6 days ago | parent | prev | next [-]

We all know it's doomed

balamatom 6 days ago | parent [-]

That's called a self-fulfilling prophecy and is not in fact mandatory to participate in.

guappa 6 days ago | parent [-]

I'm not making any git commits to remove it…

balamatom 6 days ago | parent [-]

Probably talking about different doomed things then, sorry.

NelsonMinar 6 days ago | parent | prev [-]

¡Nyah!