| ▲ | Claude Code Is Steganographically Marking Requests (thereallo.dev) |
| 505 points by kirushik 2 hours ago | 138 comments |
| |
|
| ▲ | meowface an hour ago | parent | next [-] |
| Value judgment aside: I am a bit surprised at how sloppily they did this. I think they could've achieved the same effect while decreasing the odds of detection via reverse engineering. (This field is known as "underhanded code", coined by the Underhanded C contest: https://www.underhanded-c.org. It's a little-known "art"; little-known for probably self-explanatory reasons. There are much cleverer ways of achieving objectives like this. One obviously being you can move more out of the client and into the server, but the other being you can write plausibly deniable client code in a much more benign-seeming way than this. Some of what they added can only be done on the client, but I think some could've been moved, and the client-required parts could've been done more subtly and credibly.) It's possible they knew the JS bundle gets so heavily scrutinized that it'd eventually get spotted and reported on regardless so they didn't bother doing something more subtle and duplicitous. But still seems slightly lazy. |
| |
| ▲ | superfrank 14 minutes ago | parent | next [-] | | It's also possible that there are more in-depth detection methods and that this was just a cheap and easy first step that hasn't been removed because it catches a lot of less sophisticated bad actors. It's unlikely that this will stop a big AI lab from distilling their model if they're really determined, but A) it may be enough to stop a bunch of fly-by-night token resellers looking to make a quick buck and B) you never know when one person at one of those big labs will mess up and forget to install whatever workaround they have and out themselves. I think of it like if you have a problem with birds in your yard so you go buy one of those plastic owls. The owl scares away most of the birds, but not all of them, so you go and buy some ultrasonic noise thing to scare them away (I'm just making something up). Just because you bought the new ultrasonic thing though, that doesn't mean you're going to take the owl down. You leave it up because now you've got two layers of defense instead of one. | |
| ▲ | overgard 3 minutes ago | parent | prev | next [-] | | Well considering how Claude is vibe coded, I can't say I'm really surprised by sloppiness at all. I've been moving more towards Codex and OpenCode not because the Anthropic models are bad, but because Claude seems to break something new and annoying every day. | |
| ▲ | m-hodges 43 minutes ago | parent | prev | next [-] | | They also could have been much more interesting in the approach. LLMs can use their token distributions to generate stegotext that reads like plausible prose but decodes to payloads.¹ ¹ https://github.com/hodgesmr/calgacus-mlx | | | |
| ▲ | radicalbyte an hour ago | parent | prev | next [-] | | Claude Code are slopmaxxxing and you're considering their "judgement"? :-) | |
| ▲ | hn_throwaway_99 26 minutes ago | parent | prev | next [-] | | At first I was agreeing with you, that this seemed like a sloppy way to implement this that was sure to be pretty quickly detected, but there is another possibility. Anthropic could have implemented this not as a durable detection system against proxying resellers, but instead as a point-in-time sampling system to detect where (and with what context) proxying reselling is currently happening. Sure, it would be detected eventually, but in the meantime Anthropic could gain useful snapshot data. | |
| ▲ | lumost 12 minutes ago | parent | prev | next [-] | | so all we need is someone to leak a sufficiently large amount of claude generations onto the open and private web for all other LLMs to mimic the same marking style? wouldn't this happen due to the massive amounts of spam/slop being released? | |
| ▲ | crossroadsguy 24 minutes ago | parent | prev | next [-] | | I finally bought Claude Pro (I am not coding etc these days so I just wanted to try it). The Claude desktop app is downright pathetic. I mean they could write a better one just with their own LLMs. What's stopping them? | | | |
| ▲ | skywhopper 44 minutes ago | parent | prev | next [-] | | Have you looked into anything about Claude Code, how it’s configured, how it interacts with your system, etc? Because “sloppy” is a defining characteristic. | |
| ▲ | skeptic_ai 42 minutes ago | parent | prev | next [-] | | It’s even funnier how this blew up in their faces. They even advertised pretty much all the providers on the Hacker News home page. Here is an excerpt, in case you missed it in the article: ‘’’
cn
baidu.com
alibaba-inc.com
alipay.com
antgroup-inc.cn
bytedance.net
kuaishou.com
xiaohongshu.com
jd.com
bilibili.co
iflytek.com
stepfun-inc.com
moonshot.ai
anyrouter.top
claude-code-hub.app
claude-opus.top
openclaude.me
proxyai.com
yunwu.ai
zenmux.ai ‘’’
You can view the full list here: https://cdn.thereallo.dev/blog/assets/cc-domains.js
const knownDomains = [
"cn",
"sankuai.com",
"netease.com",
"163.com",
"baidu-int.com",
"baidu.com",
"alibaba-inc.com",
"alipay.com",
"antgroup-inc.cn",
"kuaishou.com",
"bytedance.net",
"xiaohongshu.com",
"ctripcorp.com",
"jd.com",
"jdcloud.com",
"bilibili.co",
"iflytek.com",
"stepfun-inc.com",
"aliyuncs.com",
"cn-shanghai.fcapp.run",
"cn-beijing.fcapp.run",
"xaminim.com",
"moonshot.ai",
"anyrouter.top",
"packyapi.com",
"aicodemirror.com",
"aigocode.com",
"hongshan.com",
"iwhalecloud.com",
"dhcoder.net",
"lemongpt.top",
"zhihuiapi.top",
"intsig.net",
"high-five-ai.xyz",
"cloudsway.net",
"4sapi.com",
"529961.com",
"88996.cloud",
"88code.ai",
"88code.org",
"91code.pro",
"992236.xyz",
"ai.codeqaq.com",
"ai.hybgzs.com",
"ai.kjvhh.com",
"aicanapi.com",
"aicoding.sh",
"aifast.site",
"aihubmix.com",
"anmory.com",
"api.5202030.xyz",
"api.ablai.top",
"api.bianxie.ai",
"api.bltcy.ai",
"api.cpass.cc",
"api.dev88.tech",
"api.dreamger.com",
"api.expansion.chat",
"api.gueai.com",
"api.holdai.top",
"api.ikuncode.cc",
"api.lconai.com",
"api.linkapi.org",
"api.mkeai.com",
"api.nekoapi.com",
"api.oaipro.com",
"api.ruyun.fun",
"api.ssopen.top",
"api.tu-zi.com",
"api.uglycat.cc",
"api.v3.cm",
"api.whatai.cc",
"api.wpgzs.top",
"api.xty.app",
"api.yuegle.com",
"api.zzyu.me",
"apimart.ai",
"apipro.maynor1024.live",
"apiyi.com",
"applyj.hiapi.top",
"augmunt.com",
"b4u.qzz.io",
"clauddy.com",
"claude-code-hub.app",
"claude-opus.top",
"claudeide.net",
"co.yes.vg",
"code.wenwen-ai.com",
"code.x-aio.com",
"codeilab.com",
"cubence.com",
"deeprouter.top",
"dimaray.com",
"dmxapi.com",
"docs.aigc2d.com",
"duckcoding.com",
"fk.hshwk.org",
"flapcode.com",
"foxcode.hshwk.org",
"foxcode.rjj.cc",
"fuli.hxi.me",
"getgoapi.com",
"gpt.zhizengzeng.com",
"gptgod.cloud",
"gptkey.eu.org",
"gptpay.store",
"hdgsb.com",
"henapi.top",
"instcopilot-api.com",
"jeniya.top",
"jiekou.ai",
"kg-api.cloud",
"n1n.ai",
"new-api.u4vr.com",
"new.xychatai.com",
"one-api.bltcy.top",
"one.ocoolai.com",
"oneapi.paintbot.top",
"open.xiaojingai.com",
"openclaude.me",
"opus.gptuu.com",
"poloai.top",
"poloapi.top",
"privnode.com",
"proxyai.com",
"qinzhiai.com",
"right.codes",
"runanytime.hxi.me",
"sssaicode.com",
"store.zzyus.top",
"tiantianai.pro",
"uiuiapi.com",
"uniapi.ai",
"vip.undyingapi.com",
"wolfai.top",
"wzw.de5.net",
"wzw.pp.ua",
"xairouter.com",
"xaixapi.com",
"xiaohuapi.site",
"xiaohumini.site",
"xy.poloapi.com",
"yansd666.com",
"yansd666.top",
"yunwu.ai",
"yunwu.zeabur.app",
"zenmux.ai",
];
const labKeywords = [
"deepseek",
"moonshot",
"minimax",
"xaminim",
"zhipu",
"bigmodel",
"baichuan",
"stepfun",
"01ai",
"dashscope",
"volces",
] | | |
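For readers wondering what a check against these two lists might look like, here is a minimal sketch, assuming suffix matching for the domain list and substring matching for the keyword list (the "hostname contains deepseek" reading discussed elsewhere in this thread). The names `matchesDomain` and `classifyBaseUrl` are made up for illustration, the arrays are short excerpts of the lists quoted above, and none of this is claimed to be Claude Code's actual implementation.

```javascript
// Hypothetical sketch, NOT Claude Code's real code.
// Short excerpts of the two lists quoted in the comment above.
const knownDomains = ["cn", "baidu.com", "moonshot.ai", "yunwu.ai"];
const labKeywords = ["deepseek", "moonshot", "zhipu"];

// True when `host` equals `domain` or is a subdomain of it.
// Suffix matching is an assumption; the real matching rule is not shown in the thread.
function matchesDomain(host, domain) {
  return host === domain || host.endsWith("." + domain);
}

// Classify a base URL: returns "domain", "keyword", or null when nothing matches.
function classifyBaseUrl(baseUrl) {
  let host;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return null; // not a parseable URL
  }
  if (knownDomains.some((d) => matchesDomain(host, d))) return "domain";
  if (labKeywords.some((k) => host.includes(k))) return "keyword";
  return null;
}
```

Under these assumptions, `classifyBaseUrl("https://api.deepseek.com")` hits the keyword list and any host under `.cn` hits the domain list. A proxy on a freshly registered, neutrally named domain matches neither, which is the weakness several commenters point out.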
| ▲ | writeslowly 31 minutes ago | parent | next [-] | | The site collection seems pretty random. There's a mix of actual AI labs, extremely questionable resellers (like whatever "claude-opus.top" is), and then random consumer sites like baidu and xiaohongshu. | |
| ▲ | chvid 25 minutes ago | parent | prev | next [-] | | rhoooo - so this is where to go to get cheap Claudeo at 90% off the listing price! | |
| ▲ | hn_throwaway_99 20 minutes ago | parent | prev [-] | | You have an odd definition of "blew up in their faces". What, do you somehow think your average Claude Code user on HN is going to think "Oh wow, I'm sure I'll get a much better experience if instead of going to the standard Anthropic Claude API endpoint I go through xiaohongshu.com." |
| |
| ▲ | slopinthebag 30 minutes ago | parent | prev [-] | | It’s not surprising at all, they’re vibecoding Claude code so of course they are not going to get anything other than slop out of it. A novel or clever solution is just out of the question for them. |
|
|
| ▲ | VortexLain an hour ago | parent | prev | next [-] |
| Codex CLI is FOSS, unlike Claude Code, so Codex is less likely to do things like that, and it's one more reason to avoid Claude Code and Claude in general. Hopefully, many eyes will be looking into Codex for malicious things like that. |
| |
| ▲ | dannyw an hour ago | parent | next [-] | | It's released and signed by GitHub I believe (although not deterministic builds), but there's at least a little bit of provenance that you're getting the real repository. | |
| ▲ | algoth1 35 minutes ago | parent | prev [-] | | But wasn't Claude Code leaked? Why wasn't this found earlier? | | |
| ▲ | zeafoamrun 16 minutes ago | parent | next [-] | | It doesn't take long for them to vibe code new features for CC | |
| ▲ | bakugo 15 minutes ago | parent | prev [-] | | This specific form of steganography was not present when the leak happened, as far as I can tell. |
|
|
|
| ▲ | epistasis 22 minutes ago | parent | prev | next [-] |
| After loving Claude Code for most of its lifetime, I've been extremely annoyed by every change in the past months, even on the model level. There seem to be all sorts of continual under-the-cover changes like this one that make life harder. It feels like the entire product has been taken over by overly ambitious PMs that care more about making their mark than about improving the experience, and all of their marks have made me less productive. I've been using Pi with GLM5.2 the past few days, and though it's expensive, I find it far more productive and less annoying. The remote session plugin is far more reliable, I don't need to intuit some undocumented usage pattern to figure out how to use it well, and it just works. |
|
| ▲ | edude03 17 minutes ago | parent | prev | next [-] |
| I don't understand the privacy concerns the author is trying to highlight. Granted, doing anything "sneaky" will always raise suspicion once caught, but on the other hand, there would be no point in implementing these "security features" if they were upfront about how they work. And no, IMO steganography isn't security by obscurity, in the same way that using RSA and keeping the private key private isn't security by obscurity - keeping the private thing private is part of the security model. |
|
| ▲ | sebastiennight an hour ago | parent | prev | next [-] |
| Can somebody clarify for me - if ANTHROPIC_BASE_URL is set to a different provider... then isn't this "marked" system prompt being sent to that provider's API rather than Anthropic's? I understand how this can be useful to Anthropic if the 3rd-party is acting as a proxy (because they end up hitting the Claude API with the marked prompt), but it looks like requests where "hostname contains deepseek" would never be sending data to Anthropic. What am I missing? |
| |
| ▲ | pmxi an hour ago | parent | next [-] | | This catches Claude resellers. Meaning companies who proxy Claude traffic for users in, say, China. https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens... | | |
| ▲ | pishpash 3 minutes ago | parent | next [-] | | "Catch" as in made a list? | |
| ▲ | skeptic_ai 40 minutes ago | parent | prev [-] | | It won’t catch many now that it has been on the HN home page. And now the providers will be even more careful about upgrading the CC code. They might even provide their own agent to prevent this mockery. And isn’t what Anthropic did unauthorized use of another PC, which is kind of illegal? | | |
| ▲ | sandeepkd 29 minutes ago | parent [-] | | That's the thing: hoping to control things on the client side like this is a losing battle if you are dealing with technical clients. The best they can do is probably IP-based detection, but again, motivated clients would just set up bastion servers in allowed IP ranges. I am surprised they are even throwing resources at this kind of effort. |
|
| |
| ▲ | andrewmunsell an hour ago | parent | prev | next [-] | | My guess is for distillation, they need to forward the prompt to Anthropic to get the real Anthropic model's response so they can train their own models on it | |
| ▲ | dannyw an hour ago | parent | prev | next [-] | | The theory is probably Deepseek might be collecting those streams, and sending a portion of it to Anthropic to see what the Anthropic/Opus response would be. | |
| ▲ | andai 38 minutes ago | parent | prev | next [-] | | Did I understand correctly, that custom base URL triggers this behavior? So if I'm running Claude through a LLM proxy, I'm also affected? | |
| ▲ | nixosbestos 16 minutes ago | parent | prev [-] | | I am also really confused and annoyingly stuck on this. I understand that the model name might appear in prompts for distillation (I guess? "You are RipOffModelv2, learn from these responses from Claude")? I guess the only explanation is that there's a side-telemetry channel that still sends some data to Anthropic, regardless of ANTHROPIC_BASE_URL overrides. |
|
|
| ▲ | MattDamonSpace 2 hours ago | parent | prev | next [-] |
| “So the feature mostly punishes the exact people who are easier to fingerprint: normal developers doing weird but legitimate things” What’s the punishment here exactly? |
| |
| ▲ | pedropaulovc 2 hours ago | parent | next [-] | | Higher odds of being banned for legitimate usage. | |
| ▲ | bakugo 2 hours ago | parent | prev | next [-] | | Output poisoning and/or eventual account bans, if I had to guess. | |
| ▲ | realusername 2 hours ago | parent | prev [-] | | They probably run a heavily dumbed down version of the model, same as what they got caught doing with Fable. And that's also why, as a legitimate customer, I want none of it: you never know if you accidentally entered a zone they don't like. | | |
|
|
| ▲ | matheusmoreira an hour ago | parent | prev | next [-] |
| I reported a similar system prompt injection mechanism here: https://news.ycombinator.com/item?id=48259288 https://github.com/anthropics/claude-code/issues/62061 Looks like they just keep finding new "creative" uses for such things, as expected. I'll keep patching them out. |
|
| ▲ | wolttam an hour ago | parent | prev | next [-] |
| I used Claude Code for a month because my boss gifted me a sub and wanted me to try it. I used that month to complete a work project and then beef up my personal harness so I'd never have to deal with Anthropic (and these sorts of shenanigans) again. |
| |
| ▲ | thih9 an hour ago | parent | next [-] | | How do people build something like a personal harness? Are there tools for that or is it done from scratch? | | |
| ▲ | andai an hour ago | parent | next [-] | | I like this tutorial for an agent in 50 lines: http://minimal-agent.com/ And if you add one additional while loop, for user input, you can actually use it! :) https://gist.github.com/a-n-d-a-i/5461a662ef8a7ee0a5eb7778c8... | |
| ▲ | hakunin an hour ago | parent | prev | next [-] | | Not the comment author, but I use pi and customize it with my own extensions. Pi automatically tells models how to customize itself, so it's a pretty easy process. | |
| ▲ | nowittyusername an hour ago | parent | prev | next [-] | | Build it from scratch. Understanding the fundamentals of how agentic coding harnesses work is a must though if you're gonna go that route. I think everyone should take time and learn these things, maybe reverse engineer Codex CLI or something like that as a starter. That info is very valuable in this day and age. | | |
| ▲ | andai 36 minutes ago | parent [-] | | Can you say more about Codex? I'm using GPT-5.5 in my own harness and it's not liking it very well, so I'm thinking I ought to make it more Codexy so it's more ergonomic for it. (edit format, tool calls etc.) But haven't gotten around to it yet. |
| |
| ▲ | wolttam an hour ago | parent | prev | next [-] | | I started mine from scratch in 2023 because I wanted to use LLMs from a terminal and there was nothing else compelling at the time (nowadays there is pi and opencode). Harnesses are/can be incredibly simple things, not much more than an HTTP client that renders things in a way that suits your taste. | |
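To make the "a harness is a small loop plus a few tools" point concrete, here is a rough sketch of that loop. It is an illustration only: the reply shape (`type`, `tool`, `args`, `text`), the tool names, and the `callModel` parameter are all hypothetical, and `callModel` stands in for whatever HTTP client talks to your provider of choice.

```javascript
const { execSync } = require("child_process");
const fs = require("fs");

// Tools the model may ask for. Names and argument shapes are hypothetical.
const tools = new Map([
  ["read_file", ({ path }) => fs.readFileSync(path, "utf8")],
  ["bash", ({ command }) => execSync(command, { encoding: "utf8" })],
]);

// Core loop: ask the model, run any tool it requests, feed the result back,
// and stop as soon as it answers in plain text.
// `callModel` is a caller-supplied async function (assumed, not a real API).
async function runAgent(callModel, userPrompt, maxSteps = 10) {
  const messages = [{ role: "user", content: userPrompt }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply });
    if (reply.type === "text") return reply.text;

    const tool = tools.get(reply.tool);
    let result;
    try {
      result = tool ? String(tool(reply.args)) : `unknown tool: ${reply.tool}`;
    } catch (err) {
      result = `tool error: ${err.message}`; // report failures to the model instead of crashing
    }
    messages.push({ role: "tool", content: result });
  }
  throw new Error("step limit reached");
}
```

Wrapping this in a second loop that reads user input gives an interactive session. Everything else (rendering, edit formats, permission prompts) is taste layered on top. Note that the `bash` tool here runs whatever the model asks for, so a real harness would want a confirmation step or a sandbox.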
| ▲ | abtinf 33 minutes ago | parent | prev | next [-] | | Here is a video I made explaining it from absolute basics: https://m.youtube.com/watch?v=_AgKuFGvJfI And the repo: https://github.com/abtinf/homunctor | |
| ▲ | kolinko an hour ago | parent | prev | next [-] | | It’s not that difficult, it’s just a system prompt and a set of basic file edit/bash/etc tools. Me, personally, I didn’t build it from scratch but I ported original CC from published sources into Python and extended it to match my own requirements. | | |
| ▲ | andai 35 minutes ago | parent [-] | | Are you using it with Claude? They only allow their own harness with the subs right? (And per-token billing is like 10x more expensive?) |
| |
| ▲ | yomismoaqui 31 minutes ago | parent | prev | next [-] | | Building something like this is the todo list of agents. I found this one easy to understand: https://ampcode.com/notes/how-to-build-an-agent | |
| ▲ | AJ007 20 minutes ago | parent | prev | next [-] | | The real question is when do you transition from building it with codex/CC to the harness itself. | |
| ▲ | echelon an hour ago | parent | prev [-] | | Why use a personal harness? You have to pay API pricing, which is far more costly. I'd either switch to GLM wholesale or just continue to use Opus within Claude Code as the blessed, subsidized path. | | |
| ▲ | JTbane 32 minutes ago | parent | next [-] | | I would guess it is to avoid model lock-in. | | |
| ▲ | echelon 5 minutes ago | parent [-] | | My question is still this - why not just use GLM at that point? The pricing of Opus outside of Claude Code is insane. The tokens cost too much outside of Anthropic's blessed path. |
| |
| ▲ | andai 34 minutes ago | parent | prev [-] | | I use GLM in my custom harness. It completes the same tasks at the same level of quality, except 8x faster and 8x cheaper. (Same goes for GPT!) I'm not sure how that's possible. I expected to get increased correctness for that order of magnitude (something something test-time compute!) but I am not getting it. |
|
| |
| ▲ | krupan an hour ago | parent | prev | next [-] | | Given the Anthropic shenanigans, do you trust the personal harness code it wrote for you? | | |
| ▲ | wolttam an hour ago | parent | next [-] | | It did not write it for me, I used it to add a feature I wanted. It's a pretty small and understandable codebase, in fact :) | |
| ▲ | MichaelZuo an hour ago | parent | prev [-] | | Does anyone know what’s gone wrong with Anthropic? They used to be a decently credible company with not-too-shady behaviour... I hope they can actually regain some credibility… | | |
| ▲ | hombre_fatal 32 minutes ago | parent | next [-] | | I don't think many people care that they are trying to detect resellers and distillation. It also doesn't seem very consistent to fixate on that while sending Anthropic everything about you via your day to day prompts, every line of the projects and environments you're working on at work, etc. Their credibility comes from having one of the best models. | | |
| ▲ | MichaelZuo 11 minutes ago | parent [-] | | This sounds similar to what people were saying regarding Microsoft when the shady tricks of consumer Windows 10 versions were revealed. …And then Windows 11 became even worse. |
| |
| ▲ | slowmovintarget 29 minutes ago | parent | prev | next [-] | | Their philosophy is what's gone wrong. It has some good effects on their models, like Claude seeking cooperation first. But the people behind the company have a typical "unconstrained" (in the Sowell vision sense) perspective that assumes that they know better, so they are righteous for attempting to control things (users, paying customers, their model outputs, their tool chain, the supposed deity they assume they will produce... etc.) | | |
| ▲ | pishpash 13 minutes ago | parent | next [-] | | Amodei world: pompous zealot with God complex Altman world: malfeasant nihilist with God complex | |
| ▲ | MichaelZuo 17 minutes ago | parent | prev [-] | | Yeah I guess there is a slight undertone that they are the superiors… with the rest of the tech world being the inferiors. But I hadn’t thought that as anything more than temporary flights of fancy. |
| |
| ▲ | AlexandrB an hour ago | parent | prev | next [-] | | They've only been around 5 years and have grown tremendously during that time. There's no stable reputation you can rely on yet. | |
| ▲ | imhoguy 39 minutes ago | parent | prev | next [-] | | Enshittification. Too big to.. upset the govt. | |
| ▲ | skeptic_ai an hour ago | parent | prev [-] | | They just show their true face. You’ve been lied to all this time. They were never “good”. | | |
| ▲ | MichaelZuo 44 minutes ago | parent [-] | | I used to interact with the LW crowd… and they were mostly not outright swindlers or scoundrels. (from what I could sense) I think it’s fair to say most had decent respectability. Anthropic hired heavily from that pool so it’s astonishing how it turned out. |
|
|
| |
| ▲ | tonmoy an hour ago | parent | prev [-] | | What models are you using? Aren’t you still dealing with some provider even if you are not using their binary | | |
|
|
| ▲ | LPisGood 2 hours ago | parent | prev | next [-] |
| This is very interesting. Combating resellers and distillation seems like a very difficult problem indeed. Interesting to me is that these techniques mentioned in the article are just like anti-observation techniques used by some of the more sophisticated malware out there, however defeating them is pretty trivial. |
| |
| ▲ | _alternator_ 2 hours ago | parent | next [-] | | Yes, defeating this is relatively easy, particularly for sophisticated actors. But it's hard to always defeat all of the tricks. Sort of like how it's expensive and hard and uncertain to defeat all of the tricks when forging money. Here's an example. Say you have your team use patched binaries. Then CC updates and requires a new patched binary with new tricks. You now have to have a team ready to analyze the binary and begin to address the tricks; meanwhile, unpatched code is now a fingerprint. If some researcher decides to update Claude on their own to access new features, they get fingerprinted. Defeating a single fingerprinting technique once is easy. Defeating all of the techniques all the time is hard. | | |
| ▲ | SubiculumCode an hour ago | parent | next [-] | | Not to mention, it isn't that hard for vendors to require updated code to run the product. Vendors do this all the time. | |
| ▲ | pishpash 24 minutes ago | parent | prev | next [-] | | Corporate surveillance malware on employee machines is also defeatable but most don't bother. | |
| ▲ | charcircuit an hour ago | parent | prev [-] | | Is it hard? Just ask AI if the update added any new fingerprinting vectors? | | |
| ▲ | _alternator_ an hour ago | parent [-] | | I'd love for you to try this and report back. My guess is that no models today will successfully run a binary analysis for fingerprinting without a lot of handholding. If you try to use Opus it will almost certainly decline (and fingerprint/ban you). | | |
| ▲ | charcircuit an hour ago | parent [-] | | Not with Claude Code, but I trivially had Opus scan other closed source software for fingerprinting, including native libraries that it called into. | | |
| ▲ | _alternator_ 36 minutes ago | parent [-] | | Can you share more details? I ask because my experience suggests that models still require a decent amount of expertise to use for binary analysis (largely inferring because of use on other tasks of this level). I would expect models to always find "something" when you ask for steganographic techniques in the code, but with an extremely high false positive rate. | | |
| ▲ | charcircuit 12 minutes ago | parent [-] | | I don't think the diffs between Claude releases are that big. The amount of code in a diff doing sketchy stuff like looking into the host environment is going to be pretty small and obvious for the model. You can do things like ask for what an update included that wasn't mentioned in the release notes and stuff like that. |
|
|
|
|
| |
| ▲ | mysterydip 2 hours ago | parent | prev [-] | | seems ironically like a similar problem of content owners trying to filter bot scrapers from legit users |
|
|
| ▲ | ryanisnan 27 minutes ago | parent | prev | next [-] |
| This is weird but, help me understand how this meaningfully impacts our exposure. I'm authenticated to Claude, so they already have the whole attribution thing solved. |
| |
|
| ▲ | tgtweak an hour ago | parent | prev | next [-] |
| None of this is surprising - they're trying to mask and relay when they detect known patterns of what looks like distillation attacks and client app copying/modification. The list obfuscation here is likely to prevent or make it difficult for those same adversaries to work around this or delete/null it out when making a bootleg copy. Cool reverse engineering/analysis report but if this is the extent of nefarious activity that came of it (trying to catch/mitigate chinese lab model distillations), that's kind of encouraging. |
|
| ▲ | andy99 9 minutes ago | parent | prev | next [-] |
| Would be very interested to run an eval suite with and without the flags and see if they degrade performance or otherwise modify it. Seems a plausible reason for it |
|
| ▲ | throwawayffffas an hour ago | parent | prev | next [-] |
| Claude code does feel very malwarey to be honest. They have been like that from the start. |
|
| ▲ | sigmoid10 an hour ago | parent | prev | next [-] |
| If they only collect the data for analysis I guess this is fine (they already get way more sensitive data from users anyways, so if privacy is your concern you've made the mistake many steps ago). The much more interesting question is if they directly act on this data in their API. For example by rate-limiting, compute-limiting or rerouting to weaker models. That might even be legally questionable. I would really like to see this as a follow-up analysis, but I guess it is way more difficult and will also cost quite a bit in tokens. |
| |
| ▲ | SubiculumCode an hour ago | parent | next [-] | | Would it be legally questionable, or actually complying with U.S. export law? | |
| ▲ | krupan 41 minutes ago | parent | prev | next [-] | | "If they only collect the data for analysis I guess this is fine" I think you missed the memo on how foolish this attitude is. It came out around the time Edward Snowden made his discoveries at the NSA public. I suggest you look into it | | |
| ▲ | sigmoid10 32 minutes ago | parent [-] | | As I said above, if you are worried about privacy while hooking up Claude Code, you need to reevaluate your understanding of this technology. |
| |
| ▲ | bakugo an hour ago | parent | prev [-] | | I've heard that it was possible to trigger really obvious output poisoning on Fable with something as basic as asking the model to think outside of its built-in hidden thinking delimiters. This watermark may trigger a similar mechanism. |
|
|
| ▲ | chvid 27 minutes ago | parent | prev | next [-] |
| (This sounds like a clumsy way of catching the Chinese that can easily be side-stepped.) Claude Code has more or less full access to the client computer. The server (that hosts the actual AI) can just go: execute this payload and tell me the result - otherwise I won't answer any further questions, or I'll re-route you to a stupider model. The payload could check for Chinese time-zones, scan for copies of the Little Red Book on the local hard-drive, or ping truth.social to see if it was behind the Great Firewall. |
|
| ▲ | port3000 an hour ago | parent | prev | next [-] |
| That's a lot of effort when they could just play a short video saying 'You wouldn't steal a car' instead |
|
| ▲ | fny an hour ago | parent | prev | next [-] |
| This was already discovered during the source map leak. > This is not a malicious feature, but it is a weird choice for a developer tool that asks for trust. They already tell you they scan for malicious prompts, and they have no ZDR guarantees for consumers. Why do signatures like this matter at all? |
| |
| ▲ | llelouch an hour ago | parent [-] | | There has been an anti anthropic propaganda push by bad actors across social media sites especially Reddit and twitter. This started a few months ago when anthropic started beating openai. | | |
| ▲ | zulban a minute ago | parent [-] | | Absolutely. Nothing makes me believe dead internet theory more than text threads discussing Anthropic and OpenAI. |
|
|
|
| ▲ | jacobgold 39 minutes ago | parent | prev | next [-] |
| > "That also means the client itself deserves scrutiny. If a coding agent can read your repo and run commands, the binary that ships it should be boring (for example, pi harness)" You're actually trusting your security to your harness AND model AND inference API provider in this scenario: https://jacob.gold/posts/why-i-wont-run-untrusted-models/ |
|
| ▲ | 100ms 2 hours ago | parent | prev | next [-] |
| What's the point of even trying to obfuscate this with such a simple method? Could at least have hidden the targeted features by storing their hashes or embedding a bloom filter or similar |
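As a rough illustration of the "store hashes instead of names" idea, here is a sketch using Node's built-in `crypto` module. The helper names (`sha256`, `suffixes`, `isListed`) and the two sample domains are illustrative assumptions; a shipped bundle would contain only the hex digests, which are computed inline here just to keep the example self-contained.

```javascript
const { createHash } = require("crypto");

// Hex-encoded SHA-256 of a string.
function sha256(text) {
  return createHash("sha256").update(text).digest("hex");
}

// Assumption: a real build would embed only these digests, never the plaintext.
// They are derived here from two sample domains so the sketch runs as-is.
const hashedDomains = new Set(["baidu.com", "moonshot.ai"].map(sha256));

// All dot-separated suffixes of a hostname:
// "a.b.com" yields ["a.b.com", "b.com", "com"].
function suffixes(host) {
  const labels = host.toLowerCase().split(".");
  return labels.map((_, i) => labels.slice(i).join("."));
}

// True when the host or any parent domain hashes to a listed digest.
function isListed(host) {
  return suffixes(host).some((s) => hashedDomains.has(sha256(s)));
}
```

This only raises the cost of casual inspection. Domain names are a small, guessable space, so anyone with a candidate list can hash it and recover the entries. It also does not cover substring-style keyword checks, which cannot be done by exact hash lookup. A Bloom filter would trade the digest list for a compact bit array at the price of some false positives.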
| |
| ▲ | ajb an hour ago | parent | next [-] | | In this case, this is probably not the only steganographic tattletale. Had a competitor pull something like this with a previous employer. They were supposed to be interoperating with a standard, but they had a secret steganographic handshake, which they used to pretend that competitors' products were unreliable (they had a first mover position in a smaller national market with specific requirements, so this wasn't shooting themselves in the foot). Our guys figured out the handshake and just silently implemented it. In this case, the competitor wasn't big enough to waste engineering time on multiple such hacks, but Anthropic have time (or Claude does). | |
| ▲ | gonzalohm 2 hours ago | parent | prev [-] | | The point is not raising red flags I guess | | |
| ▲ | kej an hour ago | parent [-] | | I love how well this comment works as a vexillology joke, even if it wasn't intended. |
|
|
|
| ▲ | dehrmann 37 minutes ago | parent | prev | next [-] |
| Anthropic must think that their moat isn't very large if they're this worried about distillation. |
| |
|
| ▲ | iqandjoke an hour ago | parent | prev | next [-] |
| It is about China detection. They seem to put a tracker on the email as well. |
|
| ▲ | an0malous 29 minutes ago | parent | prev | next [-] |
| Is this why Claude never knows what date and time it is right now? |
|
| ▲ | 827a 42 minutes ago | parent | prev | next [-] |
| This seems really, really stupid. Similar to the weird Zig runtime signature thing from a few months ago, it was bound to be discovered, quickly, and all the resellers have to do is find a new domain name that (checks notes) doesn't have the word DEEPSEEK in it. Like, seriously? Your goal was to identify resellers by checking if the proxy has the corporate name of one of your competitors in it? Is this amateur hour? All Anthropic has done is reduce trust, once again, with legitimate customers, while doing nothing to stop illegitimate customers. They need to get adults into key leadership roles, quickly. |
|
| ▲ | ahmedehab_01 an hour ago | parent | prev | next [-] |
| Frankly, I don't see this as the concerning behaviour the article describes.
It is fine to try to protect against distillation through a technique like this.
This will also allow them to, instead of blocking the distillation agents, respond with a poorer result/model, hindering the progress of distillation, momentarily at least. I would guess that's their first line of defense; they should have more techniques to identify distillation because that's a very simple way of detecting the host and can be easily spoofed. |
| |
| ▲ | applfanboysbgon an hour ago | parent [-] | | > This will also allow them to, instead of blocking the distillation agents, respond with a poorer result/model, i.e. this will allow them to literally commit fraud against paying customers | | |
| ▲ | SubiculumCode an hour ago | parent | next [-] | | 1st, this technique is not fraud, and fraud is a separate accusation. 2nd, paying customers can legally and legitimately be banned and monitored for breaking terms of service, which probably includes things like using the model against U.S. export restrictions. | | |
| ▲ | skeptic_ai an hour ago | parent | next [-] | | So if I change my timezone to Shanghai I deserve to get banned? Or get a shitty model instead of what I’m paying for? | | | |
| ▲ | applfanboysbgon an hour ago | parent | prev [-] | | Banning is completely different than charging for a service you're silently not providing. | | |
| |
| ▲ | ahmedehab_01 6 minutes ago | parent | prev | next [-] | | Do paying customers distill? Is it fraud to protect against distillers? | |
| ▲ | chadgpt3 an hour ago | parent | prev [-] | | That's what capitalism is all about, baby! Especially if the customers don't notice. |
|
|
|
| ▲ | bibimsz 7 minutes ago | parent | prev | next [-] |
| this is the one they wanted us to find |
|
| ▲ | Klonoar an hour ago | parent | prev | next [-] |
| If there weren't already enough tells that something is AI-generated, I guess you could add this to the list. |
|
| ▲ | MangoCoffee an hour ago | parent | prev | next [-] |
| The AI race right now is in a sad state. China's playbook is to release open-weight models and train them on its own chips. Anthropic pushes fear and control. But the only way to win is by innovating. China is flooding the market with cheap, good enough models, while the U.S. is building a Chinese firewall. |
|
| ▲ | a_c an hour ago | parent | prev | next [-] |
| It piqued my interest. I think I’ve found a weekend project |
|
| ▲ | SaaShack26 39 minutes ago | parent | prev | next [-] |
| I use it too |
|
| ▲ | mosfets an hour ago | parent | prev | next [-] |
| I clicked the link to learn what steganography means... |
| |
| ▲ | LoganDark 14 minutes ago | parent [-] | | Steganography is, essentially, hiding information within another message, such that it's not readily apparent that the message contains the information. |
|
|
| ▲ | felipelalli 44 minutes ago | parent | prev | next [-] |
| Ridiculous. |
|
| ▲ | ductsurprise an hour ago | parent | prev | next [-] |
| Is it just a minified localization(l10n) function maybe? |
|
| ▲ | hhh 2 hours ago | parent | prev | next [-] |
| Cool fingerprinting avenue. |
|
| ▲ | phendrenad2 an hour ago | parent | prev | next [-] |
| Non-hugged: https://archive.is/Wdhp0 |
|
| ▲ | bitlad 38 minutes ago | parent | prev | next [-] |
| Silicon valley season 6 was on point. |
|
| ▲ | love0972 an hour ago | parent | prev | next [-] |
| Is that really how it is? How will this affect our future? |
|
| ▲ | grayhatter an hour ago | parent | prev | next [-] |
| Here's the sha of the prompt I submitted... no I don't know why there are no saved prompts with that sha. What do you mean you don't know where the bug is coming from? No, I absolutely didn't make it up, how could you accuse me of that? Does anyone know why this regex isn't working? I double checked it 27 times, I even asked the LLM. They all say this regex should be finding these dates. Weird, suddenly all the conversations are breaking when I feed them into this other tool? Something about UTF-8 errors, but I'm sure I'm only using ASCII? I do try to take care to make sure the things I build can be used by other people even when they care about different things. I care about understandability, determinism (as it relates to computing), and repeatability (because I want to be able to trust the systems I use). If y'all would be willing to try to account for use cases of others, and try not to break them... that would be nice. Please note that, generally, when you modify something that belongs to someone else without telling them... things should be expected to break. |
|
| ▲ | ajross an hour ago | parent | prev | next [-] |
| Headline is, frankly, awful. This isn't the AI secretly doing stuff and hiding it. This is the very human Anthropic engineers trying to detect Chinese scraping via some frankly hamfisted and unimaginative URL trickery. |
| |
| ▲ | krupan an hour ago | parent | next [-] | | I didn't assume it was the AI, just that some part of the overall Claude Code product was doing this. I didn't assume the feature was added to Claude Code without human oversight. If it was added by Claude-the-AI itself without the humans prompting it to I would still hold the humans at Anthropic responsible. Does that make you feel better? | |
| ▲ | LoganDark 9 minutes ago | parent | prev [-] | | The AI is Claude. Claude Code is the harness. |
|
|
| ▲ | theplumber 2 hours ago | parent | prev [-] |
| The more I learn about Anthropic the more they disgust me. Fingers crossed for all the companies from their “ban list” |
| |
| ▲ | conception 2 hours ago | parent [-] | | Which AI company have you learned more about where you liked them more as more details came out? | | |
|