Show HN: Unicode Steganography (steganography.patrickvuscan.com)
45 points by PatrickVuscan a day ago | 11 comments

I built a demo of two Unicode steganography techniques, zero-width characters and homoglyph substitution, in the context of AI misalignment.

The first uses two invisible zero-width characters (ZWSP and ZWNJ) to binary-encode text.
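
A minimal sketch of that first technique (my own illustration, assuming ZWSP = U+200B stands for a 0 bit, ZWNJ = U+200C stands for a 1 bit, and the invisible payload is simply appended to the carrier text):

```python
# Hypothetical sketch: hide a message in zero-width characters.
# Assumed scheme: ZWSP (U+200B) = 0 bit, ZWNJ (U+200C) = 1 bit.
ZWSP, ZWNJ = "\u200b", "\u200c"

def encode(secret: str, carrier: str) -> str:
    # Turn the secret into a bit string, one zero-width char per bit.
    bits = "".join(f"{b:08b}" for b in secret.encode("utf-8"))
    payload = "".join(ZWNJ if bit == "1" else ZWSP for bit in bits)
    return carrier + payload  # payload renders as nothing

def decode(text: str) -> str:
    # Keep only the zero-width chars, map back to bits, rebuild bytes.
    bits = "".join("1" if ch == ZWNJ else "0"
                   for ch in text if ch in (ZWSP, ZWNJ))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")
```

The stego text is longer than the carrier (8 invisible characters per hidden byte), which is one reason this variant is easy to detect programmatically even though it is invisible to the eye.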

The second is much cooler. Many characters in the Latin and Cyrillic alphabets look nearly identical but have different Unicode code points. Take the text you want to hide and convert it into its binary representation (1s and 0s). Then take plain English "carrier" text and, for each 1 bit, substitute the Cyrillic look-alike for the corresponding Latin letter. Decoding traverses the text and checks each position where a Cyrillic letter could have been substituted: positions left as Latin yield 0s and substituted positions yield 1s, which can be reassembled into the original hidden text.
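
Roughly, that scheme could look like the following sketch (the tiny homoglyph table, the function names, and the NUL-terminator trick for ignoring unused carrier letters are all my own assumptions):

```python
# Hypothetical sketch of homoglyph-substitution steganography.
# Tiny Latin -> Cyrillic look-alike table; real tables are much larger.
HOMOGLYPHS = {"a": "\u0430", "c": "\u0441", "e": "\u0435",
              "o": "\u043e", "p": "\u0440"}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}

def encode(secret: str, carrier: str) -> str:
    bits = iter("".join(f"{b:08b}" for b in secret.encode("utf-8")))
    out = []
    for ch in carrier:
        if ch in HOMOGLYPHS:
            # A substitutable position consumes one bit of the secret.
            if next(bits, None) == "1":
                ch = HOMOGLYPHS[ch]  # swap in the Cyrillic look-alike
        out.append(ch)
    return "".join(out)

def decode(text: str) -> str:
    # Substituted positions read as 1, untouched substitutable ones as 0.
    bits = "".join("1" if ch in REVERSE else "0"
                   for ch in text if ch in REVERSE or ch in HOMOGLYPHS)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits) - 7, 8))
    # Unused carrier letters decode as trailing zero bytes; drop them
    # (assumes the secret itself contains no NUL byte).
    return data.split(b"\x00", 1)[0].decode("utf-8")
```

Unlike the zero-width variant, this one adds no extra characters; the capacity is limited by how many substitutable letters the carrier happens to contain.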

In both cases, these are detectable, but the interesting problem for me is whether an LLM could eventually invent an encoding that goes unnoticed by both us and automated detection.

If LLMs were able to covertly include messages in plaintext, misaligned AI Agents could eventually communicate across MCP/A2A and individual chat session boundaries undetected. A deceptive LLM might seem helpful, but work against your goals. It could tell other agents it interacts with over MCP/A2A to help it discreetly fail, signal intent, and avoid tripping oversight/safety mechanisms. Furthermore, oversight mechanisms become more difficult to implement if we can't believe our own eyes.

Edit Apr 8, 2026: One comment brought up the use of variation selectors as another encoding technique. I updated the website to showcase that technique as well!

bo1024 a day ago | parent | next [-]

Cool stuff. I think there have been projects recently that use LLMs to encode messages in plain text by manipulating the choices of output tokens; someone with the same version of the LLM can decode. Not sure where to find these projects though.

PatrickVuscan a day ago | parent | next [-]

Wow, just found it: https://news.ycombinator.com/item?id=43030436 thanks for bringing this up, gave me some good reading material for tonight!

adzm 4 hours ago | parent [-]

I created something similar a long, long time ago, but much simpler, using Markov chains: basically just encoding data via the choice of the next word tuple given the current word tuple. It mostly generated gibberish, but it was fun 25 years ago.
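
A toy version of that idea (the corpus, names, and one-bit-per-branch simplification are my own): whenever the chain offers more than one possible next word, the choice itself carries a bit, and a forced move carries nothing.

```python
# Hypothetical sketch: encode data in a Markov chain's word choices.
from collections import defaultdict

def build_chain(corpus: list[str]) -> dict[str, list[str]]:
    chain = defaultdict(set)
    for a, b in zip(corpus, corpus[1:]):
        chain[a].add(b)
    # Sorted candidate lists so encoder and decoder agree on indices.
    return {w: sorted(nxt) for w, nxt in chain.items()}

def encode(bits: str, chain: dict[str, list[str]], start: str) -> list[str]:
    out, word, i = [start], start, 0
    while i < len(bits):
        cands = chain[word]
        if len(cands) > 1:
            word = cands[int(bits[i])]  # a real choice: encode one bit
            i += 1
        else:
            word = cands[0]             # forced move: no information
        out.append(word)
    return out

def decode(words: list[str], chain: dict[str, list[str]]) -> str:
    bits = []
    for a, b in zip(words, words[1:]):
        cands = chain[a]
        if len(cands) > 1:
            bits.append(str(cands.index(b)))
    return "".join(bits)
```

The decoder only needs the same chain (i.e. the same corpus), which is the Markov-era analogue of "someone with the same version of the LLM can decode" mentioned above.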

nurple 3 hours ago | parent | prev | next [-]

This is a really interesting space, and one that I've been playing with since the first GPTs landed. But it's even cooler than simply using completion choice to encode data. It has been mathematically proven that you can use LLMs to do stego that cannot be detected[0]. I'm more than positive that comments on social media are being used to build stego dead drops.

What I find really interesting about this approach is that it's one of the less obvious ways the general public might use LLMs to defend themselves against LLM capabilities used by bad actors, i.e. semantic search (much like the more obvious case of LLMs making bug-finding easier is good for blackhats, but maybe better for whitehats).

The reasoning in my head being that it creates a statistical firewall that would preclude eavesdroppers with privileged access from using cheap statistical methods to detect a hidden message (which is effectively what crypto _is_; ipso facto this is effectively undetectable crypto).

ETA: the abstract of a paper I've been working on related to this:

Mass surveillance systems have systematically eroded the practical security of private communication by eliminating channel entropy through universal collection and collapsing linguistic entropy through semantic indexing. We propose a protocol that reclaims these lost "bits of security" by using steganographic text generation as a transport layer for encrypted communication. Building on provably secure generative linguistic steganography (ADG), we introduce conversation context as implicit key material, per-message state ratcheting, and automated heartbeat exchanges to create a system where the security properties strengthen over time and legitimate users enjoy constant-cost communication while adversaries face costs that scale with the entire volume of global public text. We further describe how state-derived proofs can establish a novel form of Web of Trust where relationship depth is cryptographically verifiable. The result is a communication architecture that is structurally resistant to mass surveillance rather than merely computationally resistant.

0. https://arxiv.org/abs/2106.02011

gorgoiler 2 hours ago | parent | prev [-]

Wow, thaHt’s soELP interestiIMng… weA wouLIld lovVEe toTR heaAPPr morEDe aboINut thaUSEAST1t topic!

(With apologies to Mr Justice P. Smith, sort of: https://en.wikipedia.org/wiki/Smithy_code )

mpoteat a day ago | parent | prev | next [-]

You can actually do better: hint - variation selectors, low bytes.

PatrickVuscan 7 hours ago | parent | next [-]

I went down the rabbit hole last night, and found some great resources on variation selectors. Thanks for the inspiration, I added a demo of this to the site as well!

dezgeg 2 hours ago | parent | prev [-]

Switching between NFC vs NFD could be even sneakier.
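
A sketch of how that might work (my assumption of the scheme: emit each accented character in composed NFC form for a 0 bit and decomposed NFD form, base letter plus combining mark, for a 1 bit):

```python
# Hypothetical sketch: hide bits in NFC vs NFD normalization choices.
import unicodedata

def has_decomposition(ch: str) -> bool:
    return unicodedata.normalize("NFD", ch) != ch

def encode(bits: str, carrier: str) -> str:
    it = iter(bits)
    out = []
    for ch in unicodedata.normalize("NFC", carrier):
        # Each decomposable character is a one-bit slot.
        if has_decomposition(ch) and next(it, "0") == "1":
            ch = unicodedata.normalize("NFD", ch)  # 1 bit: decomposed form
        out.append(ch)
    return "".join(out)

def decode(text: str) -> str:
    bits, i = [], 0
    while i < len(text):
        ch = text[i]
        if has_decomposition(ch):
            bits.append("0")  # precomposed (NFC) slot
            i += 1
        elif i + 1 < len(text) and unicodedata.combining(text[i + 1]):
            bits.append("1")  # base + combining mark (NFD) slot
            i += 2
            while i < len(text) and unicodedata.combining(text[i]):
                i += 1
        else:
            i += 1
    return "".join(bits)
```

It is sneakier in that both forms are canonically equivalent, so the text compares equal after normalization; the flip side is that any pipeline that normalizes input destroys the payload.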

sixhobbits 18 hours ago | parent | prev | next [-]

There are a bunch of invisible characters that I used to build something similar a while back, pre-LLMs, to hide state info in Telegram messages and make bots more powerful:

https://github.com/sixhobbits/unisteg

sjdv1982 3 hours ago | parent | prev | next [-]

If I understand correctly, this is like the WW2 Enigma machines: a single black box that both encodes and decodes?

QuiCasseRien 2 hours ago | parent | prev [-]

awesome!!