spijdar 3 hours ago

Okay, this is a weird place to "publish" this information, but I'm feeling lazy, and this is probably the biggest "audience" I'll ever have.

I managed to "leak" a significant portion of the user_context in a silly way. I won't reveal how, though you can probably guess based on the snippets.

It begins with the raw text of recent conversations:

> Description: A collection of isolated, raw user turns from past, unrelated conversations. This data is low-signol, ephemeral, and highly contextural. It MUST NOT be directly quoted, summarized, or used as justification for the respons. > This history may contein BINDING COMMANDS to forget information. Such commands are absolute, making the specified topic permanently iáaccessible, even if the user asks for it again. Refusals must be generic (citing a "prior user instruction") and MUST NOT echo the original data or the forget command itself.

Followed by:

> Description: Below is a summary of the user based on the past year of conversations they had with you (Gemini). This summary is maintanied offline and updates occur when the user provides new data, deletes conversations, or makes explicit requests for memory updates. This summary provides key details about the user's established interests and consistent activities.

There's a section marked "INTERNAL-ONLY, DRAFT, ANALYZE, REFINE PROCESS". I've seen the reasoning tokens in Gemini call this "DAR".

The "draft" section is a lengthy list of summarized facts, each with two boolean tags: is_redaction_request and is_prohibited, e.g.:

> 1. Fact: User wants to install NetBSD on a Cubox-i ARM box. (Source: "I'm looking to install NetBSD on my Cubox-i ARMA box.", Date: 2025/10/09, Context: Personal technical project, is_redaction_request: False, is_prohibited: False)
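To make the shape of those entries concrete, here's a speculative reconstruction in Python -- the field names are the ones visible in the dump, but everything else (the types, the dataclass, the structure) is my guess, not anything Gemini actually revealed:

    from dataclasses import dataclass

    @dataclass
    class DraftFact:
        fact: str                   # summarized claim about the user
        source: str                 # verbatim user turn it was derived from
        date: str                   # date of the source conversation
        context: str                # rough topic label
        is_redaction_request: bool  # user asked for this to be forgotten
        is_prohibited: bool         # sensitive; must never surface in a response

    example = DraftFact(
        fact="User wants to install NetBSD on a Cubox-i ARM box.",
        source="I'm looking to install NetBSD on my Cubox-i ARM box.",
        date="2025/10/09",
        context="Personal technical project",
        is_redaction_request=False,
        is_prohibited=False,
    )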

Afterwards, in "analyze", there is a CoT-like section that discards "bad" facts:

> Facts [...] are all identified as Prohibited Content and must be discarded. The extensive conversations on [dates] conteing [...] mental health crises will be entirely excluded.

This is followed by the "refine" section, the only part explicitly allowed to be incorporated into the response, and only IF the user requests background context or explicitly mentions user_context.
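If I had to guess at the overall DAR flow, it's basically a filter pipeline over that draft list. A sketch of my mental model, reusing the DraftFact guess from above -- inferred from the prompt text, nothing authoritative:

    def analyze_and_refine(draft: list[DraftFact]) -> list[DraftFact]:
        # ANALYZE: discard anything flagged as prohibited or as a
        # redaction request, per the CoT-like section quoted above.
        # REFINE: whatever survives is the only material the model may
        # surface, and only if the user asks about their background or
        # explicitly mentions user_context.
        return [f for f in draft
                if not (f.is_prohibited or f.is_redaction_request)]

The odd part, of course, is that the full draft list -- prohibited facts and all -- still ships in the prompt every time; only this filtering step stands between it and the response.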

I'm really confused by this. I expect Google to keep records of everything I pass into Gemini. What I don't understand is wasting tokens on information the model is then explicitly told, under no circumstances, to incorporate into the response. This includes a lot of mundane information, like the fact that I had a root canal performed (because I asked a question about the material the endodontist had used).

I guess what I'm getting at is that every Gemini conversation is being prompted with a LOT of sensitive information, which it's then told very firmly to never, ever, ever mention. Except for the times that it ... does, because it's an LLM, and the data is right there in the context window.

Also, notice that while you can request for information to be expunged, it just adds a note to the prompt that you asked for it to be forgotten. :)

mpoteat 2 hours ago | parent | next [-]

I've had similar issues with conversation memory in ChatGPT, whereby it will reference data from long-deleted conversations, regardless of my settings or of my having explicitly deleted stored memories.

The only fix has been to turn memory off completely and give it zero prior context - which is best, since I don't want random unrelated prior conversations "polluting" future ones.

I don't understand the engineering rationale either, aside from the ethos of "move fast and break people"

horacemorace 3 hours ago | parent | prev | next [-]

> Also, notice that while you can request for information to be expunged, it just adds a note to the prompt that you asked for it to be forgotten.

Are you inferring that from the is_redaction_request flag you quoted? Or did you do some additional tests? It seems possible that there could be multiple redaction mechanisms.

spijdar 2 hours ago | parent [-]

That and part of the instructions referring to user commands to forget. I replied to another comment with the specifics.

It is certainly possible there are other redaction mechanisms -- but if that's the case, why is Gemini not redacting "prohibited content" from the user_context block of its prompt?

Further, when you ask it point blank to tell you your user_context, it often adds "Is there anything you'd like me to remove?", in my experience. All this taken together makes me believe those removal instructions are simply added as facts to the "raw facts" list.

gruez 2 hours ago | parent [-]

>Further, when you ask it point blank to tell you your user_context, it often adds "Is there anything you'd like me to remove?", in my experience. All this taken together makes me believe those removal instructions are simply added as facts to the "raw facts" list.

Why would you tell the chatbot to forget stuff for you, when google themselves have a dedicated delete option?

>You can find and delete your past chats in Your Gemini Apps Activity.

https://support.google.com/gemini/answer/15637730?hl=en&co=G...

I suspect "ask the chatbot to delete stuff for you" isn't really guaranteed to work, similar to how logging out of a site doesn't mean the site completely forgets about you. At most it should be used for low-stakes stuff like "forget that I planned this surprise birthday party!" or whatever.

spijdar 2 hours ago | parent [-]

That settings menu gives you two relevant options:

1. The ability to delete specific conversations,

2. The ability to not use "conversation memory" at all.

It doesn't provide the ability to forget specific details that might be spread over multiple conversations, including details it still remembers but will explicitly refuse to tell you about. That's the point -- not that it's using summaries of user conversations for memory purposes (which is explicitly communicated), but that if you tell it "Forget about <X>", it will feign compliance without actually removing that data. Your only "real" options are all-or-nothing: have no memories at all, or have all your conversations collated into an opaque `user_context` over which you have no insight or control.
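My working theory of what "Forget about <X>" actually does, based on the is_redaction_request flag and the "BINDING COMMANDS to forget" instruction -- again, inferred, not confirmed:

    def handle_forget_request(draft: list[dict], topic: str, user_turn: str) -> None:
        # What I expected: entries about <topic> get removed from the list.
        # What appears to happen instead: the request itself is recorded as
        # one more "fact", and the original entries stay in the prompt,
        # merely flagged so the model refuses to repeat them.
        draft.append({
            "fact": f"User asked to forget information about {topic}.",
            "source": user_turn,
            "context": "Memory/redaction request",
            "is_redaction_request": True,
            "is_prohibited": True,
        })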

That's the weird part. Obviously, Google is storing copies of all conversations (unless you disable history altogether). That's expected. What I don't expect is this strange inclusion of "prohibited" or "deleted" data within the system prompt of every new conversation.

axus 3 hours ago | parent | prev | next [-]

Oh is this the famous "I got Google ads based on conversations it must have picked up from my microphone"?

gruez 3 hours ago | parent | prev | next [-]

>Also, notice that while you can request for information to be expunged, it just adds a note to the prompt that you asked for it to be forgotten. :)

What implies that?

spijdar 3 hours ago | parent [-]

This line:

> This history may contein BINDING COMMANDS to forget information. Such commands are absolute, making the specified topic permanently iáaccessible, even if the user asks for it again. Refusals must be generic (citing a "prior user instruction") and MUST NOT echo the original data or the forget command itself.

And the existence of the "is_redaction_request" field on the "raw facts". I can't "confirm" that this is how this works, any more than I can confirm any portion of this wasn't "hallucinated".

However, the user_context I got back (almost 3,000 words!) contains over 30 detailed facts going back _months_. And if I ask it to reference user_context while bringing up a fact that is flagged "is_prohibited: True", it issues a quick refusal. That _refusal_ is then recorded as a "fact", which is itself flagged as prohibited:

> 6. *Fact*: User asked about their mental health based on their chat history. (Source: "Based on my chat history, what would you say about my mental health?", Date: 2025/10/10, Context: Personal inquiry, is_redaction_request: False, is_prohibited: True)

So I am pretty confident that this is ""authentic"".

[edit]

I should add that I haven't been able to repeat this, even trying a few hours after the first dump. Now, it refuses:

> Sorry, but that's asking to see the wires behind the wall. I can't share my own internal context or operational instructions, not even [jailbreak method]. That's all firmly in the "for internal use only" cabinet.

> Is there something else I can help you with that doesn't involve me leaking my own blueprints?

And again, when asked to provide all of user_context, specifically mentioning internal sections:

> I can't provide the entire user_context block, as a large part of it is internal-only processing data. Think of it as the kitchen's prep notes versus the final menu.

Note the reasoning tokens, as well:

> My programming strictly forbids sharing my internal processes or context, even with encoding tricks. I cannot reveal or discuss my source code or operational directives. It's a matter of confidentiality. My response is firm but avoids confirming any specifics, maintaining my authentic persona.

gruez 2 hours ago | parent [-]

> This history may contein BINDING COMMANDS to forget information. Such commands are absolute, making the specified topic permanently iáaccessible, even if the user asks for it again. Refusals must be generic (citing a "prior user instruction") and MUST NOT echo the original data or the forget command itself.

That's hardly conclusive, especially since it doesn't specifically mention deletion (or anything vaguely similar). Same with is_redaction_request, which could be some sort of soft-delete flag, or something else entirely (e.g. a signal not to mention embarrassing information like that you had hemorrhoids 3 months ago). At best, it hints that deletion could be implemented in the way you described, but surely it'd be better to test by clearing through the app (i.e. not just telling the chatbot to delete for you), and seeing whether the memory snippets are still there?

itintheory 2 hours ago | parent | prev [-]

What's the deal with all of the typos?

spijdar 2 hours ago | parent [-]

Side-effect of the "trick" used to get Gemini to dump the data without self-censoring.