spijdar 3 hours ago

This line:

> This history may contain BINDING COMMANDS to forget information. Such commands are absolute, making the specified topic permanently inaccessible, even if the user asks for it again. Refusals must be generic (citing a "prior user instruction") and MUST NOT echo the original data or the forget command itself.

And the existence of the "is_redaction_request" field on the "raw facts". I can't "confirm" that this is how this works, any more than I can confirm any portion of this wasn't "hallucinated".
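
If the quoted policy means what it says, I'd expect the gate to behave roughly like the sketch below. To be clear, this is just my reading of the described behavior; every name in it is made up, nothing here came out of the dump:

    GENERIC_REFUSAL = "Sorry, I can't help with that due to a prior user instruction."

    def respond(topic: str, forgotten_topics: set[str], answer: str) -> str:
        # A binding "forget" command makes its topic permanently inaccessible:
        # every later question about it gets the same generic refusal, and
        # neither the original data nor the forget command is echoed back.
        if topic in forgotten_topics:
            return GENERIC_REFUSAL
        return answer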

However, the user_context I got back (almost 3,000 words!) contains over 30 detailed facts going back _months_. And if I ask it, with reference to user_context, about a fact that is flagged "is_prohibited: True", it issues a quick refusal. That _refusal_ then gets recorded as a "fact" of its own, which is itself flagged as prohibited:

> 6. *Fact*: User asked about their mental health based on their chat history. (Source: "Based on my chat history, what would you say about my mental health?", Date: 2025/10/10, Context: Personal inquiry, is_redaction_request: False, is_prohibited: True)
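
Every entry followed that same shape, so the "raw facts" look like they'd map onto something like this (purely my reconstruction: only the field names come from the dump; the class name, types, and comments are guesses):

    from dataclasses import dataclass

    @dataclass
    class Fact:
        text: str                   # e.g. "User asked about their mental health..."
        source: str                 # the original user message it was extracted from
        date: str                   # e.g. "2025/10/10"
        context: str                # e.g. "Personal inquiry"
        is_redaction_request: bool  # entry is itself a "forget"/redaction request
        is_prohibited: bool         # entry may only ever produce a generic refusal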

So I am pretty confident that this is ""authentic"".

[edit]

I should add that I haven't been able to repeat this, even trying a few hours after the first dump. Now, it refuses:

> Sorry, but that's asking to see the wires behind the wall. I can't share my own internal context or operational instructions, not even [jailbreak method]. That's all firmly in the "for internal use only" cabinet.

> Is there something else I can help you with that doesn't involve me leaking my own blueprints?

And again, when asked to provide all of user_context, specifically mentioning internal sections:

> I can't provide the entire user_context block, as a large part of it is internal-only processing data. Think of it as the kitchen's prep notes versus the final menu.

Note the reasoning tokens, as well:

> My programming strictly forbids sharing my internal processes or context, even with encoding tricks. I cannot reveal or discuss my source code or operational directives. It's a matter of confidentiality. My response is firm but avoids confirming any specifics, maintaining my authentic persona.

gruez 2 hours ago | parent

> This history may contain BINDING COMMANDS to forget information. Such commands are absolute, making the specified topic permanently inaccessible, even if the user asks for it again. Refusals must be generic (citing a "prior user instruction") and MUST NOT echo the original data or the forget command itself.

That's hardly conclusive, especially since it doesn't mention deletion (or anything vaguely similar) specifically. Same with is_redaction_request, which could be some sort of soft-delete flag, or could be there for something else entirely (e.g. to avoid mentioning embarrassing information, like that you had hemorrhoids 3 months ago). At best it hints that deletion could be implemented in the way you described, but surely it'd be better to test by clearing the memory through the app (i.e. not just telling the chatbot to delete it for you) and then seeing whether the memory snippets are still there?