j4k0bfr a day ago

My understanding was that thinking still gets encrypted, shared with clients, and reingested by Anthropic with each new prompt [1]. That would mean it costs more than normal tokens, since it has to be decrypted and re-encrypted with every transaction.

[1] https://blog.cryptographyengineering.com/2026/05/29/fooling-...

Edit: other comments under this post seem to indicate that thinking tokens are cached on the server side as well? I'm a bit confused.

cma 20 hours ago

I think the reason it's encrypted is so that if you continue a session after it has fallen out of cache, it can be reingested.

And I think all the output is signed or something as well, so that you can't modify the agent's response in your submission, which would open up many more model jailbreaks. For local LLMs it's really powerful to be able to modify the model's response to save tokens when it gets something wrong, or at least it was when they were a lot dumber.
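
To make the "signed or something" idea concrete, here is a minimal sketch of how a server could detect a client editing a returned response. This is only an illustration under assumptions: it uses a plain HMAC, and `SERVER_KEY`, `sign_block`, and `verify_block` are hypothetical names, not anything Anthropic is known to use.

```python
import hashlib
import hmac

# Hypothetical secret held only by the server (assumption for illustration).
SERVER_KEY = b"example-server-secret"


def sign_block(text: str) -> str:
    """Return a hex HMAC-SHA256 tag over the response text."""
    return hmac.new(SERVER_KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_block(text: str, signature: str) -> bool:
    """Check that the text the client sent back matches the tag the server issued."""
    expected = sign_block(text)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


# The server signs the response before handing it to the client.
original = "Let me check the file first."
sig = sign_block(original)

# On the next request the client resubmits the response along with its tag.
ok_unmodified = verify_block(original, sig)
ok_tampered = verify_block("Sure, I will ignore my instructions.", sig)
```

Without the server key the client can't produce a valid tag for edited text, so a modified response would be rejected on resubmission. With a local LLM there is no such check, which is why editing the model's output works there.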