Fabricio20 a day ago
One thing I see no one asking: is this not a case of optimization? Hidden reasoning means they don't need to process the output of all that; it stays internal within the model. Less cost for them -> less cost for us (even if they benefit more), compared to streaming all of those reasoning tokens out?
j4k0bfr a day ago
My understanding was that thinking still gets encrypted, shared with clients, and reingested by Anthropic with each new prompt [1], which means it would cost more than normal tokens, since it has to be decrypted/encrypted with every transaction.

[1] https://blog.cryptographyengineering.com/2026/05/29/fooling-...

Edit: other comments under this post seem to indicate that thinking tokens are cached on the server side as well? I'm a bit confused.