| ▲ | ehsanu1 5 hours ago | |
They aren't necessarily "stored" but they are part of the response content. They are referred to as reasoning or thinking blocks. The big 3 model makers all have this in their APIs, typically in an encrypted form. Reconstruction of reasoning from scratch can happen in some legacy APIs like the OpenAI chat completions API, which doesn't support passing reasoning blocks around. They specifically recommend folks to use their newer esponses API to improve both accuracy and latency (reusing existing reasoning). | ||