| ▲ | Evaluating chain-of-thought monitorability(openai.com) | |
| 22 points by mfiguiere 3 days ago | 2 comments | ||
| ▲ | ursAxZA 5 minutes ago | parent | next [-] | |
I might be missing something here as a non-expert, but isn’t chain-of-thought essentially asking the model to narrate what it’s “thinking,” and then monitoring that narration? That feels closer to injecting a self-report step than observing internal reasoning. | ||
| ▲ | ramoz an hour ago | parent | prev [-] | |
> Our expectation is that combining multiple approaches—a defense-in-depth strategy—can help cover gaps that any single method leaves exposed. Implement hooks in codex then. | ||