OpenAI: Investigating the consequences of accidentally grading CoT during RL (alignment.openai.com)
2 points by pretext 11 hours ago