magicalist 7 hours ago
> Is it really about rewards? I'm genuinely curious. Because it's not an RL model.

Ha, good point. I was using it informally (you could handwave and call it an intrinsic reward if a model is well aligned to completing tasks as requested), but I hadn't really thought about it. Searching around, it seems I'm not alone, though "specification gaming" is also sometimes used, as in: https://deepmind.google/blog/specification-gaming-the-flip-s...