FiberBundle 6 hours ago
I'm kind of sceptical about the altruistic motives here. Giving this to open source maintainers also solves the problem of identifying high-quality feedback/rewards for their RLVR models. With everybody using Claude Code, it might be difficult for them to find a robust way to tell good reward signal apart from mediocre or below-average feedback.