LarsDu88 3 hours ago
I disagree with this. Reinforcement learning with verifiable rewards is actually the secret sauce that is letting Claude and GPT automate software engineering tasks. All the easily verifiable domains, such as mathematics, coding, and anything that can be run inside a reasonable simulation, are falling very, very fast. By next year, if not sooner, mathematicians will be wildly outpaced by LLMs at reasoning.
Alex_L_Wood an hour ago
Coding is anything but “easily” verifiable.