▲ | justusm 2 days ago | |
nice! Training models using reward signals for code correctness is obviously very common; I'm very curious to see how good things can get using a reward signal obtained from visual feedback | ||
▲ | grace77 2 days ago | parent [-] | |
As are we, seems like the natural next step |