beepbooptheory 3 hours ago

I get that this is a truism now, but I never really understood why it would be useful to scrape cc/codex sessions for training. The relative amount of human input in those sessions is so low (isn't that why they are so loved and used?), so how could it actually be useful to them? Wouldn't you want to focus on people not using it?

swiftcoder 2 hours ago | parent | next [-]

It's more useful as feedback on the model's results. You can run sentiment analysis on the user responses to see whether they found the results useful, frustrating, etc., and use that to guide future training.
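
A rough sketch of that idea, assuming a session is just a list of `{"role", "content"}` dicts. The cue lists and the names `label_feedback` and `score_session` are made up for illustration; a real pipeline would presumably use a trained classifier rather than keyword matching.

```python
# Hypothetical cue lists, for illustration only.
POSITIVE = ("thanks", "perfect", "works", "great")
NEGATIVE = ("wrong", "broke", "still failing", "doesn't work", "revert")

def label_feedback(user_reply: str) -> int:
    """Return +1, -1, or 0 as a crude signal about the previous model turn."""
    text = user_reply.lower()
    pos = sum(cue in text for cue in POSITIVE)
    neg = sum(cue in text for cue in NEGATIVE)
    # Sign of (pos - neg): +1 if mostly positive, -1 if mostly negative.
    return (pos > neg) - (neg > pos)

def score_session(turns):
    """Pair each assistant turn with the label of the user reply that follows it."""
    scored = []
    for prev, nxt in zip(turns, turns[1:]):
        if prev["role"] == "assistant" and nxt["role"] == "user":
            scored.append((prev["content"], label_feedback(nxt["content"])))
    return scored
```

The point is only that the user's next message acts as a cheap label for the model's previous output.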

vb-8448 2 hours ago | parent | prev [-]

Because you provide them with the "problem" and the "solution" and once you have both you can scale your RL pipeline.
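
As a minimal sketch of what "problem" and "solution" could mean here, assuming the same hypothetical `{"role", "content"}` turn format: treat the first user message as the task and the last assistant message as the outcome. `Episode` and `to_episode` are illustrative names, not part of any real pipeline.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    problem: str   # the user's initial request
    solution: str  # the last assistant output in the session

def to_episode(turns):
    """Build a (problem, solution) pair, or return None if either half is missing."""
    problem = next((t["content"] for t in turns if t["role"] == "user"), None)
    solution = next(
        (t["content"] for t in reversed(turns) if t["role"] == "assistant"), None
    )
    if problem is None or solution is None:
        return None
    return Episode(problem, solution)
```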