half-kh-hacker 5 hours ago
How does post-training via reinforcement learning factor in? Does every evaluated judgement count as 'the training data'?
abcde666777 4 hours ago
I guess I'd place both under a broader umbrella: human-generated input. So it still holds that they're regurgitating the decisions made by humans.
internet_points 2 hours ago
yes