Remix.run Logo
nostrebored 10 hours ago

Could we not get the same with EAFT? Maybe that’s what it’s doing but definitely not the first to think “let’s lock in high probability solutions”

In nemotron the high perplexity solutions are selected for RL, in VLM training a few people are looking at the entropy distributions of the training set, etc