| ▲ | antirez 12 hours ago | |
Another potentially usable trick is the following: based on the observation that longer token budget improves model performances, one could generate solutions using a lot of thinking budget, then ask the LLM to turn the trace into a more compact one, and later SFT on that. That said, I have the feeling the result of the paper will likely be hard to apply in practice without affecting other capabilities, and/or not superior to other techniques that provide similar improvement in sampling. | ||