codemog 19 hours ago
Interesting. I see papers where researchers will finetune models in the 7 to 12b range and even beat or be competitive with frontier models. I wish I knew how this was possible, or had more intuition on such things. If anyone has paper recommendations, I’d appreciate it.
stavros 18 hours ago
They're using a revolutionary new method called "training on the test set".