dachworker 4 days ago
ML research is ripe for such a subculture to emerge, because there are truly so many research directions that are nothing more than a house of cards waiting to be knocked down. You need an element of truth to capture your audience. But once you have an audience and you've already deconstructed the house of cards, you start looking for more content. And then you end up like Sabine.
janalsncm 4 days ago | parent
Maybe at some point, but as of now the field is much more applied and empirical. Aside from money, there's nothing stopping you from training a new architecture or loss function and sharing the weights for everyone to use. Very recently some researchers at a Chinese lab introduced a new optimizer, MuonClip, which they claim is better for certain types of LLM training. I don't think there are enough AdamW fanboys out there for it to cause a controversy. Either it works or it doesn't.
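For context on why that claim is easy to check empirically: the Muon optimizer that MuonClip builds on replaces the raw momentum update of a weight matrix with an approximately orthogonalized version, computed with a few Newton–Schulz iterations. A minimal NumPy sketch of that orthogonalization step, using the quintic coefficients published in the open-source Muon implementation (the function name and demo matrix here are illustrative, not from any particular codebase):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Push the singular values of G toward 1 (approximate orthogonalization)
    using the quintic Newton-Schulz iteration popularized by Muon.
    Coefficients are the published ones; this is a sketch, not the real thing."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-7)  # scale so singular values are <= 1
    transposed = G.shape[0] > G.shape[1]
    if transposed:                       # iterate on the "wide" orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

# Demo: a badly conditioned update matrix. After orthogonalization its
# singular values are squeezed toward 1, so small and large directions
# of the gradient get comparable step sizes.
G = np.diag([3.0, 1.0, 0.5, 0.1])
O = newton_schulz_orthogonalize(G)
print(np.linalg.svd(O, compute_uv=False))
```

The "Clip" part, per the authors' description, is an extra safeguard that rescales query/key projection weights when attention logits grow too large during training; the point of the comment stands either way — you can run this against AdamW on your own model and see which loss curve wins.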