ComplexSystems 3 days ago

I would like to see what happens if some company devoted its resources to just training a model that is a total beast at math. Feed it a ridiculous amount of functional analysis and machine learning papers, and just make the best model possible for this one task. Then, instead of trying to make it cheap so everyone can use it, just set it on the task of figuring out something better than the current architecture, have it do literally nothing else, and build something based on whatever it figures out. Will it come up with something better than AdamW for optimization? Than transformers for approximating a distribution from a random sample? I don't know, but: what is the point of training any other model?