mathisfun123 2 hours ago

> why do neural networks work better than other models

The only people for whom this is an open question are the academics - everyone else understands it's entirely because of the bagillions of parameters.

hodgehog11 2 hours ago | parent | next [-]

No it isn't, and it's frustrating when the "common wisdom" tries to boil it down to this. If this were true, then the models with "infinitely many" parameters would be amazing. What about just training a gigantic two-layer network? There is a huge amount of work trying to engineer training procedures that work well.

The actual reason is due to complex biases that arise from the interaction of network architectures and the optimizers and persist in the regime where data scales proportionally to model size. The multiscale nature of the data induces neural scaling laws that enable better performance than any other class of models can hope to achieve.

skydhash 2 hours ago | parent [-]

> The actual reason is due to complex biases that arise from the interaction of network architectures and the optimizers and persist in the regime where data scales proportionally to model size. The multiscale nature of the data induces neural scaling laws that enable better performance than any other class of models can hope to achieve.

That’s a lot of words to say that, if you encode a class of things as numbers, there’s a formula somewhere that can approximate an instance of that class. It works for linear regression and it works just as well for neural networks. The key thing here is approximation.

bubblyworld an hour ago | parent [-]

That isn't what they are saying at all, lol.

tacet 2 hours ago | parent | prev [-]

There's also the massive amount of human work done on them that wasn't done before.

Data labeling is a pretty big industry in some countries, and I guess dropping 200 kilodollars on labeling is beyond the reach of most academics, even if they did not care about the ethics of that.