grumbelbart2 | 2 days ago
> What contributed more towards success in my opinion are "shortcut connections" through layers which enable more influence on early layers during learning.

For those who don't know, that is the idea behind ResNet (He et al., Deep Residual Learning for Image Recognition, https://arxiv.org/abs/1512.03385), one of the most influential papers in deep learning. Residual connections make it practical to train far deeper networks, hundreds or even thousands of layers. Before ResNet, very deep plain networks were essentially untrainable: accuracy degraded as depth grew, in large part because of vanishing or exploding gradients.
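A minimal numerical sketch of why the shortcut helps (plain NumPy, not from the comment; the layer sizes and weight scale are illustrative assumptions). Each residual block computes `x + F(x)`, so the identity term carries the signal, and by the same token the gradient, straight through every block, whereas a stack of plain layers can shrink it toward zero:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W):
    # one "plain" layer: linear map followed by ReLU
    return np.maximum(0.0, W @ x)

def residual_block(x, W):
    # shortcut connection: output = x + F(x); the identity term lets
    # signal (and gradient) pass through the block unchanged
    return x + layer(x, W)

# Stack 50 layers with deliberately small weights. The plain stack
# multiplies the signal by a small factor at every layer; the residual
# stack only adds a small perturbation to the identity path each time.
x = rng.standard_normal(8)
Ws = [0.01 * rng.standard_normal((8, 8)) for _ in range(50)]

plain, res = x.copy(), x.copy()
for W in Ws:
    plain = layer(plain, W)
    res = residual_block(res, W)

plain_norm = np.linalg.norm(plain)  # collapses toward zero with depth
res_norm = np.linalg.norm(res)      # stays on the order of ||x||
print(plain_norm, res_norm)
```

The same identity term shows up in the backward pass: the Jacobian of `x + F(x)` is `I + dF/dx`, so gradients reach early layers even when `dF/dx` is tiny.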