Who invented deep residual learning?
- #Deep Learning
- #Residual Connections
- #Neural Networks
- Modern AI is based on deep artificial neural networks (NNs).
- The most cited scientific article of the 21st century is an NN paper on deep residual learning, i.e., on deep NNs with residual connections.
- 1991: Sepp Hochreiter introduced recurrent residual connections to solve the vanishing gradient problem.
- 1997 LSTM: Plain recurrent residual connections with a fixed weight of 1.0 (the constant error carousel) became a defining feature of Long Short-Term Memory (LSTM); see the first sketch after this list.
- 1999 LSTM: Gated recurrent residual connections (forget gates initialized open, i.e., at 1.0) were introduced, allowing the network to learn when to reset itself.
- 2005: Unfolding LSTM in time turned its recurrent residual connections into feedforward residual connections, i.e., very deep feedforward NNs (FNNs).
- May 2015: Highway Networks introduced gated feedforward residual connections, enabling very deep FNNs with hundreds of layers (see the gated-layer sketch after this list).
- Dec 2015: ResNet emerged as an open-gated variant of Highway Nets, effectively a feedforward version of the 1997 LSTM (see the residual-block sketch after this list).
- The residual (skip) connection must carry a weight of 1.0: only an identity path lets error signals pass through many layers unchanged, which neutralizes the vanishing/exploding gradient problem.
- Highway Nets and ResNets perform similarly on tasks like ImageNet, with Highway Nets being more flexible due to their gating mechanism.
- The principles of residual connections are central to both deep RNNs and FNNs, dating back to 1991.
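Below, a minimal NumPy sketch (illustrative toy code, not the original implementation; `run_cec` and `w_in` are made-up names) of the 1997-style plain recurrent residual connection: the cell state carries a fixed self-connection of weight 1.0, so stored information and backpropagated error pass through time steps without decaying.

```python
import numpy as np

def run_cec(inputs, w_in=0.5):
    """Toy constant error carousel: a cell state with a plain recurrent
    residual (self-)connection of fixed weight 1.0."""
    c = 0.0
    states = []
    for x in inputs:
        # Residual update: the previous state passes through with weight 1.0;
        # new input is simply added on top, so nothing decays multiplicatively.
        c = 1.0 * c + w_in * x
        states.append(c)
    return np.array(states)

print(run_cec([1.0, 0.0, 0.0, 2.0]))  # [0.5 0.5 0.5 1.5] -- the state never fades
```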
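The gated-layer sketch: a Highway-style feedforward layer computes y = T(x)·H(x) + (1 − T(x))·x. Names, sizes, and the gate-bias value of −2 are illustrative assumptions; the point is that a negative transform-gate bias leaves the identity (carry) path open at initialization, so the layer starts out close to a copy of its input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """One Highway-style layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    h = np.tanh(W_h @ x + b_h)     # candidate transformation H(x)
    t = sigmoid(W_t @ x + b_t)     # transform gate T(x)
    return t * h + (1.0 - t) * x   # gated mix of transform path and identity path

d = 4
rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(d, d))
W_t = rng.normal(scale=0.1, size=(d, d))
b_h = np.zeros(d)
b_t = np.full(d, -2.0)  # negative gate bias: the carry (identity) path starts open
x = rng.normal(size=d)
print(highway_layer(x, W_h, b_h, W_t, b_t))  # close to x at initialization
```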
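The residual-block sketch: fixing the gates open turns the Highway layer into a ResNet-style block, y = x + F(x). Again a toy NumPy illustration with made-up names; the comment spells out why the weight-1.0 skip path matters for gradients.

```python
import numpy as np

def residual_block(x, W, b):
    """ResNet-style block: y = x + F(x); the skip path has fixed weight 1.0."""
    f = np.tanh(W @ x + b)  # learned residual function F(x)
    return x + f            # identity skip plus residual

# Gradient intuition: dy/dx = I + dF/dx. Even if dF/dx shrinks toward zero
# across many stacked blocks, the identity term keeps gradients from vanishing.
d = 3
rng = np.random.default_rng(1)
x = rng.normal(size=d)
y = residual_block(x, rng.normal(scale=0.01, size=(d, d)), np.zeros(d))
print(np.allclose(y, x, atol=0.1))  # True: with tiny weights the block is near-identity
```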