Who invented deep residual learning?
- #Deep Learning
- #Residual Connections
- #Neural Networks
- Modern AI is based on deep artificial neural networks (NNs).
- The most cited scientific article of the 21st century is an NN paper on deep residual learning, i.e., on deep NNs with residual connections.
- 1991: Sepp Hochreiter introduced recurrent residual connections to solve the vanishing gradient problem.
- 1997 LSTM: Plain recurrent residual connections with a fixed weight of 1.0 (the constant error carousel) became a defining feature of Long Short-Term Memory (LSTM); see the first sketch after this list.
- 1999 LSTM: Gated recurrent residual connections (forget gates initialized open, i.e., at 1.0) were introduced, allowing the network to learn when to reset itself.
- 2005: Unfolding LSTM in time turned its recurrent residual connections into feedforward residual connections, i.e., very deep feedforward NNs (FNNs).
- May 2015: Highway Networks introduced gated feedforward residual connections, enabling very deep FNNs with hundreds of layers (see the gated-layer sketch after this list).
- Dec 2015: ResNet emerged as an open-gated variant of Highway Nets, effectively a feedforward version of the 1997 LSTM (see the residual-block sketch after this list).
- The residual (skip) connection must carry a weight of 1.0: only an identity path lets error signals pass through many layers unchanged, which neutralizes the vanishing/exploding gradient problem.
- Highway Nets and ResNets perform similarly on tasks like ImageNet, with Highway Nets being more flexible due to their gating mechanism.
- The principles of residual connections are central to both deep RNNs and FNNs, dating back to 1991.
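Below, a minimal NumPy sketch (illustrative toy code, not the original implementation; `run_cec` and `w_in` are made-up names) of the 1997-style plain recurrent residual connection: the cell state carries a fixed self-connection of weight 1.0, so stored information and backpropagated error pass through time steps without decaying.

```python
import numpy as np

def run_cec(inputs, w_in=0.5):
    """Toy constant error carousel: a cell state with a plain recurrent
    residual (self-)connection of fixed weight 1.0."""
    c = 0.0
    states = []
    for x in inputs:
        # Residual update: the previous state passes through with weight 1.0;
        # new input is simply added on top, so nothing decays multiplicatively.
        c = 1.0 * c + w_in * x
        states.append(c)
    return np.array(states)

print(run_cec([1.0, 0.0, 0.0, 2.0]))  # [0.5 0.5 0.5 1.5] -- the state never fades
```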
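The gated-layer sketch: a Highway-style feedforward layer computes y = T(x)·H(x) + (1 − T(x))·x. Names, sizes, and the gate-bias value of −2 are illustrative assumptions; the point is that a negative transform-gate bias leaves the identity (carry) path open at initialization, so the layer starts out close to a copy of its input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """One Highway-style layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    h = np.tanh(W_h @ x + b_h)     # candidate transformation H(x)
    t = sigmoid(W_t @ x + b_t)     # transform gate T(x)
    return t * h + (1.0 - t) * x   # gated mix of transform path and identity path

d = 4
rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.1, size=(d, d))
W_t = rng.normal(scale=0.1, size=(d, d))
b_h = np.zeros(d)
b_t = np.full(d, -2.0)  # negative gate bias: the carry (identity) path starts open
x = rng.normal(size=d)
print(highway_layer(x, W_h, b_h, W_t, b_t))  # close to x at initialization
```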
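The residual-block sketch: fixing the gates open turns the Highway layer into a ResNet-style block, y = x + F(x). Again a toy NumPy illustration with made-up names; the comment spells out why the weight-1.0 skip path matters for gradients.

```python
import numpy as np

def residual_block(x, W, b):
    """ResNet-style block: y = x + F(x); the skip path has fixed weight 1.0."""
    f = np.tanh(W @ x + b)  # learned residual function F(x)
    return x + f            # identity skip plus residual

# Gradient intuition: dy/dx = I + dF/dx. Even if dF/dx shrinks toward zero
# across many stacked blocks, the identity term keeps gradients from vanishing.
d = 3
rng = np.random.default_rng(1)
x = rng.normal(size=d)
y = residual_block(x, rng.normal(scale=0.01, size=(d, d)), np.zeros(d))
print(np.allclose(y, x, atol=0.1))  # True: with tiny weights the block is near-identity
```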