µPC: Scaling Predictive Coding to 100 Layer Networks
- #deep learning
- #predictive coding
- #machine learning
- The paper introduces $μ$PC, a method for scaling predictive coding (PC) to deep networks with 100+ layers. PC trains a network by minimising a layer-wise prediction-error energy: activities are first relaxed by inference on the energy, then the weights are updated locally (a minimal sketch follows this list).
- It addresses the challenge of training deep PC networks (PCNs) with Depth-$μ$P, a depth-aware extension of the maximal-update parameterisation ($μ$P) that rescales residual branches and learning rates with depth (see the second sketch after the list).
- The study reveals pathologies in standard PCNs that hinder training at large depths and shows how $μ$PC mitigates some of these issues.
- $μ$PC enables stable training of deep residual networks (up to 128 layers) with competitive performance on classification tasks.
- The method allows zero-shot transfer of learning rates across different network widths and depths.
- The findings have implications for other local learning algorithms and potential extensions to convolutional and transformer architectures.
- Code for $μ$PC is made available as part of a JAX library for PCNs.
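For readers unfamiliar with the mechanics, here is a minimal sketch of PC training for a fully connected network in JAX, assuming the standard squared-error energy. All names (`energy`, `train_step`, `n_infer`, ...) are illustrative and not the paper's library API.

```python
from functools import partial

import jax
import jax.numpy as jnp

def init_params(key, sizes=(784, 128, 128, 10)):
    # Plain 1/sqrt(fan_in) Gaussian init; sizes are illustrative.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, (m, n) in zip(keys, zip(sizes[:-1], sizes[1:]))]

def energy(hidden, params, x, y):
    # PC energy: half the squared prediction error at every layer, with the
    # input clamped to x and the output clamped to the label y.
    zs = [x, *hidden, y]
    return sum(0.5 * jnp.sum((z - (jnp.tanh(zp) @ W + b)) ** 2)
               for (W, b), zp, z in zip(params, zs[:-1], zs[1:]))

@partial(jax.jit, static_argnames="n_infer")
def train_step(params, x, y, n_infer=20, lr_z=0.1, lr_w=1e-3):
    # Initialise hidden activities with a feedforward pass.
    hidden, z = [], x
    for W, b in params[:-1]:
        z = jnp.tanh(z) @ W + b
        hidden.append(z)
    # Inference: gradient descent on the energy w.r.t. the activities only.
    for _ in range(n_infer):
        grads = jax.grad(energy)(hidden, params, x, y)
        hidden = [z - lr_z * g for z, g in zip(hidden, grads)]
    # Learning: a purely local weight update at the settled activities.
    wgrads = jax.grad(energy, argnums=1)(hidden, params, x, y)
    return [(W - lr_w * gW, b - lr_w * gb)
            for (W, b), (gW, gb) in zip(params, wgrads)]
```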
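And a minimal sketch of the Depth-$μ$P residual scaling that $μ$PC builds on, assuming the usual $1/\sqrt{L}$ branch multiplier. The full parameterisation in the paper also prescribes depth- and width-dependent learning-rate scalings, which this sketch omits.

```python
import jax
import jax.numpy as jnp

def resnet_forward(params, x):
    # Depth-µP-style residual net: each branch is scaled by 1/sqrt(L),
    # which keeps activations O(1) as the number of blocks L grows.
    scale = 1.0 / jnp.sqrt(len(params))
    for W in params:
        x = x + scale * jnp.tanh(x @ W)
    return x

depth, width = 128, 256
keys = jax.random.split(jax.random.PRNGKey(0), depth)
params = [jax.random.normal(k, (width, width)) / jnp.sqrt(width) for k in keys]
out = resnet_forward(params, jnp.ones(width))
```

Because this kind of parameterisation keeps update sizes roughly width- and depth-independent, a learning rate tuned on a small network can be reused on a much larger one, which is the zero-shot transfer the note mentions above.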