Hasty Briefs

µPC: Scaling Predictive Coding to 100 Layer Networks

a year ago
  • #deep learning
  • #predictive coding
  • #machine learning
  • The paper introduces μPC, a method for scaling predictive coding (PC) to deep networks with 100+ layers.
  • It addresses the difficulty of training deep PC networks (PCNs) by using a Depth-μP parameterisation.
  • The study identifies pathologies in standard PCNs that hinder training at large depths and shows how μPC mitigates some of them.
  • μPC enables stable training of deep residual networks (up to 128 layers) with competitive performance on classification tasks.
  • The method allows zero-shot transfer of learning rates across different network widths and depths.
  • The findings have implications for other local learning algorithms and suggest extensions to convolutional and transformer architectures.
  • Code for μPC is made available as part of a JAX library for PCNs.
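To make the mechanism concrete, here is a minimal NumPy sketch of predictive coding on a toy linear network: inference runs gradient descent on layer activities to minimise a sum of squared prediction errors, and weights are then updated from purely local errors. The layer sizes, step sizes, and update rules are illustrative assumptions, not the paper's μPC parameterisation or its JAX implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(ws, zs):
    """PC energy: sum of squared prediction errors between adjacent layers."""
    e = 0.0
    for l, w in enumerate(ws):
        err = zs[l + 1] - w @ zs[l]
        e += 0.5 * float(err @ err)
    return e

def inference_step(ws, zs, lr=0.1):
    """Gradient descent on hidden activities; input and output stay clamped."""
    new_zs = [z.copy() for z in zs]
    for l in range(1, len(zs) - 1):
        err_l = zs[l] - ws[l - 1] @ zs[l - 1]   # error at this layer
        err_next = zs[l + 1] - ws[l] @ zs[l]    # error at the next layer
        grad = err_l - ws[l].T @ err_next       # dE/dz_l for the linear case
        new_zs[l] = zs[l] - lr * grad
    return new_zs

def weight_step(ws, zs, lr=0.01):
    """Local weight update from the (partially) relaxed activities."""
    new_ws = []
    for l, w in enumerate(ws):
        err = zs[l + 1] - w @ zs[l]
        new_ws.append(w + lr * np.outer(err, zs[l]))  # -dE/dW_l direction
    return new_ws

# Tiny network: depth and width kept small purely for illustration.
depth, width = 3, 8
ws = [rng.normal(0, 1 / np.sqrt(width), (width, width)) for _ in range(depth)]
x = rng.normal(size=width)   # clamped input
y = rng.normal(size=width)   # clamped target
zs = [x] + [rng.normal(size=width) for _ in range(depth - 1)] + [y]

e0 = energy(ws, zs)
for _ in range(50):          # relax activities toward the energy minimum
    zs = inference_step(ws, zs)
e1 = energy(ws, zs)
ws = weight_step(ws, zs)     # one local learning step
e2 = energy(ws, zs)
print(f"energy: {e0:.3f} -> {e1:.3f} -> {e2:.3f}")
```

Each update here depends only on errors at adjacent layers, which is what makes PC a local learning rule; the paper's contribution is a depth-aware parameterisation (Depth-μP) that keeps such relaxation and learning stable when the stack grows to 100+ layers.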