Hasty Briefs

Reproducing the deep double descent paper

a year ago
  • #resnet18
  • #machine-learning
  • #double-descent
  • The author spent time at the Recurse Center to learn machine learning (ML) without prior background.
  • Focused on reproducing results from the 'Deep Double Descent' paper to test understanding.
  • Double descent refers to model performance improving, then worsening, and improving again with increased model size or training duration.
  • Small (underparameterized) models improve as parameters are added but cannot fully fit the problem.
  • At the interpolation threshold, models just barely memorize the training data yet generalize poorly to test data.
  • Larger models (overparameterized) can learn underlying features well without overfitting.
  • Label noise was introduced to study its effect on double descent.
  • The author attempted to reproduce the results using ResNet18 on CIFAR-10, adjusting the network for CIFAR-10's 32x32 images and 10 output classes.
  • Training challenges included initially applying label noise incorrectly and adapting the model architecture to CIFAR-10.
  • Results showed double descent with label noise, matching the paper's findings.
  • Larger models initially performed worse but recovered with further training, especially under higher label noise.
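
The label-noise step mentioned above can be sketched as follows. This is a minimal illustration, not the author's code: the function name and signature are assumptions, following the common convention (as in the Deep Double Descent paper) of replacing a fraction of training labels with uniformly random classes, which may occasionally coincide with the true label.

```python
import numpy as np

def add_label_noise(labels, noise_frac, num_classes=10, seed=0):
    """Return a copy of `labels` where a `noise_frac` fraction of entries
    have been replaced with uniformly random class labels.

    Note: a randomly drawn label can equal the original one, so the
    fraction of labels actually changed is slightly below `noise_frac`.
    """
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    n = len(noisy)
    # Pick which examples to corrupt, without repeats.
    idx = rng.choice(n, size=int(noise_frac * n), replace=False)
    noisy[idx] = rng.integers(0, num_classes, size=len(idx))
    return noisy
```

Getting this step right matters: applying the noise per epoch instead of once before training (one plausible reading of the "incorrect label noise application" above) would give the model fresh random labels each pass and wash out the memorization effect the paper studies.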