Hasty Briefs (beta)


Double Descent Demystified

a year ago
  • #double descent
  • #overparameterization
  • #machine learning
  • Double descent is a phenomenon in machine learning where test error first rises as model parameters approach the number of data points, then falls again as parameters grow beyond it, contrary to classical overfitting theory.
  • Whether and where double descent appears depends on the interplay of training-set size, data dimensionality, and the number of model parameters.
  • The paper provides an intuitive explanation of double descent using polynomial regression and linear algebra.
  • Three interpretable factors are identified that must all be present for double descent to occur.
  • Double descent is demonstrated on real data with ordinary linear regression and shown to disappear when any of the three factors are removed.
  • The findings help explain recent observations of superposition and double descent in nonlinear models.
  • Code related to the research is publicly available.