Double Descent Demystified: size of smallest non-zero singular value of X
- #double descent
- #overparameterization
- #machine learning
- Double descent is a phenomenon in machine learning where test error first falls, then rises as the number of parameters approaches the number of data points, and then falls again as parameters grow beyond it, contrary to the classical picture of overfitting.
- The shape of the test loss curve depends on the number of training samples, the data dimensionality, and the number of model parameters.
- The paper provides an intuitive explanation of double descent using polynomial regression and linear algebra.
- Three interpretable factors are identified (one being the size of the smallest non-zero singular value of the training features X) that must all be present for double descent to occur.
- Double descent is demonstrated on real data with ordinary linear regression and shown to disappear when any of the three factors is removed.
- The findings help explain recent observations in nonlinear models regarding superposition and double descent.
- Code related to the research is publicly available.
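The linear-regression demonstration can be sketched in a few lines. The toy below uses synthetic Gaussian data (a stand-in for the paper's real datasets, not its actual code) and the minimum-norm least-squares fit via the pseudoinverse, sweeping the number of features used as the parameter count; test error typically spikes near the interpolation threshold (parameters equal to training samples) and falls again beyond it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: n_train samples, up to d_total features,
# labels from a ground-truth linear relation with mild noise.
n_train, n_test, d_total = 20, 200, 60
w_true = rng.normal(size=d_total)
X_train = rng.normal(size=(n_train, d_total))
X_test = rng.normal(size=(n_test, d_total))
y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
y_test = X_test @ w_true


def min_norm_fit(X, y):
    # Minimum-norm least-squares solution; in the overparameterized
    # regime this is the interpolating solution of smallest norm.
    return np.linalg.pinv(X) @ y


test_mse = {}
for p in range(1, d_total + 1):
    # Use the first p features, i.e. a model with p parameters.
    w = min_norm_fit(X_train[:, :p], y_train)
    test_mse[p] = float(np.mean((X_test[:, :p] @ w - y_test) ** 2))

# Plotting test_mse against p traces out the double-descent curve,
# with the peak near p == n_train where the smallest non-zero
# singular value of the training matrix is tiny.
```

Once p reaches n_train, the fit interpolates the training labels exactly, which is why training error alone cannot reveal the test-error peak.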