A polynomial autoencoder beats PCA on transformer embeddings
- #polynomial-autoencoder
- #embedding-compression
- #retrieval-optimization
- The polynomial autoencoder (poly-AE) improves embedding compression by adding a quadratic decoder on top of PCA, capturing nonlinear structure that linear methods miss.
- The method fits a closed-form quadratic lift with ridge regression, eliminating the need for SGD or hyperparameter tuning (a minimal sketch follows the list below).
- Experiments on BEIR/FiQA show poly-AE substantially narrows the performance gap to raw embeddings, achieving 4× compression with minimal nDCG loss (a measurement sketch also follows below).
- Poly-AE consistently outperforms PCA, especially at higher compression levels (e.g., d = 128) and on models not trained with Matryoshka-style objectives.
- Limitations include computational cost that is cubic in the lifted feature dimension (which itself grows quadratically with d; see the arithmetic sketch below), transductive fitting that requires corpus statistics, and overfitting on small corpora.
- The technique originates from quadratic manifold methods in dynamical systems and is adapted here for neural embeddings.
- Future directions include testing on larger datasets, exploring higher-degree polynomials, and hybrid approaches with matryoshka embeddings.
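A minimal sketch of the method as summarized above, in plain NumPy. All names here (`fit_poly_ae`, `quadratic_lift`, `lam`) are illustrative stand-ins, not taken from the original post:

```python
import numpy as np

def quadratic_lift(Z):
    """Append all pairwise products z_i * z_j (i <= j) to the code Z."""
    n, d = Z.shape
    iu = np.triu_indices(d)
    quad = Z[:, :, None] * Z[:, None, :]            # (n, d, d) outer products
    return np.hstack([Z, quad[:, iu[0], iu[1]]])    # (n, d + d(d+1)/2)

def fit_poly_ae(X, d, lam=1e-2):
    """PCA-encode to d dims, then ridge-fit a quadratic decoder back to X."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # PCA encoder: top-d right singular vectors of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:d].T                                    # (D, d) projection
    Z = Xc @ W                                      # codes
    Phi = quadratic_lift(Z)                         # lifted features
    # Closed-form ridge solve: B = (Phi^T Phi + lam I)^-1 Phi^T Xc
    p = Phi.shape[1]
    B = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ Xc)
    encode = lambda Xnew: (Xnew - mu) @ W
    decode = lambda Znew: quadratic_lift(Znew) @ B + mu
    return encode, decode
```

Plain PCA corresponds to dropping the quadratic terms from `quadratic_lift`; the only extra cost of poly-AE is the one-shot ridge solve over the lifted basis.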
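The nDCG comparison on BEIR/FiQA could be measured along these lines. This is a generic cosine-retrieval sketch under stated assumptions, not the authors' evaluation code; corpus loading is omitted and `rels` (doc index → graded relevance) is an assumed input:

```python
import numpy as np

def cosine_scores(q, docs):
    """Score every document embedding against a query by cosine similarity."""
    q = q / np.linalg.norm(q)
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return docs @ q

def ndcg_at_k(scores, rels, k=10):
    """nDCG@k for one query given graded relevance judgments."""
    order = np.argsort(-scores)[:k]
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    gains = np.array([rels.get(int(i), 0.0) for i in order])
    dcg = float(gains @ discounts[: len(gains)])
    ideal = np.sort(np.fromiter(rels.values(), dtype=float))[::-1][:k]
    idcg = float(ideal @ discounts[: len(ideal)])
    return dcg / idcg if idcg > 0 else 0.0
```

Running this with raw, PCA-compressed, and poly-AE-compressed document embeddings on the same queries is the kind of comparison the 4× compression claim rests on.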
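On the cost limitation: the quadratic lift of a d-dimensional code has p = d + d(d+1)/2 features, and the closed-form ridge solve factorizes a p × p Gram matrix (roughly cubic in p), so cost climbs steeply with d. A quick check of how the lifted dimension grows:

```python
# Lifted feature count p = d + d(d+1)/2; the ridge solve works on a p x p Gram matrix.
for d in (32, 64, 128, 256):
    p = d + d * (d + 1) // 2
    print(f"d={d:4d}  lifted dim p={p:6d}  Gram matrix {p} x {p}")
```

At d = 128 the Gram matrix is already 8384 × 8384, which is why large d is flagged as a limitation above.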