Feature Extraction with KNN
- #KNN
- #Feature Extraction
- #Machine Learning
- The fastknn package provides a function for feature extraction using KNN, generating k * c new features per observation: for each of the c class labels, the (cumulative) distances from the observation to its k nearest neighbors within that class.
- The feature extraction process uses n-fold cross-validation so that each observation's features are computed out-of-fold, avoiding leakage and overfitting, and it supports parallelization via the nthread parameter.
- The technique is inspired by the winning solution of the Otto Group Product Classification Challenge on Kaggle.
- An example demonstrates that KNN features capture non-linear structure that a linear model such as a GLM cannot, improving accuracy from 83.81% to 95.24%.
- Additional examples with chess and spirals datasets show how KNN features map the original space into one where the classes become linearly separable.
- The knnExtract() function is showcased in a Kaggle Kernel for large datasets, highlighting its practical application.
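The k * c extraction described above can be sketched as follows. This is a minimal Python reimplementation of the idea, not fastknn's actual knnExtract() (fastknn is an R package); the function name `knn_extract` and its parameters are illustrative. For each class, it computes the cumulative distances from every observation to its k nearest neighbors of that class, fitting the neighbor search on the training folds and generating features for the held-out fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression

def knn_extract(X, y, k=3, n_folds=5, seed=0):
    """Return an (n, k * n_classes) matrix of KNN distance features."""
    classes = np.unique(y)
    features = np.zeros((X.shape[0], k * len(classes)))
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for fit_idx, out_idx in skf.split(X, y):
        for ci, c in enumerate(classes):
            # Fit the neighbor search on training-fold points of class c,
            # then query only the held-out fold (no leakage from self-matches).
            Xc = X[fit_idx][y[fit_idx] == c]
            nn = NearestNeighbors(n_neighbors=k).fit(Xc)
            dist, _ = nn.kneighbors(X[out_idx])
            # Feature j = summed distance to the first j+1 neighbors of class c.
            features[out_idx, ci * k:(ci + 1) * k] = np.cumsum(dist, axis=1)
    return features

# XOR-style toy data: a linear model fails in the original space.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)

F = knn_extract(X, y, k=3)  # 3 neighbors * 2 classes = 6 new features
clf = LogisticRegression(max_iter=1000)
acc_raw = cross_val_score(clf, X, y, cv=5).mean()
acc_knn = cross_val_score(clf, F, y, cv=5).mean()
```

On this toy XOR problem the logistic model hovers near chance in the raw space but separates the classes well on the KNN features, mirroring the GLM accuracy jump noted above (the exact numbers depend on the data).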