Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/WillKoehrsen/feature-selector
Feature selector is a tool for dimensionality reduction of machine learning datasets
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/WillKoehrsen/feature-selector
- Owner: WillKoehrsen
- License: gpl-3.0
- Created: 2018-06-20T21:14:24.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-06-17T22:44:11.000Z (7 months ago)
- Last Synced: 2024-11-11T18:02:54.681Z (2 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 4.95 MB
- Stars: 2,229
- Watchers: 97
- Forks: 768
- Open Issues: 42
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-data-science-resources - Feature Selector: Simple Feature Selection in Python
- awesome-machine-learning-resources - Library (Table of Contents)
README
# Feature Selector: Simple Feature Selection in Python
Feature selector is a tool for dimensionality reduction of machine learning datasets.
# Methods
There are five methods used to identify features to remove:
1. Missing Values
2. Single Unique Values
3. Collinear Features
4. Zero Importance Features
5. Low Importance Features
## Usage
Refer to the [Feature Selector Usage notebook](https://github.com/WillKoehrsen/feature-selector/blob/master/Feature%20Selector%20Usage.ipynb) for usage instructions.
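As a rough illustration of what the first three methods check for, here is a minimal sketch written directly against pandas and NumPy. It is not the `FeatureSelector` implementation: the helper names (`find_missing`, `find_single_unique`, `find_collinear`), the default thresholds, and the toy data are all assumptions made for this example. The two importance-based methods need a trained model and are left out.

```python
import numpy as np
import pandas as pd


def find_missing(df, threshold=0.6):
    """Columns whose fraction of missing values exceeds `threshold` (assumed default)."""
    missing_fraction = df.isnull().mean()
    return list(missing_fraction[missing_fraction > threshold].index)


def find_single_unique(df):
    """Columns with exactly one distinct non-missing value."""
    unique_counts = df.nunique()  # NaN is not counted by default
    return list(unique_counts[unique_counts == 1].index)


def find_collinear(df, threshold=0.98):
    """For each numeric pair with |correlation| above `threshold`, flag the later column."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the strict upper triangle so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [col for col in upper.columns if (upper[col] > threshold).any()]


# Hypothetical toy dataset, invented for this example.
data = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],             # perfectly correlated with "a"
    "c": [7, 7, 7, 7, 7],                        # single unique value
    "d": [np.nan, np.nan, np.nan, np.nan, 1.0],  # 80% missing
    "e": [5.0, 3.0, 8.0, 1.0, 4.0],
})

to_drop = sorted(set(find_missing(data) + find_single_unique(data) + find_collinear(data)))
reduced = data.drop(columns=to_drop)
```

Each helper only identifies candidate columns; dropping them is a separate step, so the flagged features can be inspected before anything is removed.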
## Visualizations
The `FeatureSelector` also includes a number of visualization methods to inspect
characteristics of a dataset.
__Correlation Heatmap__
![](images/example_collinear_heatmap.png)
__Most Important Features__
![](images/example_top_feature_importances.png)
Requires:
```
python>=3.6
lightgbm==2.1.1
matplotlib==2.1.2
seaborn==0.8.1
numpy==1.22.0
pandas==0.23.1
scikit-learn==0.19.1
```
## Contact
Any questions can be directed to [email protected]!