https://github.com/cod3licious/autofeat
Linear Prediction Model with Automated Feature Engineering and Selection Capabilities
- Host: GitHub
- URL: https://github.com/cod3licious/autofeat
- Owner: cod3licious
- License: mit
- Created: 2019-01-22T14:20:16.000Z (about 7 years ago)
- Default Branch: main
- Last Pushed: 2025-03-23T15:16:34.000Z (11 months ago)
- Last Synced: 2025-07-03T04:40:56.629Z (8 months ago)
- Topics: automated-data-science, automated-feature-engineering, automated-machine-learning, automl, feature-engineering, feature-selection, linear-regression, machine-learning, machine-learning-models
- Language: Python
- Homepage:
- Size: 1 MB
- Stars: 520
- Watchers: 19
- Forks: 64
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
# Autofeat
**Autofeat** is a Python library that provides `sklearn`-compatible linear prediction models with automated feature engineering and selection capabilities.
## Overview
Autofeat simplifies the process of improving linear model performance by automating feature generation and selection. It first generates a wide range of non-linear features, then selects a small, robust subset of meaningful features that enhance the predictive power of linear models. This multi-step approach allows you to harness the interpretability of linear models without sacrificing accuracy.
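The generate-then-select idea described above can be illustrated with a minimal, dependency-free sketch. This is not autofeat's implementation: the transformation set, the correlation-based selection rule, and the helper names (`TRANSFORMS`, `generate_features`, `select_best`, `fit_line`) are all assumptions chosen to keep the example short.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical set of non-linear transformations (assumes positive inputs);
# the transformations autofeat actually applies may differ.
TRANSFORMS = {
    "x": lambda v: v,
    "x**2": lambda v: v ** 2,
    "sqrt(x)": math.sqrt,
    "log(x)": math.log,
    "1/x": lambda v: 1.0 / v,
}


def generate_features(x):
    """Step 1: expand one input column into several candidate features."""
    return {name: [f(v) for v in x] for name, f in TRANSFORMS.items()}


def select_best(features, y):
    """Step 2: keep the candidate most correlated with the target."""
    return max(features, key=lambda name: abs(pearson(features[name], y)))


def fit_line(feature, y):
    """Step 3: ordinary least squares for a single feature."""
    n = len(feature)
    mean_f, mean_y = sum(feature) / n, sum(y) / n
    slope = sum((a - mean_f) * (b - mean_y) for a, b in zip(feature, y)) / sum(
        (a - mean_f) ** 2 for a in feature
    )
    return slope, mean_y - slope * mean_f


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0 * v ** 2 + 1.0 for v in x]  # target depends on x**2, not on x directly

features = generate_features(x)
best = select_best(features, y)
slope, intercept = fit_line(features[best], y)
print(f"selected feature: {best}, model: y = {slope:.2f} * {best} + {intercept:.2f}")
```

The fitted model stays linear in the selected feature, so its coefficients remain directly readable even though the relationship to the raw input is non-linear.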
### Key Features:
- **Automated Feature Generation and Selection**: Generates non-linear candidate features and selects a small subset of them for use in linear models.
- **Improved Performance and Interpretability**: The generated features improve prediction accuracy while retaining the intuitive interpretability of linear models.
- **Seamless Integration**: Fully compatible with `scikit-learn` pipelines, making it easy to integrate into your existing machine learning workflows.
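
As a rough sketch of the scikit-learn-style workflow, the helper below wraps `AutoFeatRegressor` with its `feateng_steps` and `verbose` parameters. The wrapper function `fit_autofeat` is hypothetical and exists only for illustration; consult the documentation for the full set of options.

```python
def fit_autofeat(X, y, feateng_steps=2):
    """Fit an autofeat regressor and return it with the engineered feature table.

    X: pandas DataFrame or 2D NumPy array of input features.
    y: 1D array-like of regression targets.
    """
    # Imported lazily so this snippet loads even where autofeat is not installed.
    from autofeat import AutoFeatRegressor

    # feateng_steps sets how many rounds of feature combination are run;
    # verbose=0 keeps the fitting output quiet.
    model = AutoFeatRegressor(verbose=0, feateng_steps=feateng_steps)

    # fit_transform generates candidate features, selects a subset, fits the
    # linear model, and returns the data including the selected new features.
    X_engineered = model.fit_transform(X, y)
    return model, X_engineered
```

After fitting, `model.predict(X_new)` returns predictions for unseen rows that have the same columns as `X`, as with other scikit-learn estimators.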
### Use Cases:
- Ideal for **supervised learning tasks** where model transparency is crucial for decision-making.
- Suitable for **feature selection** in large datasets, automating the discovery of important variables.
- Useful in scenarios where **non-linear features** need to be discovered and leveraged without complicating the model.
**Note:** The code is intended for research purposes. Results may vary depending on the dataset and use case.
## Installation
Autofeat is available on PyPI, making it easy to install via `pip`:
```
pip install autofeat
```
### Other Dependencies
- numpy
- pandas
- scikit-learn
- sympy
- joblib
- pint
- numba
## Documentation and Resources
| Description | Link |
|-------------|------|
| Example Notebooks | [examples](/notebooks/) |
| Documentation | [documentation](https://franziskahorn.de/autofeat) |
| Paper | [paper](https://arxiv.org/abs/1901.07329) |
| Talk | [PyData talk](https://www.youtube.com/watch?v=4-4pKPv9lJ4) |
If any of this code was helpful for your work, please consider citing the paper:
```
@inproceedings{horn2019autofeat,
title={The autofeat Python Library for Automated Feature Engineering and Selection},
author={Horn, Franziska and Pack, Robert and Rieger, Michael},
booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
pages={111--120},
year={2019},
organization={Springer}
}
```
If you have any questions, please don't hesitate to send me an [email](mailto:cod3licious@gmail.com). If you find any bugs or want to contribute improvements, pull requests are very welcome!
## Acknowledgments
This project was made possible thanks to support from [BASF](https://www.basf.com).