An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with feature-selection

A curated list of projects in awesome lists tagged with feature-selection.

https://github.com/yimeng-zhang/feature-engineering-and-feature-selection

A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.

data-mining feature-engineering feature-extraction feature-selection machine-learning python

Last synced: 16 May 2025

https://github.com/Yimeng-Zhang/feature-engineering-and-feature-selection

A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.

data-mining feature-engineering feature-extraction feature-selection machine-learning python

Last synced: 06 May 2025

https://github.com/nvidia-merlin/nvtabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

deep-learning feature-engineering feature-selection gpu machine-learning nvidia preprocessing recommendation-system recommender-system

Last synced: 12 Jan 2026

https://github.com/NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

deep-learning feature-engineering feature-selection gpu machine-learning nvidia preprocessing recommendation-system recommender-system

Last synced: 01 May 2025

https://github.com/ashishpatel26/amazing-feature-engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 16 May 2025

https://github.com/ashishpatel26/Amazing-Feature-Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn

Last synced: 10 Apr 2025

https://github.com/autoviml/featurewiz

Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshadri. Collaborators welcome.

autoencoders best-encoders categorical-variables feature-encoding feature-engg feature-engineering feature-extraction feature-selection featuretools mrmr rfe rfecv xgboost

Last synced: 14 May 2025

https://github.com/AutoViML/featurewiz

Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshadri. Collaborators welcome.

best-encoders categorical-variables feature-engg feature-engineering feature-extraction feature-selection featuretools rfe rfecv xgboost

Last synced: 18 Jul 2025

https://github.com/akanz1/klib

Easy to use Python library of customized functions for cleaning and analyzing data.

data-analysis data-cleaning data-preprocessing data-science data-visualization feature-selection klib python

Last synced: 01 Feb 2026

https://github.com/desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 22 Nov 2025

https://github.com/epistasislab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 16 May 2025

https://github.com/EpistasisLab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 27 Mar 2025

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/kaushalshetty/FeatureSelectionGA

Feature Selection using Genetic Algorithm (DEAP Framework)

deap feature-selection genetic-algorithm machine-learning python

Last synced: 16 Apr 2025

https://github.com/anujdutt9/feature-selection-for-machine-learning

Methods with examples for Feature Selection during Pre-processing in Machine Learning.

correlation feature-selection machine-learning python36

Last synced: 06 Apr 2025

https://github.com/anujdutt9/Feature-Selection-for-Machine-Learning

Methods with examples for Feature Selection during Pre-processing in Machine Learning.

correlation feature-selection machine-learning python36

Last synced: 06 May 2025

https://github.com/upgini/upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

automated-feature-engineering automl automl-pipeline chatgpt data-enrichment data-science feature-engineering feature-extraction feature-selection features kaggle kaggle-solution large-language-models llm machine-learning open-data open-datasets public-data python-library scikit-learn

Last synced: 15 May 2025

https://github.com/jalajthanaki/nlpython

This repository contains code for Natural Language Processing using the Python scripting language. All of the code relates to my book, "Python Natural Language Processing".

deep-learning feature-engineering feature-extraction feature-selection natural-language-processing parsing part-of-speech python-scripting-language python2 text-mining

Last synced: 17 Nov 2025

https://github.com/jalajthanaki/NLPython

This repository contains code for Natural Language Processing using the Python scripting language. All of the code relates to my book, "Python Natural Language Processing".

deep-learning feature-engineering feature-extraction feature-selection natural-language-processing parsing part-of-speech python-scripting-language python2 text-mining

Last synced: 19 Jul 2025

https://github.com/solegalli/feature-selection-for-machine-learning

Code repository for the online course Feature Selection for Machine Learning

data-science feature-selection machine-learning python

Last synced: 15 May 2025

https://github.com/jaswinder9051998/zoofs

zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary. It is an easy-to-use, flexible, and powerful tool for reducing the size of your feature set.

evolutionary-algorithms feature-selection genetic-algorithm grey-wolf grey-wolf-optimizer machine-learning machine-learning-algorithms machinelearning optimization optimization-algorithms optimization-methods optimization-tools particle-swarm particle-swarm-optimization python subset-selection supervised-learning

Last synced: 21 Oct 2025

https://github.com/predict-idlab/powershap

A power-full Shapley feature selection method.

data-science feature-selection machine-learning shap

Last synced: 07 Jul 2025

https://github.com/microsoft/finnts

Microsoft Finance Time Series Forecasting Framework (FinnTS) is a forecasting package that utilizes cutting-edge time series forecasting and parallelization on the cloud to produce accurate forecasts for financial data.

business data-science feature-selection finance finnts forecasting machine-learning microsoft r r-package rstats time-series

Last synced: 15 May 2025

https://github.com/dominance-analysis/dominance-analysis

This package can be used for dominance analysis or Shapley Value Regression for finding relative importance of predictors on given dataset. This library can be used for key driver analysis or marginal resource allocation models.

classification-model dominance dominance-analysis dominance-statistics feature-engineering feature-importance feature-selection keydrivers logistic-regression multiple-regression predictor predictor-importance pseudo-r-square r-square regression-models relative-importance shapley-value

Last synced: 01 May 2025

https://github.com/ctlab/itmo_fs

Feature selection library in python

feature-selection machine-learning

Last synced: 30 Oct 2025

https://github.com/ctlab/ITMO_FS

Feature selection library in python

feature-selection machine-learning

Last synced: 08 May 2025

https://github.com/ajayarunachalam/msda

Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and prototype of explainable AI for anomaly detector

anamoly-detection-using-graphs anomaly-detection correlation data-analysis deep-learning deep-neural-networks explainable-artificial-intelligence feature-engineering feature-selection multidimensional-data multisensor python pytorch sensor sensor-data signal-processing tabular-data time-series variation visualization

Last synced: 05 Oct 2025

https://github.com/runopti/stg

Python/R library for feature selection in neural nets. ("Feature selection using Stochastic Gates", ICML 2020)

classification cox-model feature-selection neural-networks regression

Last synced: 14 Jan 2026

https://github.com/Superzchen/iLearnPlus

iLearnPlus is the first machine-learning platform with both graphical and web-based user interfaces that enables the construction of automated machine-learning pipelines for computational analysis and predictions using nucleic acid and protein sequences.

automated-modelling bioinformatics-tool biomedical-data-analytics deep-learning feature-selection machine-learning prediction python sequence-analysis

Last synced: 21 Jul 2025

https://github.com/urbslab/streamline

Simple Transparent End-To-End Automated Machine Learning Pipeline for Supervised Learning in Tabular Binary Classification Data

automl-pipeline binary-classification data-science data-visualization feature-selection imputation machine-learning model-application statistical-analysis supervised-learning

Last synced: 12 Jul 2025

https://github.com/craigacp/feast

A FEAture Selection Toolbox for C/C++, Java, and Matlab/Octave.

c feature-selection java matlab

Last synced: 13 Jul 2025

https://github.com/fidelity/selective

[AMAI 2024] Selective: Feature Selection Library

feature-selection supervised-feature-selection unsupervised-feature-selection

Last synced: 09 Mar 2026

https://github.com/mi2-warsaw/fselectorrcpp

Rcpp (free of Java/Weka) implementation of FSelector entropy-based feature selection algorithms with a sparse matrix support

entropy feature-selection r rcpp sparse-matrix

Last synced: 27 Mar 2026

https://github.com/epistasislab/rebate

Relief Based Algorithms of ReBATE implemented in Python with Cython optimization. This repository is no longer being updated. Please see scikit-rebate.

cython data-science feature-selection

Last synced: 16 Apr 2025

https://github.com/mamba413/ball

Statistical Inference and Sure Independence Screening via Ball Statistics

ball-correlation ball-covariance ball-divergence feature-selection independence-tests k-sample-test sure-independence-screening

Last synced: 03 Jul 2025

https://github.com/kwokhing/yandexcatboost-python-demo

Demo of the capability of the Yandex CatBoost gradient boosting classifier on a fictitious IBM HR dataset obtained from Kaggle. Data exploration, cleaning, preprocessing, and model tuning are performed on the dataset.

catboost data-analysis data-preprocessing data-science feature-selection gradient-boosting gradient-boosting-classifier one-hot-encode pandas pearson-correlation python python27 seaborn variance-analysis visualization yandex-catboost

Last synced: 09 Apr 2025

https://github.com/sebastianament/compressedsensing.jl

Contains a wide-ranging collection of compressed sensing and feature selection algorithms. Examples include matching pursuit algorithms, forward and backward stepwise regression, sparse Bayesian learning, and basis pursuit.

basis-pursuit compressed-sensing feature-selection julia matching-pursuit sparse-bayesian-learning sparse-linear-systems sparse-regression sparsity stepwise-regression subset-selection

Last synced: 30 Jul 2025

https://github.com/unnir/cancelout

CancelOut is a special layer for deep neural networks that can help identify a subset of relevant input features for streaming or static data.

deep-learning feature-importance feature-selection

Last synced: 21 Sep 2025

https://github.com/medoidai/skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.

artificial-intelligence data-science feature-engineering feature-selection hyperparameter-tuning machine-learning model-evaluation model-selection model-training model-tuning open-source predictive-modelling python scikit-learn

Last synced: 02 Aug 2025

https://github.com/somjit101/predictive-maintenance-industrial-iot

Illustrating a typical Predictive Maintenance use case in an Industrial IoT Scenario. By using Statistical Modelling and Data Visualization we attempt to perform Failure Analysis and Prediction of crucial industrial equipment like Boilers, Pumps, Motors etc. so that necessary actions can be taken by the management for their repair, servicing and optimal performance.

csv-files decision-trees eda failure-prediction feature-selection iot ipython-notebook predictive-maintenance python sensors

Last synced: 28 Oct 2025

https://github.com/anaxagor/applybn

Multi-purpose data analysis framework based on Bayesian networks and Causal models

bayesian-networks causal-models concept-analysis feature-extraction feature-selection outlier-detection oversampling tabular-data time-series

Last synced: 26 Feb 2026

https://github.com/Oxid15/cascade

Lightweight and modular MLOps library targeted at small teams or individuals

experiment-tracking feature-selection machine-learning ml ml-experimentation mlops model-lifecycle model-selection

Last synced: 11 May 2025

https://github.com/mlr-org/mlr3filters

Filter-based feature selection for mlr3

feature-selection filter filters mlr mlr3 r r-package variable-importance

Last synced: 09 Apr 2025

https://github.com/dunnkers/fseval

Benchmarking framework for Feature Selection and Feature Ranking algorithms 🚀

automl benchmarking benchmarking-framework benchmarks feature-rankers feature-ranking feature-selection hydra machine-learning python scikit-learn wandb

Last synced: 13 Apr 2025

https://github.com/mamba413/bess

Best Subset Selection algorithm for Regression, Classification, Count, Survival analysis

classification-model feature-selection poisson-regression regression-models sparse-linear-systems survival-analysis variable-selection

Last synced: 21 Mar 2025

https://github.com/jupiters1117/mico

MICO: Mutual Information and Conic Optimization for feature selection

conic-programs convex-optimization feature-selection machine-learning mutual-information python semidefinite-programming

Last synced: 14 Jan 2026

https://github.com/habedi/feature-factory

A high-performance feature engineering library for Rust powered by Apache DataFusion 🦀

data-preprocessing data-science feature-engineering feature-selection machine-learning rust-lang rust-library

Last synced: 01 Aug 2025

https://github.com/majianthu/aps2020

Code for the paper 'Variable Selection with Copula Entropy' published in the Chinese Journal of Applied Probability and Statistics

copula-entropy distance-correlation feature-engineering feature-selection hsic mutual-information variable-importance variable-selection

Last synced: 11 Jun 2025

https://github.com/genfifth/cvopt

A parameter search and feature selection module for machine learning, with integrated log management and visualization.

bayesian-optimization deep-learning feature-selection hyperopt hyperparameter-optimization integrated-visualization keras logmanagement machine-learning python scikit-learn

Last synced: 10 Apr 2025

https://github.com/deezer/interpretable_nn_attribution

Source code from our RecSys 2020 paper: "Making neural networks interpretable with attribution: application to implicit signals prediction" (D. Afchar, R. Hennequin)

attribution deezer feature-selection interpretable-machine-learning recsys2020

Last synced: 25 Oct 2025

https://github.com/joaquinamatrodrigo/seleccion_predictores_ga_python

Predictor selection using a genetic algorithm in Python

feature-selection genetic-algorithm machine-learning python

Last synced: 13 Jul 2025

https://github.com/cumbof/chopin2

Domain-Agnostic Supervised Learning with Hyperdimensional Computing

apache-spark backward-elimination feature-selection gpgpu hd-computing machine-learning supervised-learning vsa

Last synced: 14 Jun 2025

https://github.com/iwhalen/tblup

Trait BLUP, a Feature Selection Package for Genomic Prediction

differential-evolution evolutionary-computation feature-selection genomic-prediction

Last synced: 29 May 2026

https://github.com/teddyoweh/dimensionality-reduction-pca

Dimensionality reduction is the process of reducing the number of random variables (features or attributes, here called dimensions) in a dataset while retaining as much of the dataset's variation as possible, yielding a set of only relevant features that increases the efficiency of a model.

data-science dataset dimensional-analysis dimensionality-reduction feature-extraction feature-selection machine-learning

Last synced: 09 Apr 2025

https://github.com/edikedik/eboruta

Flexible and transparent Python Boruta implementation

ensemble-models feature-selection machine-learning python scikit-learn

Last synced: 10 Apr 2025