Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/xiaoganghan/awesome-feature-engineering

A curated list of feature engineering techniques for image and text machine learning

deep-learning feature-engineering machine-learning

Last synced: 25 Jun 2024

https://github.com/matheusccouto/autolearn

Automated machine learning.

feature-engineering machine-learning python

Last synced: 24 Jun 2024

https://github.com/databrickslabs/automl-toolkit

Toolkit for Apache Spark ML covering feature clean-up, a feature importance calculation suite, information gain selection, distributed SMOTE, model selection and training, hyperparameter optimization and selection, and model interpretability.

apache-spark feature-engineering machinelearning ml pyspark scala spark

Last synced: 24 Jun 2024

https://github.com/oskar-j/awesome-auto-ml

Awesome list of AutoML frameworks - curated by @oskar-j

automl awesome-list feature-engineering hyperparameter-tuning machine-learning model-tuning

Last synced: 24 Jun 2024

https://github.com/alibaba/Alink

Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.

apriori classification clustering data-mining feature-engineering flink flink-machine-learning flink-ml fm graph-algorithms graph-embedding kafka machine-learning recommender recommender-system regression statistics word2vec xgboost

Last synced: 23 Jun 2024

https://github.com/aikho/awesome-feature-engineering

A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning

ai data-science feature-engineering feature-extraction machine-learning

Last synced: 22 Jun 2024

https://github.com/apachecn/fe4ml-zh

:book: [Translation] Feature Engineering for Machine Learning

book feature-engineering machine-learning python

Last synced: 20 Jun 2024

https://github.com/v1tzor/TimePlanner

Mobile app for planning tasks for the day with multimodule architecture, MVI, Compose, Room, Voyager, AlarmManager, Notification, Charts

alarmmanager android charts clean-architecture compose feature-engineering flow jetpack-compose kotlin kotlin-coroutines material3 mvi mvi-clean-architecture notifications planner-app room unittest voyager

Last synced: 14 Jun 2024

https://github.com/functime-org/functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.

feature-engineering forecasting machine-learning panel-data polars python time-series

Last synced: 07 Jun 2024

https://github.com/alibaba/feathub

FeatHub - A stream-batch unified feature store for real-time machine learning

apache-flink data data-engineering data-quality data-science feature-engineering feature-store machine-learning mlops streaming

Last synced: 07 Jun 2024

https://github.com/rakiiibul/auto_insurance_fraud

A machine learning project that uses data mining and machine learning libraries to detect insurance fraud. The report focuses on vehicle insurance claim statistics and applies knowledge gathered from an actuarial science course.

actuarial-science car-insurance data-mining datascience feature-engineering hyperparameter-tuning insurance-frauds machine-learning outlier-detection r-language rmd

Last synced: 03 Jun 2024

https://github.com/fraunhoferportugal/tsfel

An intuitive library to extract features from time series.

classification colab-notebook data-science feature-engineering feature-extraction time-series

Last synced: 31 May 2024

https://github.com/praktiskt/featuretoolsR

An R interface to the Python module Featuretools

feature-engineering featuretools machine-learning r-package rstats

Last synced: 20 May 2024

https://github.com/eifuentes/awesome-embeddings

🪁 A curated list of awesome resources around entity embeddings

awesome awesome-list deep-learning embedding embeddings feature-engineering machine-learning

Last synced: 19 May 2024

https://github.com/upgini/upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

automated-feature-engineering automl automl-pipeline chatgpt data-enrichment data-science feature-engineering feature-extraction feature-selection features kaggle kaggle-solution large-language-models llm machine-learning open-data open-datasets public-data python-library scikit-learn

Last synced: 18 May 2024

https://github.com/featureform/featureform

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database

Last synced: 13 May 2024

https://github.com/Yimeng-Zhang/feature-engineering-and-feature-selection

A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.

data-mining feature-engineering feature-extraction feature-selection machine-learning python

Last synced: 13 May 2024

https://github.com/404notf0und/FXY

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A tool for featurizing security data.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 12 May 2024

https://github.com/salesforce/TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

ai automated-machine-learning automl dsl einstein estimators feature-engineering features machine-learning ml pipelines salesforce scala spark sparkml structured-data transformations transformers transmogrification transmogrify

Last synced: 09 May 2024

https://github.com/NVIDIA-Merlin/NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

deep-learning feature-engineering feature-selection gpu machine-learning nvidia preprocessing recommendation-system recommender-system

Last synced: 07 May 2024

https://github.com/nikolaydubina/go-featureprocessing

🔥 Fast, simple sklearn-like feature processing for Go

feature-engineering go machine-learning

Last synced: 04 May 2024

https://github.com/IliaZenkov/sklearn-audio-classification

An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, and cross-validation with a variety of ML techniques and MLP

audio audio-data classification deep-learning-tutorial deep-neural-networks dnns emotion emotion-detection emotion-recognition feature-engineering machine-learning machine-learning-tutorials mlp-model model-evaluation ravdess-dataset scikit-learn sklearn

Last synced: 02 May 2024

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 28 Apr 2024

https://github.com/NITRO-AI/NitroFE

NitroFE is a Python feature engineering engine that provides a variety of modules designed to internally save past dependent values so that calculations can continue across successive batches of data.

feature feature-engineering features indicator indicators machine-learning time-series timeseries

Last synced: 22 Apr 2024

https://github.com/chrislemke/sk-transformers

A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠

data-science feature-engineering feature-selection machine-learning pandas preprocessing python scikit-learn scikit-learn-pipelines scikit-learn-transformer

Last synced: 22 Apr 2024

https://github.com/Yimeng-Zhang/Machine-Learning-From-Scratch

A systematic overview of the key topics in machine learning.

feature-engineering feature-selection machine-learning

Last synced: 22 Apr 2024

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also supports running data cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 21 Apr 2024

https://github.com/stitchfix/hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

dag data-engineering data-platform data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hamilton hamiltonian machine-learning numpy pandas python software-engineering stitch-fix

Last synced: 20 Apr 2024

https://github.com/jalajthanaki/NLPython

This repository contains code for Natural Language Processing in Python. All of the code accompanies my book, "Python Natural Language Processing".

deep-learning feature-engineering feature-extraction feature-selection natural-language-processing parsing part-of-speech python-scripting-language python2 text-mining

Last synced: 17 Apr 2024

https://github.com/jo-cho/Technical_Analysis_and_Feature_Engineering

Feature Engineering and Feature Importance in Machine Learning for Financial Markets

algorithmic-trading feature-engineering

Last synced: 17 Apr 2024

https://github.com/asavinov/intelligent-trading-bot

Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering

algorithmic-trading artificial-intelligence bitcoin crypto crypto-trading cryptocurrency feature-engineering machine-learning trading trading-bots

Last synced: 17 Apr 2024

https://github.com/kozodoi/dptools

Python package with utilities for data processing, aggregation, feature engineering and data versioning

aggregation data-preparation data-preprocessing data-science feature-engineering python

Last synced: 16 Apr 2024

https://github.com/18D070001/Electrical-Devices-Identification-Model

Electrical Devices Identification Model (EDIM) for the identification of electrical devices by analyzing their energy consumption profiles.

colab-notebook electrical feature-engineering feature-extraction machine-learning nilm nilm-algorithms python-3 signal-processing

Last synced: 15 Apr 2024

https://github.com/imgcook/datacook

Machine Learning and Data Analysis in JavaScript.

data-science feature-engineering javascript machine-learning

Last synced: 08 Apr 2024

https://github.com/dominance-analysis/dominance-analysis

This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. It can also be used for key driver analysis or marginal resource allocation models.

classification-model dominance dominance-analysis dominance-statistics feature-engineering feature-importance feature-selection keydrivers logistic-regression multiple-regression predictor predictor-importance pseudo-r-square r-square regression-models relative-importance shapley-value

Last synced: 01 Apr 2024

https://github.com/abhayspawar/featexp

Feature exploration for supervised learning

data-exploration data-science feature-engineering machine-learning visualization

Last synced: 24 Mar 2024

https://github.com/evinism/mistql

A query / expression language for performing computations on JSON-like structures. Tuned for clientside ML feature extraction.

expression-language feature-engineering feature-extraction hacktoberfest javascript json machine-learning mistql python query typescript

Last synced: 22 Mar 2024

https://github.com/google/temporian

Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖

cpp feature-engineering python temporal-data time-series

Last synced: 21 Mar 2024

https://github.com/AutoViML/featurewiz

Use advanced feature engineering strategies and select the best features from your data set with a single line of code. Created by Ram Seshadri. Collaborators welcome.

best-encoders categorical-variables feature-engg feature-engineering feature-extraction feature-selection featuretools rfe rfecv xgboost

Last synced: 19 Mar 2024

https://github.com/asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow

Last synced: 18 Mar 2024

https://github.com/metarank/metarank

A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine

automl data-engineering data-science deep-learning feature-engineering feature-extraction kubernetes machine-learning neural-networks personalization ranking scala search

Last synced: 17 Mar 2024

https://github.com/Aura-healthcare/hrv-analysis

Package for Heart Rate Variability analysis in Python

feature-engineering heart-rate-variability python rr-interval

Last synced: 17 Mar 2024

https://github.com/4paradigm/OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.

database-for-ai database-for-machine-learning feature-engineering feature-extraction feature-store featureops featurestore in-memory-database machine-learning machine-learning-database mlops

Last synced: 17 Mar 2024

https://github.com/ballet/ballet

☀️🦶 A lightweight framework for collaborative, open-source feature engineering

collaborative-data-science feature-engineering

Last synced: 14 Mar 2024