Projects in Awesome Lists tagged with imputation
A curated list of projects in awesome lists tagged with imputation.
https://github.com/WenjieDu/PyPOTS
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
classification clustering data-mining data-science deep-learning forecasting healthcare imputation incomplete industrial interpolation machine-learning missing-values missingness neural-network partially-observed-time-series pytorch science-research time-series time-series-analysis
Last synced: 01 Apr 2025
https://github.com/moment-timeseries-foundation-model/moment
MOMENT: A Family of Open Time-series Foundation Models, ICML'24
anomaly-detection classification forecasting foundational-models imputation large-language-models time-series time-series-anomaly-detection time-series-classification time-series-forecasting transformers
Last synced: 12 Mar 2026
https://github.com/amices/mice
Multivariate Imputation by Chained Equations
chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data
Last synced: 12 Dec 2025
https://github.com/awslabs/datawig
Imputation of missing values in tables.
imputation missing-value-handling
Last synced: 06 Apr 2025
https://github.com/mims-harvard/units
A unified multi-task time series model.
anomaly-detection classification ecg eeg few-shot forecasting foundation-models imputation multi-task prompt-tuning time-series unified-model zero-shot
Last synced: 10 Oct 2025
https://github.com/eltonlaw/impyute
Data imputations library to preprocess datasets with missing data
imputation missing-data python scientific-computing
Last synced: 04 Apr 2025
https://github.com/WenjieDu/SAITS
The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast and state-of-the-art (SOTA) deep-learning neural network model for efficient time-series imputation (impute multivariate incomplete time series containing NaN missing data/values with machine learning). https://arxiv.org/abs/2202.08516
attention attention-mechanism deep-learning imputation imputation-model impute incomplete-data incomplete-time-series interpolation irregular-sampling machine-learning missing-values partially-observed partially-observed-data partially-observed-time-series pytorch self-attention time-series time-series-imputation transformer
Last synced: 01 Apr 2025
https://github.com/wenjiedu/tsdb
A Python toolbox that loads 172 public time series datasets for machine/deep learning with a single line of code. Datasets come from multiple domains, including healthcare, finance, power, traffic, and weather.
classification data-mining database deep-learning forecasting imputation machine-learning partially-observed-time-series time-series time-series-analysis time-series-database time-series-datasets
Last synced: 17 Mar 2026
https://github.com/david-cortes/isotree
(Python, R, C/C++) Isolation Forest and variations such as SCiForest and EIF, with some additions (outlier detection + similarity + NA imputation)
anomaly-detection imputation isolation-forest outlier-detection
Last synced: 15 May 2025
https://github.com/dvgodoy/handyspark
HandySpark - bringing pandas-like capabilities to Spark dataframes
exploratory-data-analysis imputation outlier-detection pandas pyspark python spark visualization
Last synced: 05 Apr 2025
https://github.com/WenjieDu/TSDB
A Python toolbox that loads 172 public time series datasets for machine/deep learning with a single line of code. Datasets come from multiple domains, including healthcare, finance, power, traffic, and weather.
classification data-mining database deep-learning forecasting imputation machine-learning partially-observed-time-series time-series time-series-analysis time-series-database time-series-datasets
Last synced: 01 Apr 2025
https://github.com/vanderschaarlab/hyperimpute
A framework for prototyping and benchmarking imputation methods
data-science imputation imputation-algorithm machine-learning machine-learning-prerequisites preprocessing-data python scikit-learn
Last synced: 07 Apr 2025
https://github.com/SteffenMoritz/imputeTS
CRAN R Package: Time Series Missing Value Imputation
cran data-visualization imputation imputation-algorithm imputets missing-data time-series
Last synced: 26 Mar 2025
https://github.com/steffenmoritz/imputets
CRAN R Package: Time Series Missing Value Imputation
cran data-visualization imputation imputation-algorithm imputets missing-data time-series
Last synced: 05 Apr 2025
https://github.com/sylvaticus/betaml.jl
Beta Machine Learning Toolkit
ai artificial-intelligence autoencoder classification clustering data-science decision-trees deep-learning feature-importance imputation julia machine-learning ml neural-networks pca random-forest regression
Last synced: 17 Mar 2025
https://github.com/Vivianstats/scImpute
Accurate and robust imputation of scRNA-seq data
imputation r-package single-cell-rna-seq
Last synced: 09 Apr 2025
https://github.com/markvanderloo/simputation
Making imputation easy
data-science imputation officialstatistics r rstats
Last synced: 22 Oct 2025
https://github.com/jisungk/riddle
Race and ethnicity Imputation from Disease history with Deep LEarning
bioinformatics biology computational-biology deep-learning epidemiology imputation machine-learning neural-networks
Last synced: 08 Jul 2025
https://github.com/urbslab/streamline
Simple Transparent End-To-End Automated Machine Learning Pipeline for Supervised Learning in Tabular Binary Classification Data
automl-pipeline binary-classification data-science data-visualization feature-selection imputation machine-learning model-application statistical-analysis supervised-learning
Last synced: 12 Jul 2025
https://github.com/genepi/nf-gwas
A nextflow pipeline to perform state-of-the-art genome-wide association studies.
gwas imputation nextflow regenie singularity slurm
Last synced: 02 Feb 2026
https://github.com/mayer79/missranger
Fast multivariate imputation by random forests.
imputation machine-learning missing-values r random-forest rstats
Last synced: 24 Oct 2025
https://github.com/mayer79/missRanger
Fast multivariate imputation by random forests.
imputation machine-learning missing-values r random-forest rstats
Last synced: 26 Apr 2025
https://github.com/thierrygosselin/radiator
RADseq Data Exploration, Manipulation and Visualization using R
artifacts-detection batch-effects filter gbs genetics genomic-data-analysis genomics genomics-visualization genotype-likelihoods genotyping-by-sequencing heterozygosity imputation missingness normalization outliers outliers-detection paralogs radseq radseq-data visualization
Last synced: 11 Aug 2025
https://github.com/danielhanchen/sciblox
sciblox - Easier Data Science and Machine Learning
boosting data-analysis data-mining data-preprocessing data-science data-visualization imputation machine-learning python sklearn
Last synced: 08 Mar 2026
https://github.com/gianlucatruda/quantified-sleep
Quantified Sleep: Machine learning techniques for observational n-of-1 studies.
biohacking data-science explainable-ai imputation interpretable-machine-learning lasso machine-learning missing-data observational-studies oura-ring prediction quantified-self rescuetime sleep time-series
Last synced: 30 Apr 2025
https://github.com/qhliu26/awesome-time-series-analysis
📖 A curated list of awesome time-series papers, benchmarks, datasets, tutorials. (WIP)
anomaly-detection change-point-detection classification clustering data-mining forecasting imputation llm machine-learning segmentation time-series-analysis
Last synced: 30 Dec 2025
https://github.com/randel/MixRF
A random-forest-based approach for imputing clustered incomplete data
gene-expression imputation mixed-models random-forest
Last synced: 26 Apr 2025
https://github.com/baggepinnen/totalleastsquares.jl
Solve many kinds of least-squares and matrix-recovery problems
errors-in-variables estimation imputation least-square-regression least-squares linear-regression matrix-completion missing-data missing-data-imputation nonnegative-matrix-factorization outlier-detection robust-estimation robust-pca robust-regresssion robust-statistics singular-value-decomposition total-least-square
Last synced: 26 Jan 2026
https://github.com/haghish/mlim
mlim: single and multiple imputation with automated machine learning
automatic-machine-learning automl classimbalance data-science elastic-net extreme-gradient-boosting gbm glm gradient-boosting gradient-boosting-machine imputation imputation-algorithm imputation-methods machine-learning missing-data multipleimputation r rstats rstats-package stack-ensemble
Last synced: 19 Feb 2026
https://github.com/iskandr/knnimpute
Python implementations of kNN imputation
imputation machine-learning missing-data statistics
Last synced: 09 Mar 2026
https://github.com/zhengxwen/hibag
R package – HLA Genotype Imputation with Attribute Bagging (development version only)
bioinformatics gpu hla imputation mhc r snp
Last synced: 06 Apr 2025
https://github.com/maize-genetics/phg_v2
Practical Haplotype Graph (PHG) version 2
imputation pangenome pangenome-graph
Last synced: 11 Feb 2026
https://github.com/simongrund1/mitml
Tools for multiple imputation in multilevel modeling
imputation missing-data mixed-effects multilevel-data multilevel-models r r-package
Last synced: 07 May 2025
https://github.com/harry24k/mida-pytorch
PyTorch implementation of "MIDA: Multiple Imputation using Denoising Autoencoders"
autoencoder deep-learning imputation pytorch
Last synced: 10 Apr 2025
https://github.com/nerler/jointai
Joint Analysis and Imputation of generalized linear models and linear mixed models with missing values
bayesian generalized-linear-models glm glmm imputation imputations jags joint-analysis linear-mixed-models linear-regression-models mcmc-sample mcmc-sampling missing-data missing-values rstats survival
Last synced: 22 Oct 2025
https://github.com/clear-nus/NCDSSM
PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series".
continuous-time forecasting icml-2023 imputation kalman-filter state-space-model time-series
Last synced: 20 Mar 2025
https://github.com/filippob/introduction_to_gwas
https://filippob.github.io/introduction_to_gwas/
gwas imputation linear-regression pipeline
Last synced: 25 Oct 2025
https://github.com/raamana/missingdata
Missing data handling: visualize and impute
biostatistics data-science dirty-data epidemiology imputation machine-learning missing-data missing-values neuroscience visualization
Last synced: 13 Apr 2025
https://github.com/jishanshaikh4/sti
Resources and code for the Store Transaction Imputation Hackathon by Nielson (India)
imputation store techgig techgig-solutions transaction
Last synced: 25 Apr 2025
https://github.com/andreaskapou/Melissa
Bayesian Clustering and Imputation of Single Cell Methylomes
bayesian-inference clustering imputation methylation variational-inference
Last synced: 09 Apr 2025
https://github.com/tom-metherell/mice.jl
A package for missing data handling via multiple imputation by chained equations in Julia. It is heavily based on the R package {mice} by Stef van Buuren, Karin Groothuis-Oudshoorn and collaborators.
imputation julia mice missing-data multiple-imputation statistics
Last synced: 21 Oct 2025
https://github.com/Oafish1/JAMIE
Joint variational Autoencoders for Multimodal Imputation and Embedding (JAMIE)
autoencoder imputation integration multimodal variational variational-autoencoder
Last synced: 09 May 2026
https://github.com/mwheymans/psfmi
psfmi: Predictor Selection Functions for Logistic and Cox regression models in multiply imputed datasets
cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors
Last synced: 22 Oct 2025
https://github.com/biogenies/imputomics
amputation imputation metabolomics missing-values shiny webserver
Last synced: 15 May 2025
https://github.com/kennethleungty/datawig-missing-data-imputation
Imputation of Missing Data in Tables
data-imputation data-science datawig deep-learning imputation machine-learning
Last synced: 12 Jul 2025
https://github.com/macarro/imputena
Python package that allows both automated and customized treatment of missing values in datasets
imputation missing-data python
Last synced: 14 Jan 2026
https://github.com/transbiozi/gimpute
An efficient genetic data imputation pipeline
genotyping gwas haplotypes imputation liftover phasing
Last synced: 03 Apr 2026
https://github.com/mcuntz/hesseflux
hesseflux provides functions used in the processing and post-processing of the Eddy covariance flux data of the ICOS ecosystem site FR-Hes.
eddy-covariance filter gap-filling gpp imputation mad python spikes ustar
Last synced: 27 Jan 2026
https://github.com/boennecd/mdgc
Provides functions to impute missing values using Gaussian copulas for mixed data types.
binary gaussian-copula imputation multinomial-variables ordinal semi-parametric
Last synced: 22 Oct 2025
https://github.com/datapreprocessing/datacleaning
Data Cleaning is a Python package for data preprocessing. It cleans a CSV file and returns the cleaned data frame. It handles imputation, duplicate removal, special-character replacement, and more.
data data-cleaning data-cleansing data-preprocessing data-wrangling imputation python threshold
Last synced: 14 Dec 2025
https://github.com/samankhamesian/imputation-of-missing-values
This project is an implementation of a hybrid method for imputation of missing values
fuzzy-cmeans-clustering fuzzy-logic genetic-algorithm hybrid-application imputation missing-data missing-values python support-vector-regression
Last synced: 30 Jul 2025
https://github.com/csfelix/data-science-mental-maps
🐍 Mental Maps Related to Contents in Data Science 🐍
computer-vision cross-validation data-preparation data-science data-transformation deep-learning encoder feature-engineering imputation machine-learning normalization one-hot-encoder ordinal-encoder pickle pipelines python scale shap-values standardization xgboost
Last synced: 28 Apr 2025
https://github.com/corymccartan/birdie
Bayesian Instrumental Regression for Disparity Estimation
imputation r racial-disparities statistics
Last synced: 17 Jul 2025
https://github.com/ssmiler/idash2019_2
Secure genotype imputation using homomorphic encryption - iDASH 2019 track 2
genome-imputation genomics homomorphic-encryption idash imputation machine-learning
Last synced: 12 Oct 2025
https://github.com/udaylab/geoanalytics
artificial-intelligence imputation numpy python raster-data statistics
Last synced: 08 Oct 2025
https://github.com/cran-task-views/missingdata
CRAN Task View: Missing Data
cran imputation missing-data r rstats task-views
Last synced: 13 Apr 2025
https://github.com/shangzhi-hong/rfempimp
Multiple Imputation using Chained Random Forests
imputation missing-data random-forest
Last synced: 22 Oct 2025
https://github.com/joshweiner/ml-impute
A package for synthetic data generation for imputation using single and multiple imputation methods.
imputation imputation-methods jax machine-learning multiple-imputation numpy pandas parallelization singular-value-decomposition synthetic-data synthetic-dataset-generation
Last synced: 18 Jul 2025
https://github.com/pavlin-policar/alra
Imputation method for scRNA-seq based on low-rank approximation
batch-effects imputation matrix-completion scrna-seq svd
Last synced: 15 Aug 2025
https://github.com/bdslab-upv/extremiss
Numerical data imputation methods for extremely missing data contexts
classification data-quality imputation imputation-methods machine-learning missing-data missing-data-imputation
Last synced: 01 Feb 2026
https://github.com/sadmansakib93/missing-value-imputaion-knn
Python implementation of missing value imputation using K-Nearest-Neighbour and Weighted K-Nearest-Neighbour
imputaion-knn imputation impute-algorithm knearest-neighbour knn minmaxscalar missing-values python-implementaion scaling standard-scalar weighted-knn
Last synced: 03 May 2025
https://github.com/fangzhouli/para-impute
Missing value imputation package in Python specialized for high-performance computing.
computer-clus hpc imputation impute missforest missing-data missing-values python random-forest slurm
Last synced: 02 Apr 2026
https://github.com/zhengxwen/hibag.gpu
GPU-based implementation for the HLA genotype imputation method (HIBAG)
Last synced: 07 Jul 2025
https://github.com/stonegor/ae-imputer
A Python package for missing data imputation via autoencoders.
data-science deep-learning imputation machine-learning python pytorch
Last synced: 14 Jan 2026
https://github.com/jeffreyevans/yaimpute
Nearest neighbor-based imputation on multivariate data
cran imputation r r-package rstats
Last synced: 15 Mar 2025
https://github.com/mkirchmeyer/adaptation-imputation
Unsupervised domain adaptation with non-stochastic missing data
digital-advertising domain-adaptation imputation missing-data
Last synced: 20 Oct 2025
https://github.com/teebusch/mifa
An R package providing multiple imputation of covariance matrices in order to perform factor analysis.
factor-analysis imputation rstats
Last synced: 17 Mar 2025
https://github.com/ai-sandbox/aegen
Autoencoders for genomic data compression, classification, imputation, phasing and simulation.
classification compression imputation phasing simulation
Last synced: 16 Jan 2026
https://github.com/moindalvs/learn_eda_for_data_science
Univariate, Bivariate and Multi-variate Analysis
bivariate-analysis correlation-analysis data-science data-transformation data-type-conversion data-types-and-structures data-visualization duplicates-removal exploratory-data-analysis imputation missing-values multi-variate-analysis normalization outlier-detection pandas-profiling standardization univariate-analysis
Last synced: 07 Oct 2025
https://github.com/tymill/synthpred
A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
arima automl ensemble flux imputation julia machine-learning synthetic-data time-series
Last synced: 22 Apr 2025
https://github.com/scarface987/imputetoolkit
🔍 Evaluate and compare imputation methods with consistent metrics using the intuitive S3 interface of the `imputetoolkit` R package.
benchmarking cpp data-quality devtools evaluation-metrics imputation missing-data missing-data-imputation r rcpp roxygen2 testthat usethis
Last synced: 18 May 2026
https://github.com/jonaprieto/imputation
ARSI imputation algorithm for categorical databases
arsi imputation missing-values roustida vtrida
Last synced: 19 Jan 2026
https://github.com/hasnainroopawalla/super-resolution-vehicle-trajectory
A Master's thesis project to increase the temporal resolution of vehicle trajectories using recurrent time series imputation.
deep-learning imputation python time-series trajectory
Last synced: 11 Apr 2025
https://github.com/inbo/multimput
multimput is an R package that assists with analysing datasets with missing values using multiple imputation.
imputation imputation-model package r
Last synced: 04 Mar 2026
https://github.com/viralemergence/trefle
Imputing the mammalian virome with the LF-SVD model
imputation svd verena virology zoonotic-disease
Last synced: 20 Feb 2026
https://github.com/leabrodyheine/ml-kaggle-cirrhosis-data
This project showcases skills in machine learning, data preprocessing, and model evaluation using Python libraries such as scikit-learn, XGBoost, and Optuna. It involves implementing various machine learning models, handling imbalanced data, and employing imputation techniques to enhance model performance for predicting cirrhosis outcomes.
data-analysis data-pre imbalanced-data imputation machine-learning optuna pipeline scikit-learn xgboost
Last synced: 14 May 2026
https://github.com/vaneseltine/impute-gender
This is more like a set of notes than a useful repository
gender gender-from-name gender-prediction imputation sociology
Last synced: 16 Jan 2026
https://github.com/luckyos-code/user-driven-privacy
Data preparation methods for supporting machine learning on anonymized tabular data with generalized and missing values.
anonymized-data data-preparation imputation incomplete-data machine-learning privacy tabular-data
Last synced: 31 May 2026
https://github.com/ugurcan222/a-different-approach--image-enhancement-with-imputation-and-regression-methods
This experimental work presents a different approach to increase the size and quality of an image by adding a blank pixel around each pixel in an image, enlarging the image, breaking it into parts, and generating these blank pixels by predicting them with models.
ai-image-upscaling computer-vision digital-image-processing gradient-boosting image-analysis image-enhancement image-enlargement image-interpolation image-processing imputation knn machine-learning numpy opencv pixel-prediction python randomforest regression-models super-resolution xgboost
Last synced: 17 Jan 2026
https://github.com/lefteris-souflas/modern-slavery-analysis
Jupyter notebook using machine learning techniques to explore the complex drivers of modern slavery. Models from a research paper are replicated and evaluated. Actions also include filling missing data, training regression models, and analyzing feature importance.
decision-tree feature-importance grid-search-cv imputation jupyter-notebook lasso-regression linear-regression matplotlib mean-absolute-error numpy pandas preprocessing principal-component-analysis python3 random-forest ridge-regression scikit-learn seaborn
Last synced: 09 Apr 2026
https://github.com/tanveer09/imputetoolkit
imputeToolkit is an R package designed to help users apply, compare, and visualise multiple imputation methods. It automates the process of masking known values, applying different imputation strategies, and evaluating their performance with clear metrics and visualisations.
benchmarking cpp data-quality devtools evaluation-metrics imputation missing-data missing-data-imputation r r-package rcpp roxygen2 testthat usethis
Last synced: 19 May 2026
https://github.com/aefdz/localfda
Localization processes for functional data analysis. Software companion for the paper “Localization processes for functional data analysis” by Elías, A., Jiménez, R., and Yukich, J. (2020)
classification functional-data-analysis imputation outliers-detection
Last synced: 22 Oct 2025
https://github.com/maheera421/car-price-prediction-model
A machine learning project that predicts car prices based on a dataset.
column-transformer cross-validation-score feature-encoding feature-engineering imputation mean-absolute-error mean-squared-error one-hot-encoding r2-score random-forest-regressor simple-imputer
Last synced: 14 Mar 2025
https://github.com/baschin1103/sliding-variance-with-imputation
Calculation of the sliding variance with imputation
csv imputation linear-interpolation missing-values python sliding statistics variance
Last synced: 05 Apr 2025
https://github.com/kwokhing/wids-datathon-patient-survival
A challenge to create a model that uses data from the first 24 hours of intensive care to predict patient survival
feature-engineering gradient-boosting-machine imputation kaggle lightgbm machine-learning
Last synced: 01 May 2026
https://github.com/sap/knn-sampler
Machine learning imputation method with multiple imputation and uncertainty quantification support based on kNN
Last synced: 15 Sep 2025
https://github.com/dayadau/gdp-defl-2000
Visualise GDP deflator development grouped by income level in 2000 using RStudio, specifically an R Markdown file.
Last synced: 29 Jul 2025
https://github.com/dayadau/gdp_defl_2000
Visualise GDP deflator development grouped by income level in 2000 using RStudio, specifically an R Markdown file.
Last synced: 07 Jul 2025
https://github.com/jeffreysarnoff/imputationalgamest.jl
Last observation carried forward
imputation locf missing-data nans
Last synced: 11 Feb 2026
https://github.com/phydev/mice
Multiple imputation with chained equations implemented from scratch. This is a low-performance implementation meant for pedagogical purposes only.
data-cleaning data-science imputation mice-algorithm missingness multiple-imputation
Last synced: 15 Mar 2025
https://github.com/abdulrahmanaymann/data-mining
A data mining project involving two tasks: a regression problem and a classification problem.
classification data-mining imputation jupyter-notebook knn linear-regression outlier-detection polynomial-regression preprocessing python regression scaling
Last synced: 21 Aug 2025
https://github.com/inbo/drat
A repository with R packages created and maintained by INBO
bookdown drat ggplot2 ggplot2-themes imputation packages r rmarkdown-templates
Last synced: 12 Jun 2026
https://github.com/joshuajose978/data_cleaning_and_imputation
A data science project that evaluates the effectiveness of different imputation techniques using synthetic datasets. The workflow involves generating rule-based synthetic data with missing values, applying three imputation methods (MICE, KNN, and Mean imputation), and comparing their performance through a dashboard visualization.
Last synced: 08 Jun 2026
https://github.com/lemma-osu/sknnr
scikit-learn compatible estimators for various kNN imputation methods
classification gnn gradient-nearest-neighbor imputation k-nearest-neighbor knn most-similar-neighbor msn random-forest-nearest-neighbor regression rfnn scikit-learn sklearn-estimator
Last synced: 23 Feb 2026
https://github.com/jfeser/imputedb
A database with automatic imputation of missing values.
Last synced: 17 May 2026
https://github.com/maheera421/bulldozer-price-prediction-model
Prediction of the auction prices of bulldozers using historical data.
datetime-formatters feature-importance hyperparameter-tuning imputation mean-absolute-error mean-squared-log-error parsing preprocessing random-forest-regressor randomizedsearchcv seaborn-plots
Last synced: 05 Oct 2025