An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with preprocessing-data

A curated list of projects in awesome lists tagged with preprocessing-data.

https://github.com/cecivieira/cotas-genero-eleicoes-e-proposicoes-legislativas

Data analysis of gender quotas and their impact on elections and legislative proposals in the Federal Chamber of Deputies between 1934 and 2021. Part of the final project (TCC) for the postgraduate program in Artificial Intelligence and Machine Learning at @pucminas

dataanalysis pandas preprocessing-data python randomforestclassifier

Last synced: 20 Sep 2025

https://github.com/rafiqamar/hr-analytics-project

Cleaned and processed HR data using Python for analysis and visualization. Analyzed employee trends and performance using SQL and Python. Built an interactive Power BI dashboard connected to MySQL for dynamic insights.

exploratory-data-analysis mysql-database powerbi preprocessing-data python

Last synced: 05 Apr 2025

https://github.com/maxbubblegum47/preprocessing

Preprocessing method for Information Retrieval System

algorithm algorithms preprocessing preprocessing-data python python3 unimore-informatica

Last synced: 22 Mar 2025

https://github.com/rafiqamar/customer-churn-prediction-app

Built and deployed a Streamlit-based customer churn prediction app using ML models. Preprocessed data with encoding and scaling, improving model accuracy. Designed for churn prediction and retention insights.

exploratory-data-analysis machine-learning-algorithms preprocessing-data python streamlit-webapp

Last synced: 05 Jul 2025
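
The encoding and scaling steps mentioned in the entry above can be illustrated with a minimal pure-Python sketch. It is not taken from that repository: the function names, the sample data, and the choice of one-hot encoding with zero-mean, unit-variance scaling are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def one_hot(values):
    """One-hot encode categorical values; categories are sorted for a stable column order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def standardize(values):
    """Rescale numeric values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical sample columns, not real churn data.
contracts = ["monthly", "yearly", "monthly"]
charges = [20.0, 40.0, 60.0]

categories, encoded = one_hot(contracts)
scaled = standardize(charges)
```

In practice both transforms would be fitted on the training split only and then reused on new data, so that the model sees the same category order and scaling parameters at prediction time.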

https://github.com/xndrxssx/cotton_candy_spectral_analysis

Repository with the preprocessing algorithms and models for spectral data analysis of the Cotton Candy table grape.

machine-learning-algorithms msc pca pcr plsr preprocessing-data random-forest savitzky-golay snv spectroscopy standard-normal-variate support-vector-machine

Last synced: 30 Jul 2025

https://github.com/himank-khatri/classiflow

A web app that automates tedious data preprocessing and machine learning model testing.

exploratory-data-analysis machinelearning preprocessing-data python streamlit vizualization

Last synced: 01 Aug 2025

https://github.com/ddihora1604/advanced_business_analytics_on_world_bank_global_financial_inclusion_data_2021

Bridging the Gaps in Financial Inclusion: Understanding the Cash-Credit Paradox, Divide between Cash and Digital Payments, and Financial Resilience.

advanced-excel business-analytics data-analysis data-engineering data-mining data-visualization database exploratory-data-analysis machine-learning preprocessing-data python

Last synced: 17 Oct 2025

https://github.com/vipanchip/ai-powered-recipe-recommendation

This project is a Recipe Recommendation System that suggests recipes tailored to the user's specified nutritional values and ingredients. It integrates machine learning techniques with an intuitive web application framework to provide personalized recipe suggestions.

css flask html knn-classification machine-learning preprocessing-data

Last synced: 08 Aug 2025

https://github.com/lucianoscarpaci/news-data-classification

Using the Reuters dataset, this example illustrates the process of data preprocessing, model definition and training, and performance evaluation.

keras model-definition model-training performance-evaluation preprocessing-data reuters scikit-learn seaborn tensorflow

Last synced: 06 Mar 2025

https://github.com/iroyalx/dataset_preprocessing_sample

UNI S6: Preprocessing in Data Mining using ucimlrepo

data-mining preprocess preprocessing preprocessing-data preprocessor

Last synced: 13 Jun 2025

https://github.com/jingvu/anime-database-preprocessing-r-project

During the data preprocessing step, I identified three tasks that I believe are crucial and require careful attention: data transformation, handling outliers, and managing missing values. This repository serves as a resource to share what I've learned on these topics for anyone interested.

anime-dataset preprocessing-data r rmarkdown

Last synced: 24 Dec 2025

https://github.com/tszon/data-science-projects

Included are all the noteworthy Data Science projects from my learning journey with DataCamp.

data-analysis data-science exploratory-data-analysis feature-engineering machine-learning modelling preprocessing-data scikit-learn supervised-learning

Last synced: 15 Mar 2025

https://github.com/gaurav-singh7092/resumatch

An AI-powered resume and job description matching application using natural language processing and machine learning techniques. This application provides intelligent analysis of resume-job compatibility with detailed scoring and recommendations.

fastapi keyword-extraction nextjs nlp preprocessing-data python similarity-score tailwind

Last synced: 02 Jul 2025

https://github.com/hayatiyrtgl/audio_processing_for_cnn_network

Spectrum creation is the most important step when working with audio data.

audio audio-processing librosa preprocessing preprocessing-data python stft

Last synced: 08 Apr 2025

https://github.com/shellynagar27/transportation-and-logistics-challenge

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

cleaning-data critical-thinking data-analysis data-visualization exploratory-data-analysis feature-engineering powerbi preprocessing-data problem-solving python

Last synced: 08 Sep 2025

https://github.com/animesh-chourey/loan-classifier

Trained machine learning algorithms (Logistic Regression, KNN, SVM, Decision Tree) after performing visualization and preprocessing tasks on a loan dataset. Evaluated the algorithms' performance using metrics such as F1-score, log loss, and Jaccard similarity score.

decision-tree f1-score jaccard-similarity knn logistic-regression logloss matplotlib numpy pandas preprocessing-data svm

Last synced: 01 Mar 2025

https://github.com/jatin-mehra119/flight-price-prediction

This study aims to analyze flight booking data from the "Ease My Trip" website, using statistical tests and linear regression to extract insights. Understanding this data can yield information that benefits passengers using the platform.

data-analysis datacleaning datavisualization machine-learning preprocessing-data python sklearn-pipeline sklearn-regression-algorithm streamlit-webapp

Last synced: 10 Mar 2025

https://github.com/subhadipsinha722133/multiple-disease-prediction

🤖 This is an interactive Streamlit web application that predicts the likelihood of multiple diseases (diabetes, heart disease, Parkinson's disease) using machine learning models.

machine-learning-algorithms prediction preprocessing-data sklearn streamlit

Last synced: 07 Oct 2025

https://github.com/Jingvu/Anime-Database-Preprocessing-R-Project

During the data preprocessing step, I identified three tasks that I believe are crucial and require careful attention: data transformation, handling outliers, and managing missing values. This repository serves as a resource to share what I've learned on these topics for anyone interested.

anime-dataset preprocessing-data r rmarkdown

Last synced: 13 Oct 2025

https://github.com/sarahloree/project-2--bank-loan-marketing-model

This is the second project I completed as part of the Machine Learning Module from my post-graduate certification in AI/Machine Learning from the University of Texas' McCombs School of Business.

business-analytics data-engineering decision-tree-classifier decision-trees eda modelbuilding modelevaluation performance-analysis performance-metrics performancemonitoring preprocessing-data

Last synced: 17 Oct 2025

https://github.com/lummy-a/montgomery-county-crime-analysis

Analysis of crime patterns in Montgomery County (2018-2022) using Python data science tools to identify trends, spatial hotspots, and temporal distributions across crime types. Includes visualizations and insights to inform prevention strategies.

analysis crime-analysis crime-data geospatial-analysis jupyter-notebook preprocessing-data python statistical-analysis visualization

Last synced: 30 Apr 2025

https://github.com/thiwak/preprocess-50k-tiles-sri-lanka

Preprocessing scripts for 1:50K tiles issued by the survey department, Sri Lanka

arcpy automation gdal-python geospatial preprocessing-data

Last synced: 25 Oct 2025

https://github.com/himank-khatri/classification-builder

A web app that automates tedious data preprocessing and machine learning model testing.

exploratory-data-analysis machinelearning preprocessing-data python streamlit vizualization

Last synced: 02 Mar 2025

https://github.com/saadhaniftaj/ai-essayscore-automated-essay-scoring-using-lstm

AI-EssayScore is an automated essay scoring system using LSTM neural networks. It tokenizes and pads essays, processes them through an LSTM model, and predicts scores. The project includes data preprocessing, model training, evaluation, and saving the model for future use.

automated-machine-learning evaluation-metrics intro-to-ai lstm preprocessing-data

Last synced: 20 Mar 2025
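
The tokenize-and-pad step described in the entry above can be sketched in plain Python. This is an illustrative assumption, not that project's code: the helper names, the whitespace tokenizer, the reserved ids (0 for padding, 1 for unknown words), and padding at the end of the sequence are all choices made here for clarity.

```python
PAD_ID = 0  # reserved id used to fill short sequences
UNK_ID = 1  # reserved id for words not seen when building the vocabulary

def build_vocab(texts):
    """Assign an integer id to each distinct lowercase word, in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab

def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of ids: truncate long inputs, pad short ones."""
    ids = [vocab.get(word, UNK_ID) for word in text.lower().split()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

# Hypothetical essays used only to demonstrate the shapes involved.
essays = ["The essay is good", "good essay"]
vocab = build_vocab(essays)
short = encode("good essay", vocab, max_len=4)
long_ = encode("the essay is very good indeed", vocab, max_len=4)
```

Fixed-length integer sequences like these are what an embedding layer followed by an LSTM typically consumes; libraries differ on whether padding goes before or after the tokens, so the side chosen here should be checked against whatever framework is used.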

https://github.com/multiomics-analytics-group/acore

Functionality to preprocess and analyse multi-omics data

analysis omics omics-data-integration preprocessing-data

Last synced: 12 Apr 2025

https://github.com/tejaswirupa/early-prediction-of-diabetes-risk-using-machine-learning

Built a predictive model using CDC health data to identify individuals at risk of developing diabetes. Achieved 90.6% F1-score using Logistic Regression and revealed key health indicators like BMI and blood pressure as top predictors.

data-science datacleaning exploratory-data-analysis modelevaluation preprocessing-data python scikit-learn supervised-machine-learning

Last synced: 15 Jul 2025

https://github.com/hoangleminh17/ranks-prediction-for-lol

A method to predict rankings based on player performance in the game League of Legends

jupyter-notebook league-of-legends linear-regression predictive-modeling preprocessing-data python3 ridge-regression

Last synced: 16 Jul 2025

https://github.com/lucasdsbr/ai-data-preprocessing

Data preprocessing for Artificial Intelligence

data-science googlecolab preprocessing-data python

Last synced: 23 Feb 2025

https://github.com/bhavinpatel4199/machine-learning-framework

This repository showcases various projects that explore key concepts in both supervised and unsupervised learning, with a focus on real-world applications. The projects use a range of machine learning techniques, including data preprocessing, feature selection, exploratory data analysis (EDA), and model optimization.

classification clustering data-science data-structures data-visualization exploratory-data-analysis machine-learning machine-learning-algorithms machine-learning-models pandas-dataframe predictive-modeling preprocessing-data sklearn supervised-learning unsupervised-learning

Last synced: 07 Apr 2025

https://github.com/datafog/datafog

Python library to redact PII/business information before it enters semantic data pipelines (RAG, 'chat on your data')

ai embeddings llm ml mlops pii preprocessing preprocessing-data privacy privacy-protection privacy-tools rag semantic-analysis

Last synced: 25 Feb 2025

https://github.com/mohd-faizy/preprocess_ml

This repository hosts Python code that utilizes the Scikit-learn preprocessing API for data preprocessing. The code presents a comprehensive range of tools that handle missing data, scale data, encode categorical variables, and perform other functions.

data-science feature-engineering feature-engineering-algorithm feature-extraction feature-selection machine-learning outlier-detection preprocessing-data preprocessor scikit-learn

Last synced: 16 Sep 2025