Projects in Awesome Lists tagged with data-preprocessing-and-cleaning
A curated list of projects in awesome lists tagged with data-preprocessing-and-cleaning.
https://github.com/tracebloc/data-ingestors
tracebloc data pipeline for training/test dataset setup
data-ingestion data-pipeline data-preparation data-preprocessing-and-cleaning data-validation tracebloc
Last synced: 10 Jun 2026
https://github.com/usk2003/weather-data-analysis-in-szeged
This repository contains an in-depth analysis of historical weather data from Szeged, Hungary. The project uses Python to clean and process data, generate insightful visualizations, and identify patterns and correlations in weather parameters such as temperature, humidity, and precipitation.
analysis data-preprocessing-and-cleaning jupyter-notebook prediction python szeged weather-analysis
Last synced: 15 Apr 2026
https://github.com/e-panourgia/data-science-projects
Data Science Projects
annotations augmentation data data-preprocessing-and-cleaning hyperparameter-tuning llm logistic-regression nlp random-forest-classifier xboost-classifier
Last synced: 09 Apr 2025
https://github.com/atharvkadammm/calmlytic
An end-to-end machine learning project that predicts anxiety severity using classification models (Naive Bayes, Decision Tree, SVM, Logistic Regression, XGBoost), based on lifestyle, health, and behavioral features.
anxiety-prediction classification csv data-analysis data-preprocessing-and-cleaning data-science data-visualization ensemble-learning logistic-regression machine-learning-algorithms matplotlib mental-health numpy pandas python sci-kit-learn seaborn supervised-learning svm xgboost
Last synced: 21 Jun 2025
https://github.com/jdonepud/nlp-sentimentclassification
data-preprocessing-and-augmentation data-preprocessing-and-cleaning deep-learning gpt2 natural-language-processing-nlp pytorch-implementation scikit-learn-python sentiment-analysis text-classification-python transformers-models wordcloud-visualization
Last synced: 13 Oct 2025
https://github.com/madhurimarawat/big-data-analytics
This repository demonstrates big data processing, visualization, and machine learning using tools such as Hadoop, Spark, Kafka, and Python.
apache-kafka apache-spark big-data big-data-analytics big-data-analytics-techniques data-preprocessing-and-cleaning data-stratification data-visualization hadoop-hdfs hadoop-hive hadoop-installation hadoop-mapreduce hiveql python spark-graphx spark-mllib spark-mllib-library spark-rdd spark-streaming
Last synced: 05 Apr 2026
https://github.com/amanovishnu/anamoly-detection-using-decision-classifier
The KDD 99 anomaly detection application is a Flask web app that predicts anomalies in the KDD 99 dataset using a decision tree classifier. It lets users input features for prediction and offers a user-friendly interface with real-time, low-latency predictions.
anamoly-detection data-preprocessing-and-cleaning decision-tree-classifier flask-application kdd-99-dataset machine-learning machine-learning-algorithms
Last synced: 08 Sep 2025
https://github.com/nashish109/smart-ecommerce-fraud-detection
AI-powered system to detect fraudulent transactions in e-commerce using machine learning. Includes data preprocessing, feature engineering, and classification models like Random Forest and XGBoost. Achieved high accuracy with interpretable results for real-time detection.
classification-report-analysis data-preprocessing-and-cleaning ensemble-learning-with-xgboost feature-engineering graph-attention-networks imbalanced-data-handling machine-learning-models model-evaluation-and-metrics smote-sampling-technique
Last synced: 17 May 2026
https://github.com/imnotamr/ai
A collection of machine learning and AI projects implemented in Jupyter notebooks, covering regression, classification, and neural networks.
ai classification colab-notebook data-analysis data-preprocessing data-preprocessing-and-cleaning data-visualization deep-learning deep-neural-networks jupyter-notebook machine-learning model-evaluation predictive-modeling project-based-learning python supervised-learning supervised-learning-algorithms supervised-learning-classifiers unsupervised-learning unsupervised-learning-algorithms
Last synced: 17 May 2026