An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-science-projects

A curated list of projects in awesome lists tagged with data-science-projects.

https://github.com/imsanjoykb/data-science-regular-bootcamp

Regular practice in Data Science, Machine Learning, and Deep Learning, solving ML project problems and analytical issues to regularly build up my knowledge. The goal is to help learners with learning resources in the Data Science field.

artificial-intelligence data-analysis data-science data-science-notebook data-science-projects data-visualization database-connection deep-learning etl-pipeline etl-process feature-engineering machine-learning mysql-database neural-network numpy pandas postgresql python python-automation sqlite

Last synced: 30 Oct 2025

https://github.com/yusufcinarci/data-science-projects

This repo contains beginner-to-upper-level projects in the field of data science. I will add the projects I have done in this field to this repo every day, in the hope that they will be useful to those who, like me, are interested in data science and are just starting...

data-analysis data-science data-science-projects jupyter jupyter-notebook python

Last synced: 14 Mar 2026

https://github.com/aw-junaid/computer-science

Explore a collection of resources and projects in Computer Science, covering algorithms, data structures, programming languages, and emerging technologies. Ideal for learners and enthusiasts looking to enhance their knowledge and skills in the field.

algorithms assembly-language automata computer-architecture computer-networks computer-science computer-vision cpp cybersecurity data-science data-science-projects data-structures database game-development machine-learning networking operating-system python

Last synced: 26 Mar 2025

https://github.com/amey-thakur/python-crash-course

IIT ROPAR - Diginique Techlabs --> Data Science Machine Learning and AI using Python

ai amey ameythakur data-science data-science-projects house-price-prediction machine-learning python python-crash-course

Last synced: 07 Oct 2025

https://github.com/ammarlodhi255/student_performance_indicator_end-to-end_implementation

An end-to-end machine learning project: a student performance indicator. The goal of this project is to understand the influence of the parents' background, test preparation, and various other variables on the students' performance.

aws cd-pipeline data-analysis data-science data-science-projects eda end-to-end-machine-learning machine-learning machine-learning-projects regression regression-analysis

Last synced: 27 Sep 2025

https://github.com/santiagxf/mlproject-sample

Sample repository about how to structure an ML project using software engineering practices

data-science data-science-projects git machine-learning mlops

Last synced: 24 Apr 2025

https://github.com/tushar2704/superstore-sales-dashboard-with-streamlit

Superstore Sales with Streamlit is a data visualization and analysis project that uses the Streamlit framework to create an interactive web application for exploring and analyzing sales data from a superstore. This project aims to provide an easy-to-use interface for users to gain insights into sales trends, sales performance, product performance,

analytics dashboard data-analytics data-science data-science-projects python streamlit streamlit-tushar2704 trend-analysis tushar2704

Last synced: 07 May 2025

https://github.com/theakashshukla/r-project

🎓 A collection of programming assignments for the R language

algorithms data-analysis data-science data-science-projects ml r

Last synced: 24 Jul 2025

https://github.com/md-emon-hasan/ml-project-laptop-price-prediction

💻 The Laptop Price Prediction project predicts laptop prices from user-input specifications using a pre-trained machine learning model.

ai bootstrap data-science-projects flask laptop-price-prediction machine-learning-projects price-prediction

Last synced: 21 Sep 2025

https://github.com/iguptashubham/metro-operations-optimization

Metro Operations Optimization refers to the systematic process of enhancing the efficiency, reliability, and effectiveness of Metro services through various data-driven techniques and operational adjustments.

data-analysis-project data-science data-science-projects

Last synced: 11 Jun 2026

https://github.com/d-coder111/practolearn

PractoLearn serves as a repository for acquiring fundamental knowledge through hands-on projects suitable for beginners and individuals of all skill levels. We encourage you to participate and warmly welcome your contributions.

beginner-friendly c c-programming competitive-programming contributions-welcome cpp data-science-projects data-structures first-contributions hacktoberfest hacktoberfest-accepted improvement-proposal java projects-list python web-development

Last synced: 20 Aug 2025

https://github.com/wahidpanda/sql.ai-text-to-sql-code

SQL.AI is an interactive web application built with Streamlit and powered by Google's GenerativeAI. It allows users to interactively retrieve SQL data through natural language queries and predefined SQL commands. The application connects to a SQLite database, executes SQL queries, and provides visualizations of the retrieved data.

artificial-intelligence data-science data-science-projects data-scienced data-visualization gemini-pro generative-ai machine-learning text-to-sql

Last synced: 14 Apr 2025

https://github.com/joshuathadi/data-science

Assignments and notes from the IBM Data Science Professional Certificate. Extracting insights from large datasets to support strategic decision-making.

coursera-assignment data-science-notes data-science-projects ibm ibm-data-science-professional-certificate ibm-data-science-projects

Last synced: 17 Apr 2026

https://github.com/sharmas1ddharth/mushroom_classification

Machine Learning Model to classify whether a Mushroom is Edible or Poisonous by its features

data-science data-science-projects machienlearning mushroom-classification projects python

Last synced: 29 Jul 2025

https://github.com/njlyon0/collab_bilingualism

R / Python bilingualism website/tutorial project

data-science data-science-projects python r teaching-tool tutorials

Last synced: 20 Mar 2025

https://github.com/ahammadmejbah/supplemental-materials

Explore AI with resources like 📚 books, 🎧 podcasts, and 🖥️ online courses! Join forums 💬, attend workshops 🛠️, and read journals 📖 to boost your knowledge! 🚀✨🧠

computer-vision data-science data-science-projects data-visualization deep-learning deep-neural-networks machine machine-learning opencv python3

Last synced: 17 May 2026

https://github.com/faizanzaheergit/ddos-detection-ml

Applies various supervised machine learning techniques to the highly optimized NSL-KDD dataset to create an efficient and accurate predictor of possible intrusions on a network.

artificial-intelligence csv-files data-science data-science-projects dataset ddos-detection machine-learning matplotlib matplotlib-pyplot ml-algorithms python python-machine-learning sklearn

Last synced: 20 May 2026

https://github.com/virajbhutada/music-recommendation-system

This project is designed to provide personalized music recommendations for relaxation and meditation. Leveraging ML and data analysis, the system suggests tracks based on user preferences such as tempo, energy, and genre. Join us in enhancing music discovery through advanced algorithms and community-driven contributions.

data-analysis data-science-projects data-visualization eda html machine-learning ml-algortihms model-deployment model-evaluation music-recommendation-system nlp pivot-table principal-component-analysis python python-library similarity-matrix spotify-data streamlit-web user-experience

Last synced: 24 Jan 2026

https://github.com/sharmas1ddharth/iris-classification

A machine learning model that classifies the species of an Iris flower as Iris-Setosa, Iris-Versicolour, or Iris-Virginica

data-science data-science-projects iris-classification iris-dataset machine-learning machine-learning-projects project python

Last synced: 09 Jun 2026

https://github.com/negativenagesh/unfake

UnFake is the first platform to integrate a deepfake detection tool directly into the image-downloading process.

computer-vision data-science data-science-projects deep-fake deep-fake-detection deep-learning deep-learning-projects deepfake-detection efficientnetb7 image-processing python streamlit-webapp

Last synced: 27 Feb 2026

https://github.com/iguptashubham/customer-intent-prediction

Logistic regression is a binary classification technique used to predict outcomes like customer churn or purchase intent. It models the probability of an event happening (e.g., a customer making a purchase) based on input features.

data-science data-science-projects logistic-regression machine-learning machine-learning-projects machinelearning project sklearn

Last synced: 14 May 2026
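
The customer-intent-prediction description above says logistic regression models the probability of an event from input features. As a minimal sketch of that idea, and not the repository's own code, the example below fits a tiny logistic regression with plain stochastic gradient descent. The feature names, the toy data, the learning rate, and the epoch count are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(features, weights, bias):
    """Probability of the positive class (e.g. 'customer purchases')."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(rows, labels, lr=0.5, epochs=1000):
    """Fit weights and bias by per-sample gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            # Gradient of the log loss w.r.t. z is (predicted - actual).
            error = predict_proba(x, weights, bias) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical, pre-scaled features: [browsing_activity, cart_activity].
rows = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.1], [0.7, 0.8], [0.8, 0.9], [0.9, 1.0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = made a purchase, 0 = did not

weights, bias = train(rows, labels)
p_low = predict_proba([0.2, 0.1], weights, bias)   # low-activity visitor
p_high = predict_proba([0.8, 0.9], weights, bias)  # high-activity visitor
print(f"low-activity purchase probability:  {p_low:.3f}")
print(f"high-activity purchase probability: {p_high:.3f}")
```

A probability above 0.5 is usually read as a predicted purchase. A real project would more likely use a library implementation such as scikit-learn, which the entry's tags mention.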

https://github.com/nafisalawalidris/elfeenah

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and blockchain converge.

artificial-intelligence bitcoin blockchain config data data-science-portfolio data-science-projects datascience datascientist deep-learning github-config machinelearning

Last synced: 11 Sep 2025

https://github.com/amirzenoozi/aparat-videos-dataset

Some Simple Information About Aparat Videos for Data Scientists

aparat cli crawler data-science data-science-projects pandas python python3 sdk-python sqlite3 video

Last synced: 17 May 2026

https://github.com/virajbhutada/global-universities-success-analysis-powerbi-sql-excel

This capstone project conducts in-depth analysis using Power BI, SQL, and Excel to explore complex dynamics shaping global university success. Integrating data from diverse ranking systems and criteria, our aim is to unravel the factors influencing universities worldwide.

capstone capstoneproject data-analysis data-analytics data-insights data-science data-science-projects data-visualization excel exploratory-data-analysis mece mysql powerbi powerpoint sql

Last synced: 20 Jun 2025

https://github.com/sharmas1ddharth/disease_prediction

Predict underlying disease by providing symptoms using Machine Learning Algorithms

data-science data-science-projects datascience flask machine-learning machinelearning python

Last synced: 15 Apr 2026

https://github.com/wambugu71/auto_eda_dsail

Automating the process of EDA (Exploratory Data Analysis) with generative AI and open-source Python tools.

automation data-analytics data-science data-science-projects dataanalysis deeplearning eda explanatory-data-analysis machine-learning

Last synced: 18 Jun 2026

https://github.com/neerajcodes888/diwali-sales-analysis

An open-source repository for sales data analysis. Dive into insightful trends, metrics, and visualizations to empower data-driven decision-making. Ideal for data analysts, business professionals, and enthusiasts seeking comprehensive sales insights. Clone, customize, and contribute to enhance your sales analytics journey.

data-science-projects data-visualization numpy pandas-dataframe python3 sales-analysis seaborn-plots

Last synced: 26 Mar 2025

https://github.com/mscbuild/analysis

🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡

anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton

Last synced: 19 Oct 2025

https://github.com/saravana-kr22/phonepe_pulse_data_visualization_and_exploration

The PhonePe Pulse Data Visualization project in Python extracts, transforms, and stores data from the PhonePe Pulse GitHub repository. It creates an interactive dashboard using Streamlit, Plotly, and other libraries to visualize the data. Users can explore various insights from the data spanning 2018 to 2023.

choropleth-map data-science data-science-projects data-visualization geojson geopandas indiamapdata mapbox mysql phonepe phonepe-clone phonepe-pulse-data-visualization plotly pydeck python python3 streamlit streamlit-webapp visualization

Last synced: 25 Feb 2026

https://github.com/suryaaxc/Movie-Matcher-Flex

High-performance Movie Recommendation System scaling to 32M+ records. Built with Python, TF-IDF Vectorization, and Neon UI. Optimized for real-time cinematic discovery.

big-data cosine-similarity data-science-projects git-lfs machine-learning python recommendation-system streamlit tf-idf

Last synced: 24 Jun 2026

https://github.com/teja-1403/movie-recommendation-system-using-python

The main goal of this machine learning project is to build a recommendation engine that recommends movies to users. This project is designed to help understand how a recommendation system works. We have developed user-based, item-based, and model-based collaborative filters. This project helped me gain experience implementing Python, Data Science and ML...

data-science-projects machine-learning python

Last synced: 27 Apr 2026

https://github.com/lixx21/coursera-data-scarping

End-to-end project for scraping courses from Coursera

data-science-projects data-scraping scrapy streamlit

Last synced: 01 May 2026

https://github.com/scarblase/homeless-animals-analysis

A data-driven exploration of homeless animal statistics 🐶🐱. Analyze age distribution, shelter dynamics, and adoption patterns using Python, Pandas, and Seaborn.

animals data-analysis data-mining data-science data-science-projects data-visualization matplotlib matplotlib-pyplot numpy pandas plotly python python3 ukraine

Last synced: 06 May 2026

https://github.com/iguptashubham/ott-churn-eda-ml

Understanding why customers discontinue their subscriptions is crucial to optimizing the user experience, reducing churn, and maximizing customer lifetime value. This project uses a machine learning model to predict customer churn.

data-analysis data-analysis-project data-science data-science-portfolio data-science-projects data-visualization machine-learning python

Last synced: 08 May 2026

https://github.com/md-emon-hasan/4-eda-football-ml-app

An ML application focused on exploratory data analysis and football analytics, featuring data visualization and insights using Python and relevant libraries.

data-science-projects data-visualization eda exploratory-data-analysis football-analytics sports-analytics webapp

Last synced: 08 May 2026

https://github.com/zahramh99/anomaly-detection-in-transactions

Anomaly detection in transactions means identifying unusual or unexpected patterns within transactions or related activities. These patterns, known as anomalies or outliers, deviate significantly from the expected norm and could indicate irregular or fraudulent behaviour.

anomaly-detection data-science data-science-portfolio data-science-projects financial-data fraud-detection isolation-forest machine-learning outlier-detection transactions unsupervised-learning

Last synced: 12 Jun 2025
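
The anomaly-detection description above defines anomalies as values that deviate significantly from the expected norm. The sketch below illustrates that definition with a modified z-score based on the median absolute deviation. This is a deliberately simpler technique than the isolation forest named in the entry's tags, and it is not the repository's method. The transaction amounts and the 3.5 threshold are hypothetical.

```python
import statistics


def flag_anomalies(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    median = statistics.median(amounts)
    mad = statistics.median([abs(a - median) for a in amounts])
    if mad == 0:
        # All values are (nearly) identical; nothing can be scored.
        return []
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - median) / mad > threshold
    ]


# Hypothetical transaction amounts with one obviously unusual value.
amounts = [25.0, 30.5, 27.0, 31.0, 29.5, 26.0, 980.0, 28.0]
anomalies = flag_anomalies(amounts)
print("anomalous transaction indices:", anomalies)
```

A flagged index only marks a transaction as unusual; whether it is actually irregular or fraudulent still needs a separate check.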

https://github.com/aliahmad552/youtube-transcript-rag

This is a YouTube Q&A Chatbot powered by a Large Language Model (LLM) and FastAPI. Users can enter a YouTube video URL and ask questions — the system generates accurate answers using the video transcript.

data-science-projects deep-learning fastapi genai generative generative-ai langchain langchain-huggingface machine-learning projects rag retrieval-augmented-generation

Last synced: 09 Apr 2026

https://github.com/soupu07/retail_analysis_with_walmart_data

The aim of this project is to perform retail analysis with Walmart data.

data-science data-science-projects python regression walmart walmart-sales-forecasting

Last synced: 02 May 2026

https://github.com/piyushkumar2025/analytical-sql-project-exploring-trends-segmentation-kpis

A complete SQL analytics project using a simulated data warehouse. It analyzes sales, customer, and product data with CTEs, joins, window functions, subqueries, and views to deliver insights on trends, segmentation, and KPIs, showing how SQL enables data-driven decisions without BI tools.

advanced-sql analytics business-intelligence data data-science-projects datascience joins kpi mysql query sql window-functions-in-sql

Last synced: 02 Jul 2025

https://github.com/vzamboulingame/data-portfolio

This repository showcases my projects in Python and SQL, highlighting my skills in data analysis & visualization.

data-analysis data-portfolio data-science data-science-portfolio data-science-projects data-visualization jupyter-notebook portfolio python sql

Last synced: 20 May 2026

https://github.com/oklein1/association-miner-api

Explore the Association Miner API, a lightweight and versatile machine learning tool designed for effortless integration and predictive insights. Uncover valuable patterns in diverse datasets, from supermarket transactions to online user behavior, with the power of association rule learning at your fingertips.

association-rule-mining association-rules clojure data-mining data-science-projects

Last synced: 07 Oct 2025

https://github.com/pragati928/cancer-severity-prediction-ml

📊 End-to-end data science project predicting cancer severity using Python, EDA, and Random Forests — focusing on lifestyle and genetic factors.

data-analysis-python data-science-projects eda machine-learning pandas-python random-forest scikit-learn visualizations

Last synced: 08 Oct 2025

https://github.com/dbolotov/ds-quickref

Quick reference guide for data science and machine learning concepts, workflows, and tools.

data-science data-science-projects python quarto

Last synced: 20 May 2026

https://github.com/deaneeth/churn-prediction-model-training

Step-by-step guide to building machine learning models for customer churn prediction, continuing from the data preprocessing phase. The repo covers training, evaluation, and saving of models, with weekly updates.

churn-prediction data-science-projects jupyter-notebook machine-learning model-evaluation model-training model-training-and-evaluation python scikit-learn

Last synced: 11 May 2026

https://github.com/mehboob14/nextdayrainprediction

As part of a course project in Data Science, I worked on the Nextday Rain Predictions dataset sourced from Kaggle. The project involved end-to-end processes, including data preprocessing, handling class imbalance, and model evaluation. I explored various models such as Logistic Regression and Random Forest, applied appropriate preprocessing techniq

data-science data-science-projects jupyter-notebooks logistic-regression preprocessing python random-forest

Last synced: 19 May 2026

https://github.com/soupu07/ibm-employee-attrition-prediction

This project analyzes the factors driving IBM employee attrition and predicts which employees are likely to leave, helping the organization understand turnover causes and improve retention and performance.

data-science data-science-projects data-visualization machine-learning python python-data-analysis python-programming-language python-project

Last synced: 05 May 2026

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 05 May 2026

https://github.com/urbanekda/upwork_dashboard

A data analysis project examining trends and patterns in the data science job market on Upwork. This project analyzes job postings, requirements, and market demands to provide insights into the freelance data science ecosystem.

data-analysis data-science data-science-projects data-visualization freelance jupyter-notebook python streamlit

Last synced: 07 May 2026

https://github.com/mwasifanwar/automl_framework

Comprehensive AutoML framework that automates data preprocessing, feature engineering, model selection, hyperparameter tuning, and deployment. Features neural architecture search and automated data cleaning pipelines.

automl automl-algorithms data-science data-science-projects feature-engineering feature-engineering-algorithm feature-engineering-ml hyperparameter-optimization machine-learning machine-learning-algorithms machine-learning-models mlops mlops-workflow python scikit-learn scikit-learn-python

Last synced: 07 May 2026

https://github.com/avijay24/predictingonlinenewspopularity

Using machine learning algorithms, created a model that predicts the popularity of an online news article on Mashable with 79% accuracy and forecasts a profitability of $1,182,400 using the model's cost matrix.

cost-matrix data-science-projects machine-learning mashable news-popularity-prediction predictive-analytics python

Last synced: 22 Jun 2026

https://github.com/tanusssss/ai_fraudguard_chatbot

FraudGuard AI Chatbot: Gen-AI powered fraud detection assistant using KNN + Hugging Face LLM. Upload CSVs, detect risky transactions, and ask natural questions. Built with Streamlit, Flan-T5, and custom rule-based features.

chatbot data-science-projects fraud-detection huggingface knn llm machine-learning portfolio-project streamlit

Last synced: 28 Apr 2026

https://github.com/incalculable-driverslicence975/data-projects-portfolio

📊 Showcase data projects that highlight analytics, machine learning, and MLOps with reproducible code and clear business insights.

ai computer-vision dashboard data-science-projects data-visualization deep-learning etl excel finance hadoop hiveq keras machine-learning nlp pandas portfolio-project scikit-learn tableau-dashboards

Last synced: 28 Apr 2026

https://github.com/miguelmedinacastro/trabalho-dados-r

Final project for the Exploratory Data Analysis course

data data-science data-science-projects data-visualization database r rstudio

Last synced: 01 May 2026

https://github.com/virajbhutada/diamond-price-estimator

This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.

cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface

Last synced: 14 Apr 2026