An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-science-projects

A curated list of projects in awesome lists tagged with data-science-projects.

https://github.com/imsanjoykb/data-science-regular-bootcamp

Regular practice in Data Science, Machine Learning, and Deep Learning, solving ML project problems and analytical issues to regularly boost my knowledge. The goal is to help learners with learning resources in the Data Science field.

artificial-intelligence data-analysis data-science data-science-notebook data-science-projects data-visualization database-connection deep-learning etl-pipeline etl-process feature-engineering machine-learning mysql-database neural-network numpy pandas postgresql python python-automation sqlite

Last synced: 30 Oct 2025

https://github.com/yusufcinarci/data-science-projects

This repo contains beginner-to-upper-level projects in the field of data science. I will add the projects I complete in this field to this repo every day, in the hope that they will be useful to those who, like me, are interested in the field of data science and will just start...

data-analysis data-science data-science-projects jupyter jupyter-notebook python

Last synced: 25 Oct 2025

https://github.com/aw-junaid/computer-science

Explore a collection of resources and projects in Computer Science, covering algorithms, data structures, programming languages, and emerging technologies. Ideal for learners and enthusiasts looking to enhance their knowledge and skills in the field.

algorithms assembly-language automata computer-architecture computer-networks computer-science computer-vision cpp cybersecurity data-science data-science-projects data-structures database game-development machine-learning networking operating-system python

Last synced: 26 Mar 2025

https://github.com/amey-thakur/python-crash-course

IIT ROPAR - Diginique Techlabs: Data Science, Machine Learning, and AI using Python

ai amey ameythakur data-science data-science-projects house-price-prediction machine-learning python python-crash-course

Last synced: 07 Oct 2025

https://github.com/ammarlodhi255/student_performance_indicator_end-to-end_implementation

An end-to-end machine learning project: a student performance indicator. The goal of this project is to understand the influence of the parents' background, test preparation, and various other variables on the students' performance.

aws cd-pipeline data-analysis data-science data-science-projects eda end-to-end-machine-learning machine-learning machine-learning-projects regression regression-analysis

Last synced: 27 Sep 2025

https://github.com/santiagxf/mlproject-sample

Sample repository about how to structure an ML project using software engineering practices

data-science data-science-projects git machine-learning mlops

Last synced: 24 Apr 2025

https://github.com/tushar2704/superstore-sales-dashboard-with-streamlit

Superstore Sales with Streamlit is a data visualization and analysis project that uses the Streamlit framework to create an interactive web application for exploring and analyzing sales data from a superstore. This project aims to provide an easy-to-use interface for users to gain insights into sales trends, sales performance, product performance,

analytics dashboard data-analytics data-science data-science-projects python streamlit streamlit-tushar2704 trend-analysis tushar2704

Last synced: 07 May 2025

https://github.com/theakashshukla/r-project

🎓 A collection of programming assignments for the R language

algorithms data-analysis data-science data-science-projects ml r

Last synced: 24 Jul 2025

https://github.com/md-emon-hasan/ml-project-laptop-price-prediction

💻 The Laptop Price Prediction project predicts laptop prices based on user-input specifications using a pre-trained machine learning model.

ai bootstrap data-science-projects flask laptop-price-prediction machine-learning-projects price-prediction

Last synced: 21 Sep 2025

https://github.com/d-coder111/practolearn

PractoLearn serves as a repository for acquiring fundamental knowledge through hands-on projects suitable for beginners and individuals of all skill levels. We encourage you to participate and warmly welcome your contributions.

beginner-friendly c c-programming competitive-programming contributions-welcome cpp data-science-projects data-structures first-contributions hacktoberfest hacktoberfest-accepted improvement-proposal java projects-list python web-development

Last synced: 20 Aug 2025

https://github.com/joshuathadi/data-science

Assignments and notes from the IBM Data Science Professional Certificate, with exercises and key concepts related to data science. Covers extracting insights from large datasets to support strategic decision-making.

coursera-assignment data-science-notes data-science-projects

Last synced: 19 Jun 2025

https://github.com/wahidpanda/sql.ai-text-to-sql-code

SQL.AI is an interactive web application built with Streamlit and powered by Google's GenerativeAI. It allows users to interactively retrieve SQL data through natural language queries and predefined SQL commands. The application connects to a SQLite database, executes SQL queries, and provides visualizations of the retrieved data.

artificial-intelligence data-science data-science-projects data-scienced data-visualization gemini-pro generative-ai machine-learning text-to-sql

Last synced: 14 Apr 2025

https://github.com/sharmas1ddharth/mushroom_classification

Machine Learning Model to classify whether a Mushroom is Edible or Poisonous by its features

data-science data-science-projects machienlearning mushroom-classification projects python

Last synced: 29 Jul 2025

https://github.com/njlyon0/collab_bilingualism

R / Python bilingualism website/tutorial project

data-science data-science-projects python r teaching-tool tutorials

Last synced: 20 Mar 2025

https://github.com/iguptashubham/metro-operations-optimization

Metro Operations Optimization refers to the systematic process of enhancing the efficiency, reliability, and effectiveness of Metro services through various data-driven techniques and operational adjustments.

data-analysis-project data-science data-science-projects

Last synced: 01 Dec 2025

https://github.com/faizanzaheergit/ddos-detection-ml

Applies various supervised machine learning techniques to the highly optimized NSL-KDD dataset to create an efficient and accurate predictor of possible intrusions on a network.

artificial-intelligence csv-files data-science data-science-projects dataset ddos-detection machine-learning matplotlib matplotlib-pyplot ml-algorithms python python-machine-learning sklearn

Last synced: 05 Jul 2025

https://github.com/ahammadmejbah/supplemental-materials

Explore AI with resources like 📚 books, 🎧 podcasts, and 🖥️ online courses! Join forums 💬, attend workshops 🛠️, and read journals 📖 to boost your knowledge! 🚀✨🧠

computer-vision data-science data-science-projects data-visualization deep-learning deep-neural-networks machine machine-learning opencv python3

Last synced: 26 Feb 2025

https://github.com/virajbhutada/music-recommendation-system

This project is designed to provide personalized music recommendations for relaxation and meditation. Leveraging ML and data analysis, the system suggests tracks based on user preferences such as tempo, energy, and genre. Join us in enhancing music discovery through advanced algorithms and community-driven contributions.

data-analysis data-science-projects data-visualization eda html machine-learning ml-algortihms model-deployment model-evaluation music-recommendation-system nlp pivot-table principal-component-analysis python python-library similarity-matrix spotify-data streamlit-web user-experience

Last synced: 27 Feb 2025

https://github.com/sharmas1ddharth/iris-classification

A Machine Learning Model that can classify the species of the Iris flower as Iris-Setosa, Iris-Versicolour, or Iris-Virginica

data-science data-science-projects iris-classification iris-dataset machine-learning machine-learning-projects project python

Last synced: 28 Feb 2025

https://github.com/iguptashubham/customer-intent-prediction

Logistic regression is a binary classification technique used to predict outcomes like customer churn or purchase intent. It models the probability of an event happening (e.g., a customer making a purchase) based on input features.

data-science data-science-projects logistic-regression machine-learning machine-learning-projects machinelearning project sklearn

Last synced: 09 Aug 2025

https://github.com/scarblase/homeless-animals-analysis

A data-driven exploration of homeless animal statistics 🐶🐱. Analyze age distribution, shelter dynamics, and adoption patterns using Python, Pandas, and Seaborn.

animals data-analysis data-mining data-science data-science-projects data-visualization matplotlib matplotlib-pyplot numpy pandas plotly python python3 ukraine

Last synced: 24 Jul 2025

https://github.com/nafisalawalidris/elfeenah

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and blockchain converge.

artificial-intelligence bitcoin blockchain config data data-science-portfolio data-science-projects datascience datascientist deep-learning github-config machinelearning

Last synced: 11 Sep 2025

https://github.com/amirzenoozi/aparat-videos-dataset

Some Simple Information About Aparat Videos for Data Scientists

aparat cli crawler data-science data-science-projects pandas python python3 sdk-python sqlite3 video

Last synced: 14 Mar 2025

https://github.com/virajbhutada/global-universities-success-analysis-powerbi-sql-excel

This capstone project conducts in-depth analysis using Power BI, SQL, and Excel to explore complex dynamics shaping global university success. Integrating data from diverse ranking systems and criteria, our aim is to unravel the factors influencing universities worldwide.

capstone capstoneproject data-analysis data-analytics data-insights data-science data-science-projects data-visualization excel exploratory-data-analysis mece mysql powerbi powerpoint sql

Last synced: 20 Jun 2025

https://github.com/saravana-kr22/phonepe_pulse_data_visualization_and_exploration

The PhonePe Pulse Data Visualization project in Python extracts, transforms, and stores data from the PhonePe Pulse GitHub repository. It creates an interactive dashboard using Streamlit, Plotly, and other libraries to visualize the data. Users can explore various insights from the data spanning 2018 to 2023.

choropleth-map data-science data-science-projects data-visualization geojson geopandas indiamapdata mapbox mysql phonepe phonepe-clone phonepe-pulse-data-visualization plotly pydeck python python3 streamlit streamlit-webapp visualization

Last synced: 29 Oct 2025

https://github.com/sharmas1ddharth/disease_prediction

Predict underlying disease by providing symptoms using Machine Learning Algorithms

data-science data-science-projects datascience flask machine-learning machinelearning python

Last synced: 28 Feb 2025

https://github.com/mscbuild/analysis

🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡

anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton

Last synced: 19 Oct 2025

https://github.com/iguptashubham/ott-churn-eda-ml

Understanding why customers discontinue their subscriptions is crucial for optimizing the user experience, reducing churn, and maximizing customer lifetime value. This project uses a machine learning model to predict customer churn.

data-analysis data-analysis-project data-science data-science-portfolio data-science-projects data-visualization machine-learning python

Last synced: 11 Jun 2025

https://github.com/lixx21/coursera-data-scarping

End-to-end project for scraping courses on Coursera

data-science-projects data-scraping scrapy streamlit

Last synced: 03 Apr 2025

https://github.com/md-emon-hasan/4-eda-football-ml-app

An ML application focused on exploratory data analysis and football analytics, featuring data visualization and insights using Python and relevant libraries.

data-science-projects data-visualization eda exploratory-data-analysis football-analytics sports-analytics webapp

Last synced: 02 Mar 2025

https://github.com/wambugu71/auto_eda_dsail

Automating the process of EDA (Exploratory Data Analysis) with generative AI and open-source Python tools.

automation data-analytics data-science data-science-projects dataanalysis deeplearning eda explanatory-data-analysis machine-learning

Last synced: 16 Mar 2025

https://github.com/teja-1403/movie-recommendation-system-using-python

The main goal of this machine learning project is to build a recommendation engine that recommends movies to users. This project is designed to help understand how a recommendation system works. We have developed user-based, item-based, and model-based collaborative filters. This project helped me gain experience of implementing Python, Data Science and ML...

data-science-projects machine-learning python

Last synced: 02 Apr 2025

https://github.com/neerajcodes888/diwali-sales-analysis

An open-source repository for sales data analysis. Dive into insightful trends, metrics, and visualizations to empower data-driven decision-making. Ideal for data analysts, business professionals, and enthusiasts seeking comprehensive sales insights. Clone, customize, and contribute to enhance your sales analytics journey.

data-science-projects data-visualization numpy pandas-dataframe python3 sales-analysis seaborn-plots

Last synced: 26 Mar 2025

https://github.com/soupu07/retail_analysis_with_walmart_data

The aim of this project is to do the retail analysis with Walmart data.

data-science data-science-projects python regression walmart walmart-sales-forecasting

Last synced: 06 Mar 2025

https://github.com/soupu07/ibm-employee-attrition-prediction

This project analyzes the factors driving IBM employee attrition and predicts which employees are likely to leave, helping the organization understand turnover causes and improve retention and performance.

data-science data-science-projects data-visualization machine-learning python python-data-analysis python-programming-language python-project

Last synced: 06 Mar 2025

https://github.com/pragati928/cancer-severity-prediction-ml

📊 End-to-end data science project predicting cancer severity using Python, EDA, and Random Forests — focusing on lifestyle and genetic factors.

data-analysis-python data-science-projects eda machine-learning pandas-python random-forest scikit-learn visualizations

Last synced: 08 Oct 2025

https://github.com/ahammadmejbah/petrol-price-forecasting-with-lstm-arima-and-automl

Create a price forecast using the most effective model available according to your preferences. Check for any extreme values or values that are missing.

arima arima-forecasting arima-model autokeras automl data data-science data-science-projects forcasting project python

Last synced: 26 Feb 2025

https://github.com/urbanekda/upwork_dashboard

A data analysis project examining trends and patterns in the data science job market on Upwork. This project analyzes job postings, requirements, and market demands to provide insights into the freelance data science ecosystem.

data-analysis data-science data-science-projects data-visualization freelance jupyter-notebook python streamlit

Last synced: 10 Sep 2025

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 08 Oct 2025

https://github.com/beolawork-art/novabank-churn-analysis

NovaBank has noticed that customers are closing accounts or going inactive, and they want to understand why.

data-analysis data-science-projects data-visualization eda machine-learning numpy pandas python scikit-learn sql

Last synced: 18 Nov 2025

https://github.com/masum184e/exploratory_data_analysis_projects

This is a space to showcase my journey in exploring various datasets, uncovering patterns, and extracting meaningful insights. Each project highlights different aspects of EDA, demonstrating techniques and tools that are essential for making sense of data.

data-analysis data-analysis-projects data-science data-science-projects eda eda-projects exploratory-data-analysis exploratory-data-analysis-projects

Last synced: 31 Mar 2025

https://github.com/piyushkumar2025/analytical-sql-project-exploring-trends-segmentation-kpis

A complete SQL analytics project using a simulated data warehouse. It analyzes sales, customer, and product data with CTEs, joins, window functions, subqueries, and views to deliver insights on trends, segmentation, and KPIs, showing how SQL enables data-driven decisions without BI tools.

advanced-sql analytics business-intelligence data data-science-projects datascience joins kpi mysql query sql window-functions-in-sql

Last synced: 02 Jul 2025

https://github.com/aliahmad552/youtube-transcript-rag

This is a YouTube Q&A Chatbot powered by a Large Language Model (LLM) and FastAPI. Users can enter a YouTube video URL and ask questions — the system generates accurate answers using the video transcript.

data-science-projects deep-learning fastapi genai generative generative-ai langchain langchain-huggingface machine-learning projects rag retrieval-augmented-generation

Last synced: 02 Nov 2025

https://github.com/avijay24/predictingonlinenewspopularity

Using Machine Learning Algorithms, created a model for predicting the popularity of an online news article on Mashable with 79% accuracy along with forecasting a profitability of $1,182,400 using the cost matrix for model

cost-matrix data-science-projects machine-learning mashable news-popularity-prediction predictive-analytics python

Last synced: 24 Feb 2025

https://github.com/mehboob14/nextdayrainprediction

As part of a course project in Data Science, I worked on the Nextday Rain Predictions dataset sourced from Kaggle. The project involved end-to-end processes, including data preprocessing, handling class imbalance, and model evaluation. I explored various models such as Logistic Regression and Random Forest, applied appropriate preprocessing techniq

data-science data-science-projects jupyter-notebooks logistic-regression preprocessing python random-forest

Last synced: 15 Mar 2025

https://github.com/arbuz13/data-portfolio

📊 Showcase data projects in engineering, machine learning, and business intelligence, emphasizing technical processes and business impacts.

data-analysis data-science-projects data-visualization hadoop hiveq jupyter-notebook machine-learning matplotlib nlp pandas portfolio-project portfolio-site python react react-portfolio recommendation-system seaborn spark

Last synced: 30 Dec 2025

https://github.com/asghar-rizvi/health-risk-prediction-platform-with-flask-and-machine-learning

A Health Risk Prediction Platform using Flask and machine learning to predict heart attacks, kidney disease, liver disease, and diabetes. Features a user-friendly interface with HTML, CSS, and JavaScript, along with secure authentication, encrypted passwords, and session management. MySQL is used for database operations, achieving 98% model accurac

backend css data-science-machine-learning data-science-projects datascience doctor-machine-learning flask frontend html javascript machine-learning python real-world-ml-project webdevelopment website

Last synced: 21 Nov 2025

https://github.com/shubhamprajapati7748/bank-customer-churn-prediction

The Bank Customer Churn Prediction app uses deep learning to predict if a bank customer will churn (leave) based on demographic and account-related data. Powered by a deep learning ANN model with TensorFlow and built with Streamlit for the front-end, this app provides an interactive interface to predict customer churn in real-time.

customer-churn-prediction data-science data-science-projects deep-learning machine-learning

Last synced: 14 Mar 2025

https://github.com/manjit-baishya-datascience/pakistan-house-price-eda

This project analyzes house price trends and predicts prices using machine learning, emphasizing data cleaning, exploratory analysis, and regression modeling to gain insights into dataset patterns and structures.

data-science-projects ensemble-machine-learning machine-learning-algorithms regression visualization

Last synced: 02 Mar 2025

https://github.com/ahshasa/stock-data-visualizer

A Python-based stock market visualizer that fetches real-time data from Alpha Vantage API and displays interactive stock price trends in a GUI. Built with Python, Tkinter, Pandas, and Matplotlib.

data-science-projects data-visualization finance financial-analysis jupyter-notebook lstm lstm-neural-networks stock stock-market stock-prediction stock-price-prediction stocks streamlit yahoo-finance

Last synced: 13 Jun 2025

https://github.com/soupu07/image_classification

The project aimed to develop a machine learning model using TensorFlow and Keras to classify images of clothing items from the Fashion MNIST dataset.

data-science data-science-projects deep-learning keras-tensorflow python python-data-analysis python-programming-language tensorflow

Last synced: 08 Apr 2025

https://github.com/miguelmedinacastro/trabalho-dados-r

Final project for the Exploratory Data Analysis course

data data-science data-science-projects data-visualization database r rstudio

Last synced: 22 Mar 2025

https://github.com/magnus0969/heart-diease-eda

Exploratory Data Analysis (EDA) on heart disease data to uncover key risk factors and patterns. This project utilizes Python, Pandas, Seaborn, and Matplotlib to visualize trends, correlations, and insights that contribute to heart disease prediction and prevention.

data-insights data-science-projects data-visualization heart-disease-analysis python

Last synced: 04 Mar 2025

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/sanveed-adnan/supermarket-sales-sql-project

SQL-based data analysis project on supermarket sales performance using SQLite and Power BI.

business-intelligence data-analysis data-science data-science-projects data-visualization power-bi sales-data sql sqlite

Last synced: 08 Nov 2025

https://github.com/tanusssss/ai_fraudguard_chatbot

FraudGuard AI Chatbot: Gen-AI powered fraud detection assistant using KNN + Hugging Face LLM. Upload CSVs, detect risky transactions, and ask natural questions. Built with Streamlit, Flan-T5, and custom rule-based features.

chatbot data-science-projects fraud-detection huggingface knn llm machine-learning portfolio-project streamlit

Last synced: 04 Jul 2025

https://github.com/zahramh99/anomaly-detection-in-transactions

Anomaly detection in transactions means identifying unusual or unexpected patterns within transactions or related activities. These patterns, known as anomalies or outliers, deviate significantly from the expected norm and could indicate irregular or fraudulent behaviour.

anomaly-detection data-science data-science-portfolio data-science-projects financial-data fraud-detection isolation-forest machine-learning outlier-detection transactions unsupervised-learning

Last synced: 12 Jun 2025

https://github.com/dbolotov/ds-quickref

Quick reference guide for data science and machine learning concepts, workflows, and tools.

data-science data-science-projects python quarto

Last synced: 07 Aug 2025