An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/kenwuqianghao/scotiabank-datathon-2023

Code and data analysis done for 2023 Scotiabank Datathon

data-analysis fraud-detection jupyter-notebook python

Last synced: 18 May 2026

https://github.com/sabdikay/telco-customer-churn-analysis-ibm-dataset

This project explores customer churn trends for a company in California using an IBM dataset. Built in a Jupyter Notebook, it employs pandas, NumPy, matplotlib, seaborn, plotly, and scipy to clean, analyze, and visualize data. Through statistical tests and interactive maps, it uncovers key drivers behind customer cancellations

business-intelligence customer-churn data-analysis data-analysis-python data-visualization exploratory-data-analysis jupyter-noteboook matplotlib numpy pandas plotly predictive-modeling python scipy seaborn statistical-analysis

Last synced: 07 Apr 2026

https://github.com/manuelgil/vscode-data-pack

This extension pack includes the essential extensions for data analysts.

data-analysis data-science data-structures data-visualization vscode-extension

Last synced: 07 Apr 2026

https://github.com/prarthana-singh/heart-attack-prediction-model

A Machine Learning model that predicts the risk of a heart attack based on health parameters like cholesterol levels, blood pressure, BMI, smoking habits, and age. Built using Classification models, Scikit-Learn, Pandas, and Python.

classification data-analysis data-science heart-attack-prediction logistic-regression machine-learning numpy pandas python scikit-learn

Last synced: 25 Jun 2025

https://github.com/robinmillford/analyzing-e-commerce-transactions---data-cleaning-cohort-analysis-and-sql

In this project, I aimed to analyze the profitability of products in an e-commerce dataset. I performed various SQL queries to extract valuable insights about product profitability, including the identification of the top 5 products with the highest profit margin, and unique combinations of brands and product lines with the highest profitability.

cohort-analysis data-analysis data-visualization excel jupyter-notebook powerbi python3 sql

Last synced: 18 May 2026

https://github.com/kammarah/studentdata

I created & deployed a Streamlit app to store, manage & analyze student data. 📊🎓

connection data data-analysis data-visualization deploy deployments libraries python streamlit streamlit-webapp webapp

Last synced: 18 May 2026

https://github.com/stefagnone/data_storyboarding_visualization

Data Storyboarding and Visualization Techniques for Effective Communication

data-analysis data-visualization ggplot2-analysis r tableau-dashboards

Last synced: 05 Apr 2025

https://github.com/stefagnone/moneyball_project

Data-driven analysis inspired by the Moneyball approach, identifying affordable replacements for key Oakland A's players using R and sabermetrics to support cost-effective recruitment.

baseball-statistics data-analysis data-driven-decision-making player-replacement-strategy r-programming sabermetrics sports-analytics

Last synced: 05 Apr 2025

https://github.com/rorrell/rightwhaledata

A Jupyter Notebook where I wrangle some data on right whale sightings and create a visualization

data-analysis data-visualization jupyter-notebook python3

Last synced: 11 May 2026

https://github.com/jayita11/exploring-most-streamed-songs-for-last-four-decades-eda

Perform EDA to uncover trends in streaming patterns, likes, and artists over the last four decades.

data-analysis eda hypothesis-testing matplotlib most-streamed-songs pandas python seaborn

Last synced: 07 Apr 2026

https://github.com/surajsanap/employee-resigning-analysis-powerbi-dashboard-data-analytics

Effortlessly analyze employee resignations with our concise Power BI dashboard. Download the XML file, open the dashboard, and gain quick insights into resignation trends and reasons for departure. Streamlined and effective

dashboard data-analysis data-analytics powerbi python xml-dataset

Last synced: 08 May 2025

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 02 Jan 2026

https://github.com/oshinrathor/Data-Science-Systems-and-Analytics-Projects

Dive into my Data Science Projects Repository, featuring a Spam SMS Classifier, NIA Dashboard, H1N1 Vaccine Prediction, and NYC Taxi Fare Prediction. Each project showcases my skills in data cleaning, exploratory analysis, modeling, and visualization, offering valuable insights and methodologies for data enthusiasts and practitioners.

dashboard data-analysis data-driven-decisions data-presentation data-science data-visualization dataexploration eda insights nia webanalytics

Last synced: 12 Sep 2025

https://github.com/akash1070/predicting-zomato-restaurant-ratings

Performs extensive exploratory data analysis (EDA) on the Zomato dataset, builds a machine learning model that helps Zomato restaurants predict their ratings based on certain features, and deploys the model via Flask.

data-analysis extratreesregressor flask linear-regression machine-learning random-forest zomato-bangalore zomato-data-analysis

Last synced: 18 May 2026

https://github.com/huynhtanphatt/diagnosing-uk-railway-performances

This project analyzes UK railway ticket and operation data to show how revenue, passenger demand, and on-time performance are connected.

data-analysis data-visualization datastorytelling python railway sql ticketing transportation

Last synced: 24 Apr 2026

https://github.com/sebastianurdaneguibisalaya/enfermedades-fissal

Holistic analysis of care provided for rare and orphan diseases and transplants covered by FISSAL in Peru.

data-analysis data-visualization python

Last synced: 24 Feb 2025

https://github.com/s1m0n38/cr-analysis

An exercise in data collection/analysis

clash-royale data-analysis data-collection data-science

Last synced: 08 Jul 2025

https://github.com/arv-anshul/pw-experience-portal

Data Analysis on PW Skills and Ineuron.ai experience/internship portal.

data-analysis experience ineuron-ai internship physics-wallah portal pw-skills python3

Last synced: 16 Apr 2026

https://github.com/jerinpious/house-price-prediction

This project is a machine learning-based application to predict house prices. A frontend interface has been developed using Streamlit to make the prediction process user-friendly for regular customers. The project is structured

data-analysis data-engineering data-science eda machine-learning pandas python random-forest scikit-learn streamlit

Last synced: 05 Apr 2026

https://github.com/muneeb706/r-programming

R-Programming examples for data analysis.

data-analysis r-programming

Last synced: 26 Mar 2025

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/sreejabethu/smart-report-analyzer

An AI-powered app to analyze and summarize Excel, CSV, and PDF reports using Hugging Face language models. Built with Streamlit.

data-analysis huggingface llm nlp pdf-analysis python question-answering streamlit summarization

Last synced: 18 May 2026

https://github.com/sangampaudel530/bhutan-rainfall-explorer

Interactive dashboard to explore, analyze, and forecast rainfall trends in Bhutan (2021–2025) using Streamlit, Plotly, and Prophet.

bhutan climate-change data-analysis prophet-facebook rainfall-prediction streamlit visualization

Last synced: 17 May 2026

https://github.com/iwasakiyuuki/data-analysis-platform-etl

A collection of Airflow DAGs for automating data collection into our on-premises data analysis platform.

airflow airflow-dags data-analysis data-collection

Last synced: 01 Jul 2025

https://github.com/sgb31/covid-19-data-analysis

"In this project, I analyzed COVID-19 data to explore trends, case growth, and key patterns. I worked on cleaning the data, performing exploratory analysis, and visualizing infection rates, recoveries, and fatalities. The goal was to gain insights into how the pandemic evolved and its overall impact.

data-analysis data-visualization matplotlib pandas python seaborn

Last synced: 13 May 2026

https://github.com/anurag-kumar-molankala/sales-performance-dashboard

A Power BI dashboard that analyzes sales trends, product performance, customer segmentation, and payment distribution. It uses DAX, time intelligence, and interactive visuals for data-driven insights. The model includes Sales, Product, and Customer tables for in-depth analysis.

dashboards data-analysis data-visualization dax dax-functions dax-measures dax-query etl-process powerbi powerbi-visuals powerquery sql-query sql-server

Last synced: 03 Apr 2025

https://github.com/jibbs1703/airline-data-analysis

This repository contains the Exploratory Data Analysis of the flight delay and cancellation for airline flights in the United States in the year 2015. With this EDA, insights and solutions are suggested for business owners and airport managers.

business-insights business-solution data-analysis data-visualization

Last synced: 20 Mar 2025

https://github.com/wesleych3n/my-work-log

A personal project to record and analyze work check-in/check-out times in Google Sheets with a Telegram bot.

data-analysis telegram-bot worklog

Last synced: 20 Jul 2025

https://github.com/clarajacintho/ig4-ds

The final project for the Multidimensional Data Analysis and Data Mining courses, where we analyze data from motorcyclists to determine what causes accidents

data-analysis data-science shiny-apps

Last synced: 11 May 2025

https://github.com/saadhaniftaj/logistic--lasso-regression-data-analysis

Iris dataset analysis with logistic and Lasso regression, using coordinate descent for feature selection and binary classification. Includes preprocessing and data visualizations

data-analysis lasso-regression-model logistic-regression python statistics

Last synced: 18 May 2026

https://github.com/steviecurran/multi-dish

Scripts to reduce data from large radio telescopes (GMRT, VLA)

data-analysis interferometer pipeline radio-astronomy telescopes

Last synced: 09 May 2026

https://github.com/deliprofesor/2024-salary-analysis-for-machine-learning-engineers

This project analyzes a salary dataset to explore factors like experience, company size, remote work ratio, and country. It includes data cleaning, group analysis, visualizations, and machine learning models (linear regression and Random Forest) to predict salaries and identify key features.

data-analysis data-cleaning data-visualization ggplot2 linear-regression machine-learning plotly r-programming random-forest salary-prediction salary-trends

Last synced: 07 Mar 2026

https://github.com/debjyotisaha/power-bi-projects-phase-1

Portfolio projects related to data visualisation in Power BI

data-analysis data-visualization dax-expression powerbi powerquery

Last synced: 18 Jan 2026

https://github.com/marcosvbras/udacity-nd109-project-titanic

Data analysis project for the Udacity Nanodegree course Artificial Intelligence Programming with Python.

data-analysis data-analyst-nanodegree data-science jupyter-notebook machine-learning python udacity

Last synced: 19 May 2026

https://github.com/niniola-creator/niniola-creator

This is a repository that I have created to show my skills, share my projects and track my progress in my data science/web development journey.

bootstrap5 css3 data-analysis data-science data-visualization database html5 javascipt javascript matplotlib pandas powerbi python spreadsheets sql

Last synced: 07 Apr 2026

https://github.com/aelmah/ibm-applied-ds

A collection of projects I've done throughout the Applied DS Specialization.

applied-data-science-capstone beautifulsoup data-analysis data-visualization machine-learning python-for-ai-and-data-science web-scraping

Last synced: 11 Sep 2025

https://github.com/fatihilhan42/wnba-draft-player-dataanalysis-1997-2022-with-python

In this project, the statistics of the players in the WNBA drafts from 1997 to 2022 were examined. The data in the dataset, which you can find in the repo, was first organized using data cleaning algorithms. The cleaned data was then presented graphically using data visualization algorithms.

data-analysis data-analysis-python data-visualization jupyter-notebook python

Last synced: 17 May 2026

https://github.com/chitranjan806/predicting-on-time-premium-deposits

A Predictive analysis project to predict the success rate of On-Time deposits of Premiums by Policy Holders.

analytics-vidhya analytics-vidhya-competition catboostregressor data-analysis data-science linear-regression logistic-regression python3

Last synced: 16 May 2026

https://github.com/a19xys/dm-csgo_analysis

Analysis to address the most important aspects of the knowledge discovery process from data.

data-analysis data-mining data-science dataset jupyter-notebook python

Last synced: 18 May 2026

https://github.com/dhruvil-26/sql-projects

This repository contains SQL projects focusing on data analysis and insights. Currently, it includes: 1. RSVP Movies Analysis - SQL queries to analyze movie trends, ratings, and genres. 2. Pizza Sales Analysis - SQL queries to explore sales patterns, customer behavior, and profitability in a pizza business.

analysis data-analysis database mysql pizza-sales-analysis rdbms rsvp sql

Last synced: 17 May 2026

https://github.com/leoz0214/foodhygieneanalysis

Data analysis regarding Food Hygiene Ratings in England, Wales and Northern Ireland.

data-analysis food-hygiene-ratings pandas python

Last synced: 17 May 2026

https://github.com/1adityakadam/carnegie-classifications-ancestry-grid

A concise, interactive tool for exploring the historical lineage of U.S. higher education institutions using Carnegie Classification data from 1973–2021.

dash data-analysis html javascript pandas python

Last synced: 25 Jun 2025

https://github.com/iamber12/stack-overflow-analysis-using-stack-exchange-api

This Python-based project utilizes the Stack Exchange API to analyze StackOverflow data, focusing on the 'R' and 'Dot Net' programming tags.

data-analysis data-visualization python stack-exchange-api

Last synced: 20 Jul 2025

https://github.com/artemzarubin/xml-document-processor

XML processing tool using the Strategy design pattern.

csharp data-analysis data-transformation design-patterns strategy xml

Last synced: 21 Jul 2025

https://github.com/1adityakadam/Carnegie_classifications_website

A comprehensive data analytics platform analyzing 50+ years of U.S. higher education trends through interactive visualizations and historical institution tracking.

css data-analysis html javascript python ui-design web-development

Last synced: 25 Jun 2025

https://github.com/rohansoni45/movie-recommendation-system

This project is a Content-Based Recommender System that suggests movies to users based on their preferences and watched history. The system leverages cosine similarity to find and recommend movies similar to a selected title. It is built using Python and libraries like Pandas, NumPy, and Scikit-learn.

content-based-filtering cosine-similarity data-analysis data-science machine-learning numpy pandas python recommender-system render scikit-learn

Last synced: 17 Apr 2026
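
The cosine-similarity ranking mentioned in the entry above can be sketched as follows. This is a minimal illustration, not the repository's code: the toy catalogue `MOVIES`, the bag-of-words `vectorize` helper, and the `recommend` function are all hypothetical names, and the sketch uses only the standard library rather than Pandas, NumPy, or Scikit-learn.

```python
import math
from collections import Counter

# Hypothetical toy catalogue (assumption): title -> keyword description.
MOVIES = {
    "Space Quest": "space adventure science fiction",
    "Galaxy Wars": "space battle science fiction",
    "Love in Paris": "romance drama paris",
}


def vectorize(text):
    """Turn a description into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, catalogue, top_n=2):
    """Rank every other title by similarity to the selected one."""
    target = vectorize(catalogue[title])
    scores = [
        (other, cosine_similarity(target, vectorize(description)))
        for other, description in catalogue.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


recommendations = recommend("Space Quest", MOVIES)
```

Here `recommendations` ranks "Galaxy Wars" first because it shares three of its four keywords with the selected title, while "Love in Paris" shares none.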

https://github.com/ddihora1604/iitk_task

A comprehensive financial data analysis system that collects, processes, and analyzes data from approximately 500 tickers in the S&P Global Index. It provides detailed financial information, ESG metrics, and various financial statements for comprehensive market analysis.

beautifulsoup4 data-analysis data-visualization datamodelling dataset esg machine-learning python yahoo-finance

Last synced: 30 Jun 2026

https://github.com/pronzzz/diabetes-prediction

Diabetes prediction using a KNN model and Pima Indian Diabetes Dataset

data-analysis data-manipulation data-preprocessing data-visualization knn machine-learning outlier-detection seaborn

Last synced: 13 Apr 2025

https://github.com/jelhamm/singular-value-decomposition-data-mining

"This repository hosts an implementation of the Singular Value Decomposition (SVD) algorithm tailored for data mining tasks. SVD is utilized for efficient dimensionality reduction, aiding in the extraction of key patterns and features from large and complex datasets."

data-analysis dimension-reduction jyputer-notebook machine-learning matplotlib numpy-library pandas-library preprocessing python scipy-library singular-value-decomposition sklearn-library standardscaler svd svd-matrix-factorisation

Last synced: 18 May 2026

https://github.com/dineshdhamodharan24/singapore_flat_resale_

This project focuses on developing a machine learning model to predict the resale values of apartments in Singapore. The goal is to create a user-friendly online application that enables users to obtain accurate predictions for the resale values of specific properties.

data-analysis flat json numpy pandas pickle project python streamlit

Last synced: 07 Apr 2026

https://github.com/dinamohsin/ai-job-market-analysis-using-sql-excel

This project explores a dataset of AI-related jobs to uncover insights about salary trends, in-demand skills, education levels, and remote work preferences. The analysis was done using SQL for querying and Excel for data cleaning and preparation.

data-analysis data-preprocessing excel functions query sql sql-server

Last synced: 25 Jun 2025

https://github.com/vbhvsingh0/nflteam_corr_population

The goal of this project is to find the correlation between NFL teams' win-loss records and the population of their cities.

data-analysis data-cleaning-and-preprocessing data-manipulation-with-pandas numpy-library pandas-python pearson-correlation python3

Last synced: 29 Jun 2026
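
The Pearson correlation named in the tags of the entry above can be computed as in the sketch below. It is an illustration under stated assumptions rather than the repository's method: the function name `pearson_r` is hypothetical, and the population and win-loss figures are made-up placeholder values, not real NFL data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values (assumption), chosen to be perfectly linear.
population_millions = [1.0, 2.0, 3.0, 4.0]
win_loss_ratio = [0.8, 1.0, 1.2, 1.4]

r = pearson_r(population_millions, win_loss_ratio)
```

A value of `r` near 1 or -1 indicates a strong linear relationship and a value near 0 indicates little linear relationship; the placeholder data is deliberately linear, so `r` comes out at approximately 1.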

https://github.com/dsite42/simple_data_visualizer

This is a simple tool to visualize data for a quick Exploratory Data Analysis (EDA). You can create various plot types as seaborn or plotly plot via a GUI in multiple windows (RelPlot, PairPlot, JointPlot, DisPlot, CatPlot, LmPlot, 3DPlot).

data-analysis data-science data-visualisation data-visualization eda exploratory-data-analysis plotly seaborn

Last synced: 12 May 2026

https://github.com/josafary-ds/curso_dnc

Repository for storing study files and projects from the DNC Data Scientist course.

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 13 Mar 2025

https://github.com/harmanveer-2546/motor-vehicle-accidents-in-india

As per the report, a total of 4,61,312 road accidents have been reported by States and Union Territories (UTs) during the calendar year 2022, which claimed 1,68,491 lives and caused injuries to 4,43,366 persons.

accidents accidents-analysis darkgrid data-analysis eda exploratory-data-analysis indian-roads inline matplotlib motor-vehicles numpy pandas review seaborn visualization

Last synced: 19 Jan 2026

https://github.com/mrendiks/analyst-data-survey-monkey

Learn how to analyze data from a SurveyMonkey dataset using Excel and Python.

data-analysis ipynb-jupyter-notebook python

Last synced: 07 Mar 2026

https://github.com/chahelgupta/fitness-data-analysis-r-project

This project focuses on analyzing fitness data collected from various tracking devices to gain insights into users' activity levels, sleep patterns, calorie expenditure, and heart rate. The dataset used in this project consists of multiple CSV files, each containing different aspects of fitness-related data.

data-analysis data-cleaning data-exploration data-science data-visualization r r-language r-programming r-studio

Last synced: 18 May 2026

https://github.com/jonathancaleb/adap

📊🌱 Agricultural Data Analysis Platform 🌍🚜 A personal initiative to analyze coffee growth trends in Uganda using Python, data science, and machine learning. This project supports sustainable farming with predictive models and interactive visualizations. 🍃📈

data-analysis data-science python

Last synced: 18 May 2026

https://github.com/sharoonjoseph321/social_media_eda

Data analysis of social media apps using pandas, Python, and matplotlib.

data data-analysis data-science data-visualization matplotlib programming-language project python pythonprojects

Last synced: 03 Mar 2025

https://github.com/majajuri/text-classification-using-string-kernels

Project for the course Introduction to Data Science.

data-analysis string-kernel

Last synced: 05 Apr 2025

https://github.com/kavicastelo/colab

This repository includes practical Jupyter notebooks for data analysis and model training using a soil fertilizer dataset. (use 4th edition)

data-analysis jupyter-notebook python

Last synced: 26 Mar 2025

https://github.com/martachesnova/python

Created a Python script to calculate and analyze financial records of a company. Created another Python script to do calculations and analysis of the voting process in a small town.

data-analysis python

Last synced: 24 Apr 2026

https://github.com/jm199504/data-analysis-practice

Data analysis practice (Titanic / BankCustomers)

data-analysis python

Last synced: 02 May 2026

https://github.com/lparham2/factors-driving-ev-adoption-charging-station-deployment

This project explores factors driving EV adoption and charging station deployment using Python-based data analysis. It examines sales trends, infrastructure growth, and socioeconomic influences to uncover key insights. The goal is to aid policymakers and businesses in optimizing EV infrastructure and accelerating sustainable transportation.

data-analysis data-visualization electric-vehicle-charging-station electric-vehicles powerpoint-presentations python

Last synced: 18 May 2026

https://github.com/antononcube/wl-mosaicplot-paclet

Wolfram Language (aka Mathematica) paclet for mosaic plots over datasets or lists of records.

data-analysis machine-learning mosaic mosaic-plots

Last synced: 16 Jan 2026

https://github.com/preciousclement/maternal-experiences-in-nigeria

This repository contains a Python-based project that generates realistic synthetic data simulating the maternal health journey of 5,000 women in Nigeria.

data-analysis data-generation maternal-health nigeria public-health python

Last synced: 08 May 2025

https://github.com/malexandersalazar/tools-python-mssql-statistics-descriptor

A lightweight tool based on sweetviz that generates high-density visualizations to kickstart Exploratory Data Analysis within Microsoft Azure SQL Database using ODBC with just one line of code

azure-sql-database data-analysis data-visualization eda python

Last synced: 16 May 2026

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 12 May 2025

https://github.com/shellynagar27/mobile-sales-analysis

Analyzed 2024 mobile sales data to uncover product trends, customer behavior, and regional insights using Power BI dashboards and structured data modeling.

cleaning-data data-analysis data-visualization dax eda figma modelling powerbi powerquery storytelling wireframe

Last synced: 16 May 2025

https://github.com/pyramidheadshark/ai-mirea-sem1p

Completed set of all MIREA AI and DA practices (1st semester)

beginner-friendly data-analysis data-science jupyter mirea

Last synced: 05 Apr 2025

https://github.com/codeonthespectrum/web-scrap

This project scrapes Wikipedia to obtain data on the most populous municipalities in the state of Rio de Janeiro.

data-analysis data-visualization webscraping

Last synced: 16 Feb 2026

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/mh0386/motorcycle_data_analysis

Data analysis applied to motorcycle dataset.

data-analysis

Last synced: 19 Jul 2025

https://github.com/geoninja/reddit_data_analysis

Data analysis application presented at the 2016 NTC (Non-profit Technology Conference) in San Jose, CA.

data-analysis python reddit-data-analysis text-analysis

Last synced: 03 May 2026

https://github.com/alexjackson1/commons-indicative-votes

A cluster analysis of the House of Commons' Indicative Brexit Voting Process on 27 March 2019

data-analysis politics r

Last synced: 19 Jul 2025

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/zen204/accenture-tech-news-summarization-engine

A tool developed to analyze knowledge graphs from technology news articles, uncovering insights and trends about technology products, platforms, services, and their industry impact. Built during an internship at Accenture to inform decision-making in the tech landscape.

data-analysis decision-making graph-visualization industry-insights jupyter-notebook knowledge-graph machine-learning python tech-news tech-trends

Last synced: 29 Apr 2026