An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/debjyotisaha/power-bi-projects-phase-2

Created interactive dashboards and reports using Power BI to visualize complex datasets. Demonstrated proficiency in data modelling, DAX calculations, and storytelling through data to provide actionable insights.

dashboards data-analysis data-modeling data-visualisation power-query powerbi

Last synced: 18 Jan 2026

https://github.com/jlee9503/telecommunication-churn

Analyze key factors influencing customer churn using Python data analytics techniques. Explore these factors through data preprocessing, exploratory data analysis (EDA), and predictive modeling.

data-analysis data-visualization matplotlib pandas python scikit-learn

Last synced: 18 Jan 2026

https://github.com/amish5ingh/cricket-data-analytics-ipl

Data analysis and visualization of IPL 2022 matches using Python, Pandas, Matplotlib, and Seaborn. Includes insights on match outcomes, player performances, toss trends, and venue stats with 12+ charts.

data-analysis data-visualization ipl-data-analysis ipl-data-visualization jupiter-notebook matplotlib-pyplot numpy pandas python seaborn

Last synced: 09 May 2026

https://github.com/izzyl3333/mosquito_analysis

An exercise applying Python and statistical analysis to mosquito data to understand the relationship between the different variables and the number of mosquitoes.

chicago data-analysis data-science exploratory-data-analysis mosquitoes python statistical-analysis west-nile-virus

Last synced: 19 Jan 2026

https://github.com/marianamartiyns/api-logisticregression

Data analysis, modeling, and deployment of a logistic regression model for churn prediction, integrating a FastAPI backend and a Streamlit frontend.

data-analysis data-science fastapi logistic-regression pyhton streamlit

Last synced: 29 Apr 2026

https://github.com/takshshah-16/pizza_sales_sql

SQL-powered pizza sales analytics project using MySQL Workbench to derive business insights through data exploration and queries.

business-intelligence data-analysis database-management mysql sql

Last synced: 09 Oct 2025

https://github.com/mirwais-farahi/data-visualization-with-tableau-specialization

The Specialization teaches Tableau for data visualization and business intelligence. The series covers skills like assessing data quality, designing visualizations and dashboards, and combining data sources to create compelling, data-driven stories.

dashboard data-analysis geospatial map tableau visualization

Last synced: 16 Feb 2026

https://github.com/sillyash/untappd-viz

A data visualisation page using public datasets and HTML/CSS/JS with D3.js.

beer beer-statistics data data-analysis data-visualization kaggle kaggle-dataset public-dataset school-project

Last synced: 18 May 2026

https://github.com/adithya2369/safa_public

AI-powered customer feedback analyzer that uses generative AI to transform customer reviews into actionable business insights. Upload review data, get instant summaries, satisfaction scores, detailed reports, and improvement suggestions—all in an easy-to-deploy Docker container.

data-analysis data-visualization docker-containerization full-stack-development generative-ai langchain langchain-groq web-development

Last synced: 10 Oct 2025

https://github.com/atiqisrak/py

This repository houses the code and resources for the **100 Days of Python Challenge** – an intensive learning journey designed to propel you from beginner to a confident Python programmer in just 100 days.

data-analysis data-science machine-learning python3

Last synced: 10 Oct 2025

https://github.com/samuelsoaress/wkd-default-reduction

Reduction of default from 35% to 25% or less using machine learning techniques.

data-analysis data-exploration data-science machine-learning-algorithms

Last synced: 10 Oct 2025

https://github.com/sabdikay/analysis-of-biodiversity

This project analyzes biodiversity data from the National Parks Service, focusing on species in various park locations. Conducted in Jupyter Notebook, it uses pandas, matplotlib, NumPy, seaborn, and chi2_contingency for analysis and visualization.

data-analysis data-analysis-python data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 14 Apr 2026

https://github.com/ibromeat/road-accident-risk

Exploratory Data Analysis of road accident risk predictions — visualizing model stability and distribution of predicted probabilities.

data-analysis jupyter-notebook matplotlib python traffic-data visualization

Last synced: 18 May 2026

https://github.com/brooks-code/toulouse-biblio-chronicle

Snapshot of Toulouse public library customer habits — cleaning raw, messy datasets of musical, cinematic, and literary checkouts; includes data-cleaning steps, analysis notebook revealing cultural tastes in the Pink City.

data-analysis data-cleaning data-cleaning-and-preprocessing data-quality exploratory-data-analysis jupyter-notebook library-data misaligned-data mojibake tutorial

Last synced: 10 Oct 2025

https://github.com/filipe-rds/bi-atividade-1

Data analysis assignment for the Business Intelligence course.

data-analysis jupyter-notebook python

Last synced: 15 May 2026

https://github.com/badranalyst/time-series-analysis-of-global-trends-in-diet-gym-and-finance

This project analyzes global trends in diet, gym, and finance over time using time series data. The analysis is performed using Python libraries like Pandas, Matplotlib, and Seaborn to visualize trends and identify patterns in these sectors across various countries.

data-analysis dataset matplotlib-pyplot numpy pandas python seaborn time-series

Last synced: 14 Apr 2026

https://github.com/sharvesh1401/battsense

BattSense is a machine learning project focused on predicting the State of Health (SOH) of lithium-ion batteries using operational parameters such as voltage, current, temperature, and capacity. The model enables accurate, data-driven diagnostics for battery performance monitoring in electric vehicles and portable devices.

battery-diagnostics battery-health battery-health-prediction battery-soh data-analysis electric-vehicles energy-storage machine-learning predictive-maintenance python regression scikit-learn

Last synced: 07 May 2026

https://github.com/its-ekanshi/sql-analytics-project

Designed relational tables with primary and foreign keys, populated with sample data for real-world testing. Implemented advanced SQL techniques such as CTEs, window functions, aggregates, and filters to extract valuable insights.

business-intelligence data-analysis exploratory-data-analysis microsoft-sql-server sql sql-queries

Last synced: 10 Oct 2025

https://github.com/salma-mamdoh/a-visual-history-of-nobel-prize-winners-project

My project aims to practice Data Analysis and Data Visualization on DataCamp

data-analysis data-visualization datacamp matplotlib pandas python seaborn

Last synced: 04 May 2026

https://github.com/salma-mamdoh/the-android-app-market-on-google-play-project

My project aims to practice Data Analysis and Data Visualization on DataCamp

data-analysis data-visualization datacamp jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 12 Apr 2026

https://github.com/frankelavsky/security-dash-challenge

I had two 8-hour days to create a visualization dashboard for three datasets. Tab one: Voronoi overlay on line graph. Tab two: Data partitioning method keeps in-memory usage low. Tab three: deals with "Failed" vs "Successful" attempts as positive/negative bar charts over time. I used d3.js, require, MVC pattern, and vanilla js.

client-side complexity css3 d3 d3js dashboard data-analysis data-structures-algorithms data-visualization frontend-app html5 interactive-visualizations javascript modular network-analysis network-monitoring network-security security single-page-app visualization

Last synced: 14 Apr 2026

https://github.com/cyberoctane29/diamonds-anova-analysis

This project uses ANOVA in Python to analyze how diamond color and cut affect pricing. By testing for statistical significance and running post hoc comparisons, it reveals key pricing patterns. Built with pandas, statsmodels, and Seaborn, the findings help inform diamond valuation and purchasing decisions.

anova-test data-analysis data-analytics data-science diamonds-dataset regression-analysis statistical-analysis tukey-hsd

Last synced: 10 Oct 2025

https://github.com/jrdnbradford/the-office-us

Data concerning NBC's mockumentary series The Office (U.S. version)

csv data-analysis json the-office xml

Last synced: 19 Jan 2026

https://github.com/priyanshubiswas-tech/deloitte-daikibo-telemetry-analysis-task-1

Tableau dashboard analyzing Daikibo telemetry data. Tracks downtime by factory/device with interactive filters. Deloitte task solution with JSON processing.

data-analysis data-visualization deloitte json tableau tableau-public

Last synced: 11 Oct 2025

https://github.com/saifalibaig/covid-19-death-rate-analysis-using-python

Analysis of Covid-19 data along with the world happiness report to identify if there is any relationship between death rate and happiness rate of countries all over the world.

data-analysis data-visualization numpy pandas python3 sns visualization

Last synced: 03 May 2026

https://github.com/azaz9026/email-spam-detection

Welcome to the Email Spam Detection project! This repository provides a machine learning model for detecting spam emails using a Naive Bayes classifier and a simple web interface built with Streamlit.

data-analysis data-cleaning data-structures data-visualization deep-learning machine-learning python sql streamlit

Last synced: 14 Apr 2026

https://github.com/mouadtaoussi/capmpingi-employee-reviews

Analysis of Capmpingi employee reviews using Python/Pandas and Power BI

data-analysis data-science kaggle pandas powerbi python python3

Last synced: 14 Apr 2026

https://github.com/kianaasd93/sensors-

Data analysis of wearable-technology and autonomous-system sensors in physiotherapy. Conducted a comprehensive data analysis of Xsens MTx sensor data.

classification data-analysis data-science jupyter jupyter-notebook knn machine-learning physiotherapy python sensor svm wearable-devices wearable-technology

Last synced: 19 Feb 2026

https://github.com/vinay-jose/territorial-sales-dashboard

EDA was carried out on the sales data of Atliq Technologies, and a dashboard was created in Power BI to draw insights.

data-analysis data-visualization powerbi-desktop sql

Last synced: 11 Oct 2025

https://github.com/ahsankhizar5/titanic-eda-visualization

Exploratory Data Analysis and Visualization on the Titanic Dataset using Python, Pandas, Matplotlib, and Seaborn to uncover survival patterns.

data-analysis data-science data-visualization eda kaggle machine-learning matplotlib pandas python seaborn titanic-dataset

Last synced: 31 May 2026

https://github.com/bakulwani/data-mart-weekly-sales

Cleaned and analyzed weekly sales data using SQL to build a business-focused data mart with KPIs, customer segmentation, and platform insights.

customer-segmentation data-analysis data-cleaning etl kpi-analysis mysql sales-analysis sql

Last synced: 21 Feb 2026

https://github.com/thinzarhninyu/dap

Notes and Projects for Data Analysis with Python course from FreeCodeCamp.org

data-analysis data-analysis-python ipynb jupyter-notebook python

Last synced: 18 Feb 2026

https://github.com/dzakwanalifi/stadata-x

Terminal UI for interactively exploring and downloading data from BPS Indonesia.

bps-api cli-app data-analysis data-visualization indonesia-statistics indonesian-data open-data python statistics terminal-ui textual tui

Last synced: 20 Jan 2026

https://github.com/NurFakhri/scraping-and-analysis-skincare

Scraping and data analysis of Indonesian skincare reviews.

beutifulsoup data-analysis data-scraping python requests review scraping-websites

Last synced: 12 Oct 2025

https://github.com/tzerk/esr

R package 'ESR' for plotting and analysing ESR spectra in dating applications

data-analysis data-visualization electron-spin-resonance geochronology r

Last synced: 13 Mar 2026

https://github.com/akash1070/project--uber-data-analysis

Analysis of Uber data from the dataset using Python.

data-analysis data-science python

Last synced: 09 May 2026

https://github.com/leosimoes/digitalinnovationone-analise-covid

Practical project "Creating models with Python and Machine Learning to predict the evolution of COVID-19 in Brazil" from Digital Innovation One.

arima-models data-analysis data-science python time-series

Last synced: 09 May 2026

https://github.com/veronsheva/hr_dashboards

Interactive HR dashboard using Tableau & MySQL – explore employee trends, performance, attrition, and salary insights.

calculated-fields charts cte dashboards data-analysis data-cleaning design eda mysql queries tableau window-functions

Last synced: 24 Jan 2026

https://github.com/agb2k/twitter-analyzer

Project to extract tweets based on searches, analyze their data, and autocorrect potentially incorrect words.

data-analysis python tweepy twitter

Last synced: 13 Oct 2025

https://github.com/sunsided/esc2024

Exploratory Data Analysis on the ESC 2024 results

csv data-analysis eurovision-song-contest scraping

Last synced: 18 Feb 2026

https://github.com/louisfernando1204/websocket-benchmark

A comprehensive performance testing and analysis suite designed to evaluate and compare different WebSocket server implementations across various programming languages and libraries.

benchmarking broadcast-test coder-websocket csv data-analysis data-visualization echo-test golang gorilla-websocket nodejs python3 socket-io websocket-client websocket-server ws

Last synced: 09 Apr 2026

https://github.com/stefagnone/-employee-salary-analysis-and-insights

Predictive analysis of employee salary determinants for an anonymized dataset, highlighting key factors influencing salary and providing insights for salary policy improvements.

business-intelligence data-analysis data-science employee-salary-analysis excel gender-pay-gap predictive-insights regression-modeling spss statistical-analysis

Last synced: 23 Feb 2026

https://github.com/alefrp/properties_dbt

A DBT project for analyzing city property data.

data-analysis data-warehouse dbt python sql

Last synced: 13 Oct 2025

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors affecting sleep quality and the temporal patterns in the sleep data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/angelalim88/jakarta-air-quality-index-classification

This project classifies Jakarta's Air Quality Index (AQI) from 2010 to 2023 using machine learning models (Random Forest, MLP, SVM) based on pollutant concentrations.

data-analysis data-visua machine-learning scikit-learn tensorflow

Last synced: 13 Oct 2025

https://github.com/gmalbert/rugby

Rugby Data Analysis and Sports Betting

data-analysis rugby sports-betting

Last synced: 31 May 2026

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/sumit9000/submission-of-web-server-log-analysis-assessment

This project analyzes one year of real-world HTTP access logs from the University of Calgary’s computer science server. Using Python, pandas, and regular expressions, we clean and parse the data to extract meaningful insights and answer 10 analytical questions.

data-analysis data-cleaning eda jupyter-notebook log-parsing pandas python realworld-data regex web-log-analysis

Last synced: 14 Apr 2026

https://github.com/giseletoledo/case-study-wellness-smart

Project from the Coursera course Google Data Analytics.

data-analysis kaggle-dataset r

Last synced: 14 Oct 2025

https://github.com/samkazan/business-analysis-tableau

Business Analysis on Global/Superstore data using Tableau.

analysis data-analysis tableau visualization

Last synced: 08 Feb 2026

https://github.com/anushkundu/london-housing-market-analysis

London Housing Market Analysis: An Insightful Power BI Dashboard

data-analysis data-visualization powerbi transformation

Last synced: 27 Jan 2026

https://github.com/ayorick23/python-data-science-cheat-sheet

Quick, practical guide to essential Python syntax, commands, and functions for Data Science. Perfect for recalling how to use the most common libraries, such as NumPy, Pandas, Matplotlib, and Scikit-learn, in your day-to-day analyses.

cheat-sheet data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning matplotlib ml numpy pandas python scikit-learn scipy seaborn statistics sympy tensorflow

Last synced: 07 Apr 2026

https://github.com/asuquoaa/air_bnb_analysis_dashboard-tableau-

Interactive Tableau dashboards to analyze and visualize data, providing actionable insights for better decision-making

dashboard data-analysis interactive-visualization tableau

Last synced: 13 Mar 2026

https://github.com/ankitpoddar07/excel-project_back-office

📊 Coffee Sales Analytics – Back Office Excel Project

data-analysis ms-excel

Last synced: 05 Feb 2026

https://github.com/saisurajmatta/healthcare-data-analytics

Power BI project analyzing Emergency Department data, demonstrating skills in data transformation, DAX, and visualization. It focuses on patient flow, wait times, demographics, and satisfaction, providing actionable insights for healthcare improvement. Includes documentation, data dictionary, and code samples.

data-analysis data-modeling data-visualization dax power-bi powerbi-visuals powerquery

Last synced: 22 Jan 2026

https://github.com/saisurajmatta/e-commerce-sales-advanced-data-analysis

Excel-based e-commerce analytics for FNP, a gift company. It covers data extraction, modeling, and visualization, providing actionable insights on revenue, customer behavior, and operations. Key skills include Excel, Power Query, Power Pivot, and DAX. The analysis culminates in data-driven business recommendations.

data-analysis data-visualization dax excel power-pivot power-query

Last synced: 22 Jan 2026

https://github.com/a26nine/msc-dissertation-bitcoin-dashboard

An interactive data visualisation dashboard built using Tableau Desktop to research and analyse the relationship between the price volatility and adoptability of bitcoin.

data-analysis data-science data-visualization tableau tableau-desktop tableau-prep

Last synced: 17 Feb 2026

https://github.com/rohanrony19/movie-recommendation-system

A Python project that uses the Pandas library to find correlations and give the best movie recommendations.

data-analysis deep-learning knn-algorithm numpy pandas python recommendation-system

Last synced: 14 Apr 2026

https://github.com/pedrosfaria2/analisetitulosnetflix

Study of the popularity of Netflix movies on IMDb.

analise-de-dados data-analysis jupyter-notebook matplotlib numpy pandas python

Last synced: 14 Apr 2026

https://github.com/sanjayankur31/20181206-neurofedora

Slides for my NeuroFedora talk at the UH Biocomputation group's weekly seminar

computational-neuroscience data-analysis neurofedora neuroimaging neuroscience open-science

Last synced: 19 Feb 2026

https://github.com/virajbhutada/hollywood-insights-tableau

Strategic cinematic insights through Hollywood's data landscape. Tableau-driven analytics for genre, studio profitability, and audience dynamics. Uncover trends, assess audience reception, and navigate through years of film data, elevating your understanding of the cinematic world.

analystics business-intelligence dashboard data-analysis data-visualization entertainment hollywood storytelling tableau tableau-desktop visualization

Last synced: 05 Feb 2026

https://github.com/kunalpisolkar24/winequalityprediction

Predicting wine quality using machine learning with matplotlib, numpy, pandas, and seaborn for insightful data analysis. 🍇🤖📊

data-analysis data-science data-visualization machine-learning prediction-model

Last synced: 16 Oct 2025

https://github.com/hase3b/flask-dash-interactive-dashboard

An interactive data visualization dashboard created using Flask and Dash. This project includes comprehensive data preparation, exploratory data analysis (EDA), and dynamic visualizations with Seaborn and Plotly. Explore the multi-page Dash app with features like dropdowns and callbacks for updated plots.

callbacks dash dashboard data-analysis data-visualization dropdown eda flask interactive plotly seaborn web-app

Last synced: 19 May 2026

https://github.com/fatihilhan42/nba-players-data-1950-to-2021

This project examines data on NBA players from 1950 to 2021. The players' seasons, heights, performance, scoring averages, teams, and positions were obtained from CSV files, and key tables and graphs were then created using data cleaning and data visualization techniques.

data data-analysis data-engineering data-science data-visualization

Last synced: 16 Oct 2025

https://github.com/supertetelman/coursera-exdata-09

This repo contains several R scripts that were used to analyze, plot, and clean data from various datasets. These projects were part of the Coursera course, Exploratory Data Analysis. The end results of the analysis are included.

big-data course coursera data-analysis r

Last synced: 16 Oct 2025

https://github.com/balajimohan18/sql-projects

The repository contains multiple Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through the query language.

data-analysis data-mining data-science eta microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 14 Mar 2026

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

This project demonstrates the use of machine learning models based on decision trees and random forests to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results. It is intended for learning the basics of machine learning and data analysis.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/pizofreude/da-with-r

Data analysis with R, a data-centric programming language

data-analysis r

Last synced: 17 Oct 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings empower businesses to optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 20 Feb 2026

https://github.com/codeslash21/communicate_data_findings

Analyze and visualize Bay Wheel system data covering 2.5M individual trips, and communicate the findings from the dataset in the form of notebook slides.

bay-wheel data-analysis data-visualization explanatory-data-visualization exploratory-data-analysis

Last synced: 22 Jan 2026

https://github.com/abhijeet107/task-4

Design an interactive dashboard for business stakeholders.

data-analysis excel-csv tableau-dashboards tableau-public

Last synced: 22 Jan 2026

https://github.com/casassg/ms_thesis

Social Media Analysis for Crisis Informatics in the Cloud

casassg-thesis data-analysis google-cloud kubernetes

Last synced: 19 Oct 2025

https://github.com/farhad-here/predict_student_performance

Predict Student Performance is a data analysis and machine learning project aimed at predicting students' final performance (g3) based on demographic, family, and academic features. The project supports both regression (predicting exact grades) and classification (Pass/Fail categories).

classification data-analysis data-visualization linear-regression machine-learning numpy pandas postgresql powerbi scikit-learn streamlit

Last synced: 14 Apr 2026

https://github.com/Kaushik-Puttaswamy/Airline-Passenger-Referral-Prediction-Using-Machine-Learning

This project uses a machine learning model to predict if passengers referred by existing customers will book a flight, helping airlines target likely customers. Key factors like service ratings and value for money drive predictions, achieving over 90% accuracy.

airline-marketing customer-referral-prediction customer-satisfaction data-analysis feature-engineering hyperparameter-tuning machine-learning model-evaluation predictive-analytics

Last synced: 20 Oct 2025

https://github.com/mtimma001/clinical-trial-data-tool-v2

Clinical Trial Data Analysis Tool is a Flask-based web app for healthcare professionals to manage and analyze clinical trial data. It features full CRUD functionality, interactive visualizations (Plotly/Matplotlib), a responsive Bootstrap UI, MySQL database integration, and Heroku deployment for accessible, scalable use.

bootstrap5 clinical-trials crud data-analysis data-visualization flask healthcare heroku mysql pandas plotly python

Last synced: 14 Apr 2026

https://github.com/ashwin331133/sql-pizza-outlet-sales-analysis

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

data-analysis sql

Last synced: 24 Feb 2026

https://github.com/mothraa/etl-marketanalysis-webscraping-poo

OC project 2 refactoring (OOP version not yet completed)

data-analysis etl poo python web-scraping

Last synced: 20 Oct 2025

https://github.com/saisurajmatta/nashville-housing-data-cleaning-project

Clean and standardize Nashville Housing dataset using SQL queries for improved data quality and structure.

azure-data-studio data-analysis mssql mysql sql sql-data-cleaning sql-queries sql-server-management-studio

Last synced: 23 Jan 2026

https://github.com/scbirlab/hts-tools

🏮 Parsing and analysing platereader absorbance and fluorescence data.

assay-analysis data-analysis fluorescence high-throughput high-throughput-screening platereader

Last synced: 23 Jan 2026

https://github.com/navp7/hr_analysis_excel

This project utilizes Microsoft Excel to conduct a comprehensive analysis of HR data, focusing on identifying the various reasons for employee attrition and evaluating job satisfaction

dashboards data-analysis excel visualization

Last synced: 23 Jan 2026