An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/ibromeat/road-accident-risk

Exploratory Data Analysis of road accident risk predictions — visualizing model stability and distribution of predicted probabilities.

data-analysis jupyter-notebook matplotlib python traffic-data visualization

Last synced: 18 May 2026

https://github.com/brooks-code/toulouse-biblio-chronicle

Snapshot of Toulouse public library customer habits: cleaning raw, messy datasets of musical, cinematic, and literary checkouts. Includes data-cleaning steps and an analysis notebook revealing cultural tastes in the Pink City.

data-analysis data-cleaning data-cleaning-and-preprocessing data-quality exploratory-data-analysis jupyter-notebook library-data misaligned-data mojibake tutorial

Last synced: 10 Oct 2025

https://github.com/filipe-rds/bi-atividade-1

Data analysis activity for the Business Intelligence course

data-analysis jupyter-notebook python

Last synced: 15 May 2026

https://github.com/badranalyst/time-series-analysis-of-global-trends-in-diet-gym-and-finance

This project analyzes global trends in diet, gym, and finance over time using time series data. The analysis is performed using Python libraries like Pandas, Matplotlib, and Seaborn to visualize trends and identify patterns in these sectors across various countries.

data-analysis dataset matplotlib-pyplot numpy pandas python seaborn time-series

Last synced: 14 Apr 2026

https://github.com/its-ekanshi/sql-analytics-project

Designed relational tables with primary and foreign keys, populated with sample data for real-world testing. Implemented advanced SQL techniques such as CTEs, window functions, aggregates, and filters to extract valuable insights.

business-intelligence data-analysis exploratory-data-analysis microsoft-sql-server sql sql-queries

Last synced: 10 Oct 2025

https://github.com/salma-mamdoh/the-android-app-market-on-google-play-project

My project aims to practice Data Analysis and Data Visualization on DataCamp

data-analysis data-visualization datacamp jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 12 Apr 2026

https://github.com/cyberoctane29/diamonds-anova-analysis

This project uses ANOVA in Python to analyze how diamond color and cut affect pricing. By testing for statistical significance and running post hoc comparisons, it reveals key pricing patterns. Built with pandas, statsmodels, and Seaborn, the findings help inform diamond valuation and purchasing decisions.

anova-test data-analysis data-analytics data-science diamonds-dataset regression-analysis statistical-analysis tukey-hsd

Last synced: 10 Oct 2025

https://github.com/jrdnbradford/the-office-us

Data concerning NBC's mockumentary series The Office (U.S. version)

csv data-analysis json the-office xml

Last synced: 19 Jan 2026

https://github.com/pranav016/exploratory-data-analysis-of-google-app-store-dataset

This is a data analysis done on the Google app store dataset to answer a few questions related to the data through data visualization techniques.

data-analysis

Last synced: 11 Oct 2025

https://github.com/priyanshubiswas-tech/deloitte-daikibo-telemetry-analysis-task-1

Tableau dashboard analyzing Daikibo telemetry data. Tracks downtime by factory/device with interactive filters. Deloitte task solution with JSON processing.

data-analysis data-visualization deloitte json tableau tableau-public

Last synced: 11 Oct 2025

https://github.com/saifalibaig/covid-19-death-rate-analysis-using-python

Analysis of Covid-19 data along with the world happiness report to identify if there is any relationship between death rate and happiness rate of countries all over the world.

data-analysis data-visualization numpy pandas python3 sns visualization

Last synced: 03 May 2026

https://github.com/azaz9026/email-spam-detection

Welcome to the Email Spam Detection project! This repository provides a machine learning model for detecting spam emails using a Naive Bayes classifier and a simple web interface built with Streamlit.

data-analysis data-cleaning data-structures data-visualization deep-learning machine-learning python sql streamlit

Last synced: 14 Apr 2026

https://github.com/vinay-jose/territorial-sales-dashboard

EDA was carried out on the sales data of Atliq Technologies, and a dashboard was created in Power BI to draw insights.

data-analysis data-visualization powerbi-desktop sql

Last synced: 11 Oct 2025

https://github.com/silvermete0r/sdu_hackathon_uss_db_analysis

Smart Data Ukimet Hackathon - "Data Modeling" case Solution - Topic: Store Analysis based on Unified Star Schema

data-analysis data-modeling postgresql python sql unified-star-schema

Last synced: 14 Apr 2026

https://github.com/navp7/pizzasales_powerbi

This project involves creating a comprehensive sales performance dashboard using Power BI to visualize and analyze the sales data of an Italian pizza company.

data-analysis ms-sql-server ms-word powerbi visualization

Last synced: 13 Mar 2026

https://github.com/dzakwanalifi/stadata-x

Terminal UI for interactively exploring and downloading BPS Indonesia data

bps-api cli-app data-analysis data-visualization indonesia-statistics indonesian-data open-data python statistics terminal-ui textual tui

Last synced: 20 Jan 2026

https://github.com/abeltavares/postql

Python library and command-line interface (CLI) tool for interacting with PostgreSQL databases, providing simplified database management, query execution, and result export functionalities.

cli command-line-interface data-analysis data-engineering data-export data-management data-processing data-visualization database database-administration database-tools etl oop postgres postgresql psycopg2 python sql sqlalchemy wrapper

Last synced: 19 Jan 2026

https://github.com/treasarose/us_candy_distribution_analysis_project

This project focuses on advanced data analysis and optimization using SQL. It includes queries for analyzing sales, product margins, and shipping efficiency for a US candy distributor.

data-analysis entity-relationship mssql optimization query sql-server sqlproject us-candy-distributor

Last synced: 12 Oct 2025

https://github.com/jeffbrennan/analysis-templates

Templates of commonly used graphics/functions/settings to help focus on the bigger picture

data-analysis r rmd

Last synced: 12 Oct 2025

https://github.com/akash1070/project--uber-data-analysis

Analysis of Uber data from a dataset using Python

data-analysis data-science python

Last synced: 09 May 2026

https://github.com/alexondata/daan_eda-exploratory-data-analysis_ecommerce

This project presents an Exploratory Data Analysis (EDA) pipeline for an eCommerce dataset, integrating Python, SQL Server, and Power BI to transform raw transactional data into meaningful business insights. The project was developed as part of an academic assignment at Transilvania University of Brașov, Faculty of Mathematics and Computer Science.

data-analysis data-visualization ecommerce microsoft-sql-server powerbi python

Last synced: 18 May 2026

https://github.com/leosimoes/digitalinnovationone-analise-covid

Practical project "Creating models with Python and Machine Learning to predict the evolution of COVID-19 in Brazil" from Digital Innovation One.

arima-models data-analysis data-science python time-series

Last synced: 09 May 2026

https://github.com/zulhaditya/web-scraping-python

A repository that stores various source code and web scraping methods using Python.

data-analysis python3 webscraping

Last synced: 12 Oct 2025

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors affecting sleep quality and the temporal patterns in the sleep data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/gmalbert/supreme-court

Data Analysis of the US Supreme Court from 1790 to present

data-analysis data-science supreme-court

Last synced: 31 May 2026

https://github.com/gmalbert/rugby

Rugby Data Analysis and Sports Betting

data-analysis rugby sports-betting

Last synced: 31 May 2026

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/inddrsingh/restaurant_orders_mysql

Complex SQL queries on restaurant data for better and precise insights

data-analysis insights mysql

Last synced: 28 Jan 2026

https://github.com/samkazan/business-analysis-tableau

Business Analysis on Global/Superstore data using Tableau.

analysis data-analysis tableau visualization

Last synced: 08 Feb 2026

https://github.com/anushkundu/london-housing-market-analysis

London Housing Market Analysis: An Insightful Power BI Dashboard

data-analysis data-visualization powerbi transformation

Last synced: 27 Jan 2026

https://github.com/saisurajmatta/healthcare-data-analytics

Power BI project analyzing Emergency Department data, demonstrating skills in data transformation, DAX, and visualization. It focuses on patient flow, wait times, demographics, and satisfaction, providing actionable insights for healthcare improvement. Includes documentation, data dictionary, and code samples.

data-analysis data-modeling data-visualization dax power-bi powerbi-visuals powerquery

Last synced: 22 Jan 2026

https://github.com/chahelgupta/dep-videogames-dataset

The data extraction and processing involved thorough exploration, preprocessing, and visualization of the "Video Game Sales with Ratings" dataset.

data-analysis data-exploration data-extraction data-preparation data-preprocessing data-processing data-science data-visualization

Last synced: 15 Oct 2025

https://github.com/rohanrony19/movie-recommendation-system

A Python project that uses the Pandas library to compute correlations and recommend movies.

data-analysis deep-learning knn-algorithm numpy pandas python recommendation-system

Last synced: 14 Apr 2026

https://github.com/sanjayankur31/20181206-neurofedora

Slides for my NeuroFedora seminar at the UH Biocomputation group's weekly seminar

computational-neuroscience data-analysis neurofedora neuroimaging neuroscience open-science

Last synced: 19 Feb 2026

https://github.com/aishwaryahastak/ipl_analysis

Analysis of IPL dataset using PySpark

data-analysis mllib pyspark

Last synced: 16 Oct 2025

https://github.com/mattdelaune/excel_sales_dashboard

Interactive Excel Dashboard for Coffee Sales Analysis: This project leverages Excel to analyze sales data, uncover seasonal trends, regional preferences, and customer behaviors, providing actionable insights for optimizing inventory and marketing strategies.

data-analysis excel pivot-tables sales-dashboard sales-data

Last synced: 27 Jan 2026

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

The project demonstrates the use of machine learning models based on decision trees and random forests to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results. Intended for learning the basics of machine learning and data analysis.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings empower businesses to optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 20 Feb 2026

https://github.com/casassg/ms_thesis

Social Media Analysis for Crisis Informatics in the Cloud

casassg-thesis data-analysis google-cloud kubernetes

Last synced: 19 Oct 2025

https://github.com/yulia-momotyuk/dla-data-analysis-practice

This repository contains my homework assignments completed during the "Data Analyst in IT" course at Data Loves Academy.

analytics data-analysis data-visualization excel mysql numpy pandas postgres powerbi python seaborn sql tableau

Last synced: 14 Apr 2026

https://github.com/Kaushik-Puttaswamy/Airline-Passenger-Referral-Prediction-Using-Machine-Learning

This project uses a machine learning model to predict if passengers referred by existing customers will book a flight, helping airlines target likely customers. Key factors like service ratings and value for money drive predictions, achieving over 90% accuracy.

airline-marketing customer-referral-prediction customer-satisfaction data-analysis feature-engineering hyperparameter-tuning machine-learning model-evaluation predictive-analytics

Last synced: 20 Oct 2025

https://github.com/mothraa/etl-marketanalysis-webscraping-poo

OC project 2 refactoring (OOP version not yet completed)

data-analysis etl poo python web-scraping

Last synced: 20 Oct 2025

https://github.com/saisurajmatta/nashville-housing-data-cleaning-project

Clean and standardize Nashville Housing dataset using SQL queries for improved data quality and structure.

azure-data-studio data-analysis mssql mysql sql sql-data-cleaning sql-queries sql-server-management-studio

Last synced: 23 Jan 2026

https://github.com/scbirlab/hts-tools

🏮 Parsing and analysing platereader absorbance and fluorescence data.

assay-analysis data-analysis fluorescence high-throughput high-throughput-screening platereader

Last synced: 23 Jan 2026

https://github.com/dcs-training/spatial_dynamics

Use of QGIS and R to analyse first- and second-order geospatial effects. See the README file.

data-analysis geographical-data gis qgis r statistics

Last synced: 23 Oct 2025

https://github.com/changyeop-yang/study-datasciencefoundation

Big data science and analytics play a major role in this decade. Cleaning and preparing data for analysis is still a challenge, as are performing basic visualization, modeling and curve-fitting your data, and finally presenting your findings to wow the audience.

data-analysis ios kyungpook-national-university swift

Last synced: 23 Oct 2025

https://github.com/browndwarf/contracosta

Wavelength dependent starspot contrast with Kepler/K2 and TESS

astronomy data-analysis

Last synced: 23 Jan 2026

https://github.com/sugumarsrinivasan/sql-datawarehouse-project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and data analytics.

data-analysis data-analytics data-engineering data-lake data-science data-warehouse datawarehousing etl etl-pipeline medallion-architecture sql sql-query sql-server

Last synced: 19 Jun 2026

https://github.com/brianlesko/r_data_science_stat5730

Written by Brian Lesko, the repository contains R scripts demonstrating data science topics largely originating from study at Ohio State. Contents are written in RStudio using R Markdown files. As of 1/21/23, future projects concerning data science, statistics, and machine learning will be in Python in my machine learning repository.

data data-analysis flight-data ggplot2 olympics-data r-markdown tidyverse

Last synced: 23 Jan 2026

https://github.com/sehgal-vishal/sql-nyc-collision-analysis

This analysis is based on collisions (accidents) that happened in New York City. I used SQL Server for EDA (Exploratory Data Analysis).

data-analysis database eda sql-server

Last synced: 06 Feb 2026

https://github.com/ljadhav25/linear_regression_data_science

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

data-analysis data-science linear-regression machine-learning

Last synced: 26 Oct 2025

https://github.com/madhursinghbhadoriya/data_analysis_fifa-players

Processed important information and characteristic traits using NumPy, Matplotlib, Pandas, etc. in a Jupyter Notebook.

analysis data-analysis data-science graphs jupyter-notebook pandas python

Last synced: 07 May 2026

https://github.com/srinibas-masanta/infosys-springboard-internship

An interactive Power BI dashboard developed during my Infosys Springboard Internship to visualize Indian election trends. It integrates historical and live API data to analyze vote shares, turnout patterns, and demographic insights across constituencies, helping news agencies report results in real time.

dashboard data-analysis data-cleaning data-collection data-visualization dax-functions powerbi

Last synced: 25 Feb 2026

https://github.com/aakk23/professional-survey-powerbi

This Power BI dashboard analyzes survey data from data professionals, highlighting salary trends, job roles, and career satisfaction. It provides insights into work-life balance, programming language preferences, and industry demographics.

data-analysis data-visualization dax excel powerbi powerquery

Last synced: 23 Feb 2026

https://github.com/vishalsiingh/deloitte-virtual-internship

Submission for the STEM Virtual Program by Deloitte via Forage.

coding cyber-security data-analysis deloitte development forage forensics

Last synced: 23 Jan 2026

https://github.com/limatix/limatix

Limatix datacollect and processtrak tools

data-analysis python scientific-workflows

Last synced: 23 Jan 2026

https://github.com/code-jl/nfl-kicker-predictor

A sophisticated Python application that provides real-time NFL kicker statistics and performance analysis with an intuitive graphical interface.

beautifulsoup data-analysis data-visualization espn football gui nfl prediction python real-time-analytics real-time-data sport-analytics sports-data statistics tkinter web-scraping

Last synced: 01 Jun 2026

https://github.com/gaurabkundu1/road-accident-data-analysis

This is an Excel project on Road Accident Data Analysis in the form of an interactive Dashboard.

dashboard data-analysis data-vizualisation excel road-accidents

Last synced: 24 Jan 2026

https://github.com/garcane/unicorn-companies-analysis

Tracking unicorn startups (valued at $1B+) provides valuable insights for investors and analysts to identify high-growth industries and emerging trends.

data-analysis exploratory-data-analysis financial-analysis investor postgresql sql

Last synced: 24 Jan 2026

https://github.com/diegopino/publibdata_codexhackathon

Public Library Data processing/analysis codex hackathon attempt

data-analysis data-visualization libraries public

Last synced: 24 Jan 2026

https://github.com/valentinoli/swiss-foodprint

Project in Applied Data Analysis, EPFL 2019

carbon-emissions data-analysis diet foodprint swiss switzerland

Last synced: 24 Jan 2026

https://github.com/rahulchouhan1/sql-data-warehouse-project

Building a modern data warehouse with SQL Server, including ETL Processes, data modeling, and analytics.

data-analysis data-cleaning data-engineering data-science data-warehouse datascience etl etl-pipeline sql sql-query sql-server

Last synced: 24 Jan 2026

https://github.com/leftcoastnerdgirl/excel_crowdfunding_analysis

This project demonstrates the use of MS Excel for data cleansing & formatting to prepare for data analysis and visualization.

bar-charts conditional-formatting data-analysis data-analytics data-analytics-excel data-preparation data-preprocessing data-visualization excel line-graph

Last synced: 06 Feb 2026

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 16 Mar 2026

https://github.com/annnieglez/fraud-detection-eda

Fraud Detection - Exploratory Data Analysis (EDA). Analyzing financial transactions to detect fraud patterns using Python and Tableau. Libraries: Pandas, Seaborn and Matplotlib. Key Focus: Data cleaning, fraud trends, high-risk transactions, time-based patterns

data-analysis data-science data-visualization eda fraud-detection fraud-prevention matplotlib seaborn

Last synced: 28 Jan 2026

https://github.com/srimantapal205/dataengineerwireframedesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

data-analysis data-engineering dataflow dataflow-programming datapipeline dataprocessing development visualization

Last synced: 29 Jan 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/wareflowx/excel-toolkit

A powerful command-line toolkit for Excel and CSV data manipulation, analysis, and transformation.

data-analysis data-wrangling excel pandas python uv

Last synced: 29 Jan 2026

https://github.com/smahala02/magnetism-lab

This repository contains Python scripts and data for analyzing inductance in toroidal coils to calculate the magnetic permeability of ferrite materials. The project helps classify materials as soft or hard magnets based on experimental data.

data-analysis inductance jupyter-notebook magnetism python toroids

Last synced: 29 Jan 2026

https://github.com/shrutiijoshi/marketing-campaign-report

The dataset includes information on campaign types, recipient segments, interactions (clicks, opens, bounces, etc.), and conversion metrics.

dashboard data-analysis data-visualization tableau-public

Last synced: 25 Feb 2026

https://github.com/joannescode/regex_with_py

Learning by practicing with Regex (Python)

data-analysis python3 regex

Last synced: 30 Jan 2026

https://github.com/surajwate/datalab

DataLab is a versatile toolkit designed to simplify data exploration, analysis, and visualization for data scientists.

data-analysis data-science python visualization

Last synced: 30 Jan 2026

https://github.com/ljadhav25/decision-tree-random-forest-algorithm-data-science-

This repository contains an implementation of decision tree and random forest algorithms from scratch in Python. Decision trees and random forests are popular machine learning algorithms used for classification and regression tasks. The goal of this project is to provide a clear and understandable implementation of these algorithms

data-analysis data-science decision-trees machine-learning-algorithms matplotlib numpy pandas python random-forest-classifier

Last synced: 15 Apr 2026

https://github.com/manishabarse/hr_data_analysis

Used Microsoft SQL Server Management Studio and Power BI

data-analysis powerbi sql ssms

Last synced: 30 Jan 2026