An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/johannaschmidle/amazon-cat-couch

Customer product reviews + ratings analysis and visualization [Python, Excel, Tableau, R]

data-analysis data-visualization jupyter-notebook python-notebook r-markdown sentiment-analysis text-analysis web-scraping

Last synced: 11 Jun 2026

https://github.com/shubhamgoyal575/tableau-visualization-dashboard

This repository features interactive Tableau dashboards for sales performance and healthcare analysis. It includes insights on revenue trends, regional sales, patient demographics, and hospital occupancy for data-driven decision-making. 🚀

dashborad data-analysis data-cleaning-and-preprocessing healthcare-analysis healthcare-dashboard sales-dashboard sales-data-analysis-project tableau tableau-dashboards tableau-public visualization visualization-tools

Last synced: 20 Feb 2026

https://github.com/applicativesystem/numpy-builder

Code getter and NumPy operator for NumPy operations

data-analysis numpy numpy-python shell-script

Last synced: 15 Aug 2025

https://github.com/emmarhoffmann/analysis-of-sleep-patterns-and-psychological-well-being-among-college-students

Explores the relationship between sleep patterns, psychological well-being, and lifestyle choices among college students using statistical analysis on 253 observations.

college-students data-analysis r statistical-models

Last synced: 04 Oct 2025

https://github.com/analysisbyvivek/Crime-data

Analyzes crime patterns across different areas, exploring factors such as crime type, weapon usage, demographic influences, and geographic distribution to uncover trends in frequency, correlations, and hotspots.

apache-superset data-analysis eda jupyter-notebook python

Last synced: 29 Jan 2026

https://github.com/borjamome/accidentes_madrid

Analysis of accidents in Madrid using SQL (2023)

accidentes-coche data-analysis madrid sql

Last synced: 17 Jan 2026

https://github.com/amoghkori/working-with-apache-spark-mllib

Implemented Apache Spark MLlib to analyze a large car dataset, predict car selling prices, and gain insights into the car market.

amazon-web-services data-analysis data-visualization exploratory-data-analysis linear-regression machine-learning model-selection pyspark python random-forest sagemaker spark

Last synced: 13 Apr 2026

https://github.com/nature40/casestudies

Case studies for testing the functionality of database systems, sensors, etc.

casestudies data-analysis data-visualization database

Last synced: 02 May 2026

https://github.com/ireneflorez/nypd-mvc

Analysis of NYPD Motor Vehicle Collisions

basemap data-analysis folium jupyter-notebook matplot pandas python

Last synced: 08 May 2026

https://github.com/baguilar6174/python-jupyter-notebooks

Explore data analysis projects with Python, Jupyter and more tools. Discover stunning visualizations and reveal meaningful information in datasets to make informed decisions.

data-analysis jupyter-notebook kaggle pandas python

Last synced: 09 Apr 2026

https://github.com/natgluons/fmcg-data-modeling

SQL, ARIMA, and K-Means clustering for data analysis and customer segmentation of sales data

arima-forecasting arima-model customer-segmentation data-analysis data-science-projects kmeans-clustering sales-forecasting

Last synced: 13 Aug 2025

https://github.com/jayavarshini-jayakumaran/nba-exploratory-data-analysis

A data analytics project that explores NBA game and player data using Python and Power BI. Features data preprocessing, EDA, feature engineering, and an interactive dashboard for visualizing team and player performance trends.

data-analysis data-visualization exploratory-data-analysis powerbi python3

Last synced: 20 Jun 2026

https://github.com/extwiii/datascience-jhu

Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization

Last synced: 05 Jul 2025

https://github.com/jameswrigley/laph

A node-based data analysis program.

cpp data-analysis nodes qml

Last synced: 05 Jun 2026

https://github.com/farhad-here/adventureworks_interactive_sales_dashboard_powerbi

An interactive Power BI dashboard for Adventure Works sales team to analyze performance, customers, products, and employees. Includes data cleaning, data modeling, DAX measures and advanced visualization features.

business-intelligence chart csv data-analysis data-cleaning data-cleaning-and-preprocessing data-visualization dax powerbi

Last synced: 13 Aug 2025

https://github.com/crazy-dot/instagram_user_analytics

Analysis of Popular Social Media Network - Instagram

data-analysis instagram-analytics project-repository trainity

Last synced: 07 Jan 2026

https://github.com/arun-data-analyst/finance-reporting-sql

End-to-end SQL project for project/portfolio finance: schema, seed data, validation, data-quality checks, business queries, and KPI views (Power BI–ready).

data-analysis data-modeling data-quality database finance kpi portfolio-management powerbi sql sql-server ssms

Last synced: 18 May 2026

https://github.com/ray-chew/pycsam

pyCSAM is a robust approach for approximating geodesic subgrid-scale orographic spectra with applications to weather forecasting and broader data analysis

data-analysis gmted icon-model merit-dem orographic spectral-analysis topography weather-forecast

Last synced: 28 Feb 2025

https://github.com/aaisha-nexus/sql_company_insights

A beginner-friendly SQL project for managing employee records, departments, and sales transactions. Includes table creation, optimized queries, stored procedures, and window functions to extract business insights.

business-analytics data data-analysis dataanalysis-projects dataanalytics database-schema mssql-database query relational-databases sql sql-query ssms

Last synced: 12 Aug 2025

https://github.com/busradeveci/student-performance-prediction

A machine learning project to predict student exam performance based on academic, social, and personal features. Built with Python and scikit-learn.

data-analysis kaggle linear-regression machine-learning predictive-modeling python scikit-learn student-performance

Last synced: 25 Apr 2025

https://github.com/r12habh/canada-imigration-data-analysis

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision from the United Nations' website. (Cognitive Class Data Analysis with Python)

canada data-analysis data-science data-visualization datascience python python3

Last synced: 23 May 2026

https://github.com/nmelgar/lego_my_data

Data visualization project to sell LEGO bulks.

csv data-analysis data-visualization data-viz google-sheets tableau

Last synced: 08 Jan 2026

https://github.com/gintuvedula/crime-data-analysis-with-mysql-and-python

This project aims to analyze crime data using MySQL for database management and Python for data analysis and visualization. The objective is to uncover crime trends, hotspots, and patterns to support law enforcement and urban planning efforts.

data-analysis data-exploration database mysql python

Last synced: 05 May 2026

https://github.com/mxagar/data_science_udacity

My personal notes, code and projects of the Udacity Data Science Nanodegree.

dashboard data-analysis data-engineering data-science machine-learning-pipelines

Last synced: 09 Apr 2025

https://github.com/hemangsharma/streamingcontentanalyzer

This Streamlit application provides an interactive dashboard for analyzing streaming content data. It allows users to explore movie and TV show ratings, distributions, temporal trends, and genre breakdowns through various visualizations and filters.

dashboard data-analysis data-science data-visualization python streamlit-dashboard streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/mishaa931/amazon-sales-dashboard-power-bi

This project features a dynamic Power BI dashboard built on dummy Amazon sales data. It visualizes key business metrics such as revenue trends, top-selling categories, discount impact, and geographic performance. The dashboard is designed to help stakeholders make data-driven decisions through clear, interactive visuals.

data-analysis data-quality data-visualization microsoftpowerbi

Last synced: 05 Feb 2026

https://github.com/anjasfedo/data-analysis

Repo to Explore Data Analysis

data-analysis numpy

Last synced: 13 Apr 2026

https://github.com/marianamartiyns/inep-educationperfomance

Data collection, processing, exploratory analysis, and predictive modeling of school performance rates using datasets from INEP.

data-analysis data-cleaning data-science inep predictive-modeling pyhton web-scraping

Last synced: 16 Mar 2025

https://github.com/mindlessmuse666/eda-pandas

An exploratory data analysis (EDA) project on Titanic passenger data using the Pandas library. It includes data loading, preprocessing, statistical analysis, visualization, and creating pivot tables. The goal of the project is to demonstrate the core EDA methods and tools for analyzing and understanding data.

data-analysis data-processing data-science data-visualization eda exploratory-data-analysis matplotlib pandas python titanic

Last synced: 18 Apr 2026

https://github.com/erayagdogan/simplecharts

Simple Charts is a chart-maker Compose app with Material 3 design. Charts are created using the lets-plot-compose library.

android android-app charts data-analysis data-visualization jetpack-compose lets-plot-kotlin material-3 viewmodel

Last synced: 11 Aug 2025

https://github.com/fisseha-estifanos/telecom

A showcase repository for a specific telecommunication company. Used to analyze several features of telecommunication data sets and generate useful insights from them. The generated insights can be seen at https://github.com/Fisseha-Estifanos/telecom-visualizer or at https://fisseha-estifanos-telecom-visualizer-home-huxgy0.streamlitapp.com/

data-analysis notebooks-jupyter python visual-studio-code visualization

Last synced: 12 May 2026

https://github.com/roland045/smart_fluid_sedimentation_tester

Control program for a custom-developed smart fluid sedimentation tester system

arduino data-analysis instrumentation measurement sensor

Last synced: 13 May 2026

https://github.com/gutow/langmuir_trough

Code to run homebuilt Langmuir Trough using Jupyter and Python. Link below for API docs:

data-acquisition data-analysis jupyter langmuir-trough plotting

Last synced: 11 Aug 2025

https://github.com/khushi-sabarad/8-week-sql-challenge

Solutions to the case studies of the #8WeekSQLChallenge by Danny Ma

8weeksqlchallenge case-study data-analysis mysql sql

Last synced: 06 Sep 2025

https://github.com/shellynagar27/good-cabs-data-analysis-project

This project is part of CodeBasics Challenge #13, where the goal was to provide actionable insights to the Chief of Operations at Goodcabs, a cab service provider in tier-2 cities of India. The project focused on analyzing key metrics like trip volume, repeat passenger rate, and passenger satisfaction.

critical-thinking data-analysis data-visualization excel exploratory-data-analysis power-bi presentation problem-solving sql storytelling

Last synced: 25 Jan 2026

https://github.com/tj2904/lfb-callout-analysis

An investigation into London Fire Brigade's callout data.

data-analysis decsion-tree kmeans lfb-incidents london-fire-brigade pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/shellynagar27/business-insights-360-project

A comprehensive dashboard that provides a better understanding of the business's market standing, key focus areas for optimization, underperforming customers, and year-wise financial insights, aiding inventory planning and performance tracking. It can also be used to answer any number of "why" questions as situations arise.

dashboard data-analysis data-visualization dax-languague dax-studio excel performance-optimization power-bi reporting sql storage-manager

Last synced: 27 Jan 2026

https://github.com/alanjamlu34/bike-dataset

This is the final project for the Dicoding class "Menjadi Data Analist" (Becoming a Data Analyst)

data-analysis streamlit-dashboard

Last synced: 19 Oct 2025

https://github.com/hemangsharma/hotel-revenue-booking-analysis

This project provides a comprehensive revenue and reservation analysis for Highfield Hotel using historical data exported from booking systems and internal revenue reports. The goal is to derive actionable insights to improve room profitability, understand booking patterns, and support data-driven decision-making.

analysis data-analysis data-visualization hotel

Last synced: 10 Aug 2025

https://github.com/alan-oliveir/state-of-data-2022

In this project I analyze the distribution of salary ranges for junior-level professionals in the roles of data analyst, data scientist, and data engineer.

data-analysis jupyter-notebook pandas-python seaborn-python

Last synced: 03 Oct 2025

https://github.com/emaleckova/emaleckova.github.io

My personal website created with Quarto

biology data-analysis data-viz quarto r

Last synced: 25 Mar 2025

https://github.com/ravi-prakash1907/covid-19-china

A data-science research work to understand the growth rate of the novel Coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/paul0vinicius/ad2

Repository for the course Análise de Dados 2 (Data Analysis II)

data-analysis data-science

Last synced: 08 Jan 2026

https://github.com/chiragkumargohil/co2-emissions-data-analysis

A Python programme that analyses CO2 emission data from 1997 to 2010. The programme prints the data, provides a brief summary of a given year, displays and compares Year vs. Emission graphs for chosen countries, and generates a separate data file for chosen countries. It was a self-paced project provided by Guru 99.

co2-emission data-analysis matplotlib python

Last synced: 28 Aug 2025

https://github.com/blackcub3s/msc-finalthesis

The most important programming files, code functions and data processing pipelines for the Machine learning final thesis of my Master's degree. Also, the LaTeX code of the thesis.

data-analysis latex machine-learning numpy python sklearn

Last synced: 09 Apr 2026

https://github.com/devanshsahu47/prime-content-analytics

Prime Data Explorer analyzes Amazon Prime's content and credits data to uncover trends in release years, genres, and ratings. It cleans, merges, and visualizes the data to provide actionable insights for optimizing content strategy and boosting audience engagement.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python3

Last synced: 13 May 2026

https://github.com/yash22222/data-analysis-on-real-time-social-media-comments

EngageInsight analyzes user interactions in comment data. It provides insights through visualizations created using Python libraries like Pandas and Matplotlib. The project aims to uncover patterns and trends in user engagement. The visualizations provide an overview of comment lengths and the frequency of different types of replies.

data-analysis data-cleaning-and-preprocessing data-visualization matplotlib pandas pattern-recognition real-time-social-media-data seaborn trend-analysis

Last synced: 14 May 2026

https://github.com/tatilimongi/first_python_project

This repository contains a case study of spreadsheet automation in Python for analyzing car sales by manufacturer over the years

data-analysis email-sending file-manipulation graphical-visualization spreadsheet-automation

Last synced: 26 Mar 2025

https://github.com/fbarffmann/vba-challenge

Built an Excel VBA script to automate stock market analysis across multiple years. Programmatically calculated and visualized key financial metrics, reducing manual reporting time and improving data accuracy.

automation data-analysis excel excel-vba financial-analysis reporting stock-market vba

Last synced: 04 Feb 2026

https://github.com/vatshayan/pokemon-analysis

Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning

artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn

Last synced: 30 May 2026

https://github.com/isaacmaffeis/imad-2023

Model Identification and Data Analysis (IMAD) | University course

data data-analysis data-science model model-identification

Last synced: 09 May 2026

https://github.com/marielachirinosr/pandas-weather-project

Pandas Weather Data. Explore straightforward Python scripts for weather information analysis.

data-analysis pandas python

Last synced: 29 Apr 2026

https://github.com/abhisek-13/whatsapp-chat-analyzer

The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.

data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026

https://github.com/busradeveci/odev2-branching

This project is prepared for Artificial Intelligence and Technology Academy Git GitHub Assignment 2. Using the “Wine Reviews” dataset from Kaggle, it converts wine ratings into star ratings and analyzes them.

data-analysis kaggle-dataset python wine-reviews-dataset

Last synced: 03 Oct 2025

https://github.com/isaqueiros/newspapersales-predictions-linearregression_and_regularisation

This notebook is a study of newspaper sales at a local stand, with the intention of predicting sales performance from the available features. For this, 4 sklearn models are applied: Linear Regression, Lasso Regression, Ridge Regression and Elastic Net Regression.

data-analysis data-science linear-regression machine-learning python regularization-methods sklearn-library sklearn-linear-regression

Last synced: 02 May 2026

https://github.com/lucalullo/monitoring-healthcare-waiting-times-puglia

Monitoring and analysis of public healthcare waiting times in Puglia (Italy), 2024 — based on official open data

data-analysis healthcare italy jupyter-notebook kaggle open-data pandas public-data puglia time-series waiting-times

Last synced: 08 Jan 2026

https://github.com/akunna1/energy-data-analysis-unc-campus

Link to Report: https://adminliveunc-my.sharepoint.com/:w:/r/personal/tadennis_ad_unc_edu/Documents/Capstone%20Group/Final%20Report%20Draft.docx?d=wba9e7182a9b948898133e4f89def1d90&csf=1&web=1&e=fQGAfy

arcgis-pro data-analysis dplyr excel geospatial-data-analysis ggplot ggplot2 lubricants tidyr tidyverse

Last synced: 08 Aug 2025

https://github.com/lucs1590/agidatatest

This is a repository with data analysis and data science tests.

data-analysis data-science python test

Last synced: 13 May 2026

https://github.com/devexpress-examples/winforms-pivot-change-summarydisplaytype-in-context-menu

The following example shows how to customize the field header's context menu in the PivotGridControl.PopupMenuShowing event handler. The event allows you to change the SummaryDisplayType value of the field.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms xtrapivotgrid-suite

Last synced: 06 May 2026

https://github.com/neuralsignal/loris

Loris: Database and Analysis application for a Drosophila Lab (or any lab)

data-analysis data-structures database datajoint flask neuroscience

Last synced: 12 Mar 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026
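
The workflow described in the linear-regression-insurance-premium entry above (fit a simple linear regression, then evaluate it with MSE and RMSE) can be sketched roughly as follows. This is a minimal illustration in plain Python and is not taken from that repository: the function names are hypothetical, and the age and premium values are made-up sample data.

```python
import math


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between observed and predicted y."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals) / len(residuals)


# Hypothetical sample data (NOT from the repository): age vs. premium.
ages = [20, 30, 40, 50, 60]
premiums = [200.0, 260.0, 310.0, 370.0, 420.0]

slope, intercept = fit_simple_linear_regression(ages, premiums)
mse = mean_squared_error(ages, premiums, slope, intercept)
rmse = math.sqrt(mse)  # RMSE is in the same units as the premium

print(f"premium = {slope:.2f} * age + {intercept:.2f}")
print(f"MSE = {mse:.2f}, RMSE = {rmse:.2f}")
```

RMSE is often reported alongside MSE because it is expressed in the same units as the target, which makes the typical prediction error easier to interpret. Note that this sketch evaluates on the training data only; a held-out test split would give a fairer estimate of prediction accuracy.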

https://github.com/gmasson/datadash

DataDash is a JavaScript and CSS library for creating interactive dashboards for visualizing dynamic data on web pages.

dashboard dashboard-application dashboards data-analysis data-science data-visualization javascript

Last synced: 08 Aug 2025

https://github.com/beyzabasarir/brazilian-e-commerce-analysis

PostgreSQL analysis of the Brazilian E-Commerce dataset by Olist

data-analysis data-visualization sql

Last synced: 08 Jan 2026

https://github.com/deypadma2020/sql_project

✏️ A collection of practical SQL case studies and solutions exploring real-world business scenarios: car showroom analysis, esports tournament, customer insights, finance analysis, pricing strategy, and marketing analytics.

business-intelligence case-study data-analysis database mysql queries sql

Last synced: 30 May 2026

https://github.com/deypadma2020/dataanalysis-mlalgo

Practice repository for data analysis, feature engineering, statistics, web scraping, and building ML model pipelines in Python.

data-analysis eda feature-engineering machine-learning-algorithms ml-pipeline statistics web-scraping

Last synced: 30 May 2026

https://github.com/namratagulati/tweets_analysis

This repository focuses on sentiment analysis of Twitter data using Python, Natural Language Processing (NLP), and the Natural Language Toolkit (NLTK). The goal is to extract valuable insights from social media discussions, such as word frequency, hashtag trends, and sentiment patterns.

analysis data-analysis natural-language-processing nlp-machine-learning nltk-corpus nltk-python sentiment-analysis twitter-sentiment-analysis

Last synced: 07 Aug 2025

https://github.com/codeslash21/wrangle-twitter-archive

Wrangling the WeRateDog Twitter archive. WeRateDog has 8M followers and rates dogs with funny comments and a unique rating system. A dog-breed classifier is also used to predict the breed of the dogs in the tweets.

data-analysis data-wrangling neural-networkt twitter-api twitter-archive

Last synced: 10 Apr 2025

https://github.com/codeslash21/wrangle_twitter_archive

Wrangling the WeRateDog Twitter archive. WeRateDog has 8M followers and rates dogs with funny comments and a unique rating system. A dog-breed classifier is also used to predict the breed of the dogs in the tweets.

data-analysis data-wrangling nanodegree-project neural-network twitter-api twitter-archive

Last synced: 10 Apr 2025

https://github.com/samruddhi3012/public-health-data-analysis

This repo analyzes healthcare data using advanced Microsoft Excel.

dashboard data-analysis data-visualization healthcare microsoft-excel pivot-chart pivot-tables vlookup

Last synced: 05 Feb 2026

https://github.com/ankitpoddar07/sqlpizzas-saleproject

🍕 Pizza Sales Analysis with SQL

data-analysis database excel mysql powerbi ppt python

Last synced: 09 May 2026

https://github.com/samwhaaa/da_portfolio

Showcasing some of my Data Analytics projects

data-analysis data-analytics data-visualization jupyter jupyter-notebook python

Last synced: 01 Mar 2025

https://github.com/robinmillford/cardiac-care-performance-dashboard

This project presents a comprehensive data analysis and interactive dashboard focused on Cardiac Surgery and Percutaneous Coronary Interventions (PCI) performance by hospital, spanning from 2008 onwards.

cardiac data-analysis data-visualization plotly-express streamlit-dashboard tableau tableau-public

Last synced: 07 Sep 2025

https://github.com/as16082023/motor-vehicle-thefts

Using SQL to analyze vehicle theft patterns across New Zealand, focusing on trends related to specific times and locations.

data-analysis mysql sql

Last synced: 10 Apr 2025

https://github.com/firetyrant/sql-portfolio-projects

Documenting my SQL learning journey with hands-on projects focused on data cleaning, analysis, and optimization.

bigquery data-analysis databases etl learning portfolio query-optimization sql

Last synced: 19 Apr 2026

https://github.com/rainbowatcher/simple

Make data work easier, saving your working time

bigdata data-analysis etl

Last synced: 10 Apr 2025

https://github.com/scailfin/rob-client

Command line user interface for the Reproducible Open Benchmarks for Data Analysis Platform (ROB)

benchmarks data-analysis reproducibility

Last synced: 14 Jan 2026

https://github.com/PanosChatzi/Healthcare_and_Bioinformatics_Analyses

This repo contains the final assignments of the Data Analyst bootcamp by Workearly. Python and SQL were used to complete the assignments.

data-analysis data-cleaning data-visualisation jupyter matplotlib pandas python seaborn

Last synced: 05 Aug 2025

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 09 Apr 2026

https://github.com/chen0040/spark-tabular-analytics

Spark statistical inference framework for performing column pair-wise data analytics on large data tables

anova chi-square-test confidence-intervals data-analysis hypothesis-testing spark statistical-inference tabular-data

Last synced: 07 Jul 2025

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer

Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics such as the classification report showcase the effectiveness of the optimized model.

breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm

Last synced: 05 Feb 2026