An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/soyuid/bakery-data-analyst

This Bakery Data Analysis project was created to help bakery owners understand their sales patterns. Through in-depth data analysis, it aims to provide useful insights for improving sales and operational strategies.

bakery data-analysis python sales visualization

Last synced: 24 Mar 2025

https://github.com/wardenkenny/data-analyst-portfolio

A repository I have created to show and explore data analytics.

data-analysis excel r spreadsheets sql tableau

Last synced: 02 Apr 2025

https://github.com/gaboelc/analysis-of-the-employment-situation-in-costa-rica-2018-2022

An analysis of data from the INEC that identifies the changes in the Costa Rican labor market before, during, and after the COVID-19 pandemic.

costa-rica data-analysis empleo employment

Last synced: 24 Mar 2025

https://github.com/parthds02/pizza_sales_sql

SQL project analyzing pizza sales data. Includes creating tables, executing queries, and solving basic to advanced analytical questions to derive insights from sales data.

analytics data-analysis data-science pizza-sales sql sql-query

Last synced: 04 Mar 2026

https://github.com/ashwin331133/hospital_allpatients_waitinglist_data

This Power BI project analyzes patient waiting lists across various medical specialties and case types (Day Case, Inpatient, Outpatient). The goal is to gain insights to improve healthcare management and resource allocation.

data-analysis data-visualization powerbi

Last synced: 03 Jan 2026

https://github.com/shellynagar27/transportation-and-logistics-challenge

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

cleaning-data critical-thinking data-analysis data-visualization exploratory-data-analysis feature-engineering powerbi preprocessing-data problem-solving python

Last synced: 16 May 2026

https://github.com/filip-kustura/python-covid-19-behaviors-analysis

This university project, built in Jupyter Notebook, analyzes attitudes and behaviors related to the COVID-19 pandemic using a two-year survey from Imperial College London and the research company YouGov. Using Pandas, NumPy and Matplotlib, the analysis focuses on three countries, exploring trends and insights throughout the pandemic.

covid-19 data-analysis data-visualization jupyter-notebook matplotlib numpy pandas python university-project

Last synced: 12 Apr 2026

https://github.com/quocduyenanhnguyen/roi-modeling-and-analysis-of-sports-dataset

This project contains my ROI model for retirement savings and a PowerPoint presentation of it, as well as my data analysis and visualization of a Sports Ticket Sales dataset, concluded with a group-written PDF report.

data-analysis data-visualization microsoft-excel rate-of-return-modeling sports-ticket-sales-dataset

Last synced: 08 Feb 2026

https://github.com/ernanej/data-science-dca0131

Files developed throughout the 2024.1 semester of the Data Science course taught at the Federal University of Rio Grande do Norte by the Department of Computer Engineering and Automation (DCA). 📚

big-data data-analysis data-science ia

Last synced: 30 Mar 2025

https://github.com/hemangsharma/breast-cancer-patient-dashboard

This interactive Streamlit dashboard visualizes insights from the SEER Breast Cancer Dataset (2006-2010)

data-analysis streamlit streamlit-dashboard streamlit-webapp

Last synced: 05 May 2026

https://github.com/skivhisink/econometricnsu

A one-semester master's-level course in econometrics for first-year master's students at the Faculty of Economics of NSU.

data-analysis econometrics economics education nsu r

Last synced: 09 Apr 2025

https://github.com/borjamome/soho_cholera

Cholera deaths in the Soho District (London)

data-analysis data-visualization london r

Last synced: 04 Sep 2025

https://github.com/kernix13/github-readme-seo-analysis

A Jupyter Notebook analysis of GitHub README and repo SEO to determine what makes a repo rank in the SERPs.

accessibility data-analysis readme seo seo-analysis

Last synced: 29 May 2026

https://github.com/ehsan-behzadi/online-retail-data-analysis-and-preprocessing

This project analyzes and preprocesses the Online Retail dataset to uncover insights into customer purchasing behaviors, sales trends, and product performance. It includes data cleaning, exploration, and visualization, with the goal of enhancing understanding of online retail dynamics.

cohort-analysis data-analysis data-cleaning data-exploration duplicate-detection exploratory-data-analysis-eda feature-encoding feature-engineering handling-missing-values online-retail outlier-detection preprocessing trends-visualization visualization z-score-method

Last synced: 16 Apr 2026

https://github.com/noorulhudaajmal/customer-segmentation-analysis

Customer segmentation and analysis of purchasing behaviour

cluster-analysis customer-segmentation data-analysis

Last synced: 07 Oct 2025

https://github.com/chinmayee4/vrinda_store_data_analysis

Analyzed data by creating an interactive dashboard in MS Excel.

data-analysis data-cleaning data-visualization excel-dashboard pivot-tables power-query

Last synced: 07 Jan 2026

https://github.com/muthukumar0908/imdb_movie_analysis_with_powerbi

The project aims to analyze a dataset of IMDb movies using Power BI.

data-analysis data-visualization powerbi

Last synced: 12 Jun 2025

https://github.com/bkataru/physics-e.e

Project repository for IB physics extended essay. Topic: Predictive data modeling of a variable binary star’s brightness over a period of time using astrostatistics.

astrometry astronomical-algorithms astronomical-images astronomy astrophotography astrostatistics data-analysis data-science data-visualization modeling physics polynomial-regression regression-analysis

Last synced: 09 Apr 2025

https://github.com/ankitmishralive/machinelearning

Continuously deepening my understanding of and expertise in Machine Learning through ongoing education and hands-on practical experience.

artificial-intelligence data-analysis data-cleaning data-gathering machine-learning machinel-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 22 Mar 2025

https://github.com/faysalalmahmud/bd-med-professional-analysis

Analysis of healthcare professionals in Bangladesh through web scraping, data processing, and interactive visualization.

data-analysis data-visualization jupyter-notebook python scraper selenium selenium-webdriver tableau

Last synced: 04 Sep 2025

https://github.com/prakshal0809/power-bi-analytics-dashboard

I have developed a dashboard in Power BI utilizing data from an Excel file. The dashboard effectively visualizes and analyzes the given data.

data-analysis powerbi

Last synced: 22 Feb 2026

https://github.com/anudeepkaddala/bankds

This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks from two datasets based on their names and associate each bank with its respective asset size. The final output is a cleaned dataset with asset sizes in Indian-style currency format.

data-analysis data-science fuzzy-matching pandas python

Last synced: 12 Apr 2026

https://github.com/siddhant2105s/airman-database-system

This repository contains the design and implementation of the AirMan System for managing airport operations at London Biggin Hill Airport. It includes an ERD diagram, MySQL scripts for database creation, data insertion, and queries, as well as detailed data definitions and system requirements documentation.

data-analysis database-design database-normalization entity-relationship-diagram entity-relationship-models mysql relational-databases sql-queries

Last synced: 25 Mar 2025

https://github.com/avratanubiswas/fluorpenplugin

A MATLAB user interface for analysing OJIP curve datasets from the FluorPen instrument, serving as an additional plug-in for "quick categorical analysis".

data-analysis fluorpen ojip-curve

Last synced: 18 Mar 2026

https://github.com/hazim-hf/data-science

This course covers basic data science principles, Python programming, and the concept of big data and its types. It explores algorithms, methods, and analyses in data science with practical Python examples. Additionally, it highlights current data technologies for storing and archiving.

data-analysis data-wrangling time-series

Last synced: 04 Jul 2025

https://github.com/nullthefirst/py-notebooks

Jupyter Notebooks holding Data Science projects

data-analysis data-science data-visualization datasets jupyter-notebooks python

Last synced: 26 Apr 2026

https://github.com/nurulashraf/polynomial-regression-manufacturing

A Python project implementing polynomial regression to analyse and predict manufacturing-related data. Features include data preprocessing, model training, and visualisation of results. Ideal for exploring machine learning applications in manufacturing process optimisation.

data-analysis data-visualization machine-learning manufacturing polynomial-regression predictive-modeling process-optimization python regression-models scikit-learn

Last synced: 16 Apr 2026

https://github.com/parthds02/e-commerce-data-analysis-with-python

This project focuses on analyzing an e-commerce dataset using Python. The goal is to derive meaningful insights through exploratory data analysis (EDA) and uncover trends and patterns that can drive business decisions.

data-analysis ecommerce exploratory-data-analysis jupyter-notebook pytho sales-analysis visualization

Last synced: 13 Jun 2025

https://github.com/zen204/renewable-energy-usage-v-electricity-access

Interactive data visualization project created for COSI 116A: Introduction to Information Visualization at Brandeis University (Fall 2024). The project showcases data-driven insights using advanced visualization techniques and user interactivity. Hosted on GitHub Pages.

d3js data-analysis data-visualization electricity github-pages html-css-javascript information-visualization interactive python renewable-energy tableau web-development

Last synced: 08 Feb 2026

https://github.com/sco1/xbmini-py

Python Toolkit for the GCDC HAM

data-analysis data-visualization python python3

Last synced: 07 May 2025

https://github.com/namratha2301/python-dashboard-streamlit

Experimenting with Streamlit. Streamlit app provides an interactive visualization of the best-selling books, showcasing trends, top-selling books, top authors, genre distributions, and sales by decade.

css dashboard data-analysis pandas plotly python seaborn streamlit

Last synced: 05 May 2026

https://github.com/samruddhi3012/tata-data-visualization

Hi! This repo contains the dashboard I created using Tableau for TATA Data Visualization Training!

data-analysis data-visualization tableau tata

Last synced: 07 Jan 2026

https://github.com/pinedah/sleep-data-analysis-exercise

Analysis of a medical dataset on sleep, exploring duration, quality, and related factors. Includes data cleaning, EDA, and visualizations with Python (pandas, numpy, matplotlib, seaborn, scipy).

data-analysis data-science escom numpy pandas python school-project scipy

Last synced: 13 Apr 2026

https://github.com/ascender1729/leetcode_scraper

Extract topic tags from LeetCode problems to streamline interview preparation.

beautifulsoup coding-interview data-analysis graphql leetcode python scraper web-scraping

Last synced: 20 Jun 2026

https://github.com/singhrdeep/croppilot

CropPilot is a lightweight, Python-based command-line tool designed to help small-scale farmers, gardeners, and students manage crop data, track profits, and explore sustainable practices. Built for usability and extensibility.

agriculture data-analysis farm-management open-source python

Last synced: 25 Apr 2025

https://github.com/kittonn/data-analysis-freecodecamp

freecodecamp - data analysis projects.

data-analysis freecodecamp

Last synced: 05 Apr 2025

https://github.com/hemangsharma/streamingcontentanalyzer

This Streamlit application provides an interactive dashboard for analyzing streaming content data. It allows users to explore movie and TV show ratings, distributions, temporal trends, and genre breakdowns through various visualizations and filters.

dashboard data-analysis data-science data-visualization python streamlit-dashboard streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/mchirico/go_slicestore

Pull Data from Slice Store

data-analysis go ibm

Last synced: 16 Mar 2025

https://github.com/marianamartiyns/inep-educationperfomance

Data collection, processing, exploratory analysis, and predictive modeling of school performance rates using datasets from INEP.

data-analysis data-cleaning data-science inep predictive-modeling pyhton web-scraping

Last synced: 16 Mar 2025

https://github.com/marina-gal/elderly-care-ranking

Data analysis and scoring model for elderly care homes, including data cleaning, transformation, 0–100 scoring, and ranking across multiple quality dimensions.

data-analysis excel ranking

Last synced: 30 May 2026

https://github.com/leandrocollares/nyc-film-permits

NYC film permits: an exploratory data analysis

data-analysis data-visualization pandas plotly

Last synced: 05 Jul 2025

https://github.com/hari7261/playwithdata-python

This is one of the repositories where I have collected many data science and machine learning questions along with their solutions. I hope you will find it more useful than some other platforms. Thank you, and happy exploring!

data-analysis data-science data-science-learning machienlearning matplotlib matplotlib-python ml numpy numpy-arrays numpy-library pandas pandas-dataframe pandas-library python python-script sklearn

Last synced: 13 Apr 2026

https://github.com/alanjamlu34/bike-dataset

This is the final project for the Dicoding class "Menjadi Data Analist" (Becoming a Data Analyst).

data-analysis streamlit-dashboard

Last synced: 19 Oct 2025

https://github.com/lopes51789/salaryanalysis

This salary dataset is a good candidate for descriptive analysis, and we can identify which demographics experience reduced or increased salaries. For example, we could explore the salary variations by gender, age, industry, and even years of prior work.

data-analysis json mysql python3 sql tableau

Last synced: 13 Apr 2026

https://github.com/mehedi-hassan81/mastercourse

Data analysis project analysing renewable energy production across 212 countries, visualizing trends with Tableau. Highlights China's dominance (2,894 TWh) and Paraguay's 100% renewable share.

data-analysis pandas python renewable-energy selenium tableau-dashboards tableau-public web-scraping

Last synced: 08 May 2026

https://github.com/rupashi03/fitbit-user-eda-case-study

Performed Exploratory Data Analysis (EDA) on Fitbit users' data to uncover trends in activity and health metrics.

business-analysis case-study consumer-insights data-analysis exploratory-data-analysis health-data r user-behavior-analytics

Last synced: 25 Mar 2025

https://github.com/ravi-prakash1907/covid-19-china

A data-science research work to understand the growth rate of the novel Coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/jooapa/bytebrother

Byte Brother is watching YOU

data data-analysis security

Last synced: 26 Jan 2026

https://github.com/marielachirinosr/pandas-weather-project

Pandas Weather Data. Explore straightforward Python scripts for weather information analysis.

data-analysis pandas python

Last synced: 29 Apr 2026

https://github.com/fer-aguirre/cookiecutter-data-analysis-lite

A cookiecutter template for data journalism projects that offers a simplified and beginner-friendly structure.

cookiecutter data-analysis data-journalism project-template python

Last synced: 14 Jun 2025

https://github.com/isaqueiros/newspapersales-predictions-linearregression_and_regularisation

This notebook studies newspaper sales at a local stand, with the aim of predicting sales performance from the available features. Four sklearn models are applied: Linear Regression, Lasso Regression, Ridge Regression and Elastic Net Regression.

data-analysis data-science linear-regression machine-learning python regularization-methods sklearn-library sklearn-linear-regression

Last synced: 02 May 2026

https://github.com/lucalullo/monitoring-healthcare-waiting-times-puglia

Monitoring and analysis of public healthcare waiting times in Puglia (Italy), 2024 — based on official open data

data-analysis healthcare italy jupyter-notebook kaggle open-data pandas public-data puglia time-series waiting-times

Last synced: 08 Jan 2026

https://github.com/tiagocavalcante/nesfit

NES 2024 Practical and Research Work - Group 2

data-analysis fitness

Last synced: 09 Jun 2026

https://github.com/giorgossideris/athens_weather_analysis

An analysis of Athens weather data.

data-analysis visualization

Last synced: 16 Mar 2025

https://github.com/karishmagupta05/udemy-course-analysis

This project analyzes Udemy courses using Exploratory Data Analysis (EDA) techniques to uncover insights about course trends, pricing, subscriber counts, and popularity. By leveraging Python, Pandas, and data visualization libraries, we extract meaningful information from the dataset.

data-analysis data-visualization eda jupiter-notebook pandas python

Last synced: 13 Apr 2026

https://github.com/neuralsignal/loris

Loris: Database and Analysis application for a Drosophila Lab (or any lab)

data-analysis data-structures database datajoint flask neuroscience

Last synced: 12 Mar 2026

https://github.com/deypadma2020/dataanalysis-mlalgo

Practice repository for data analysis, feature engineering, statistics, web scraping, and building ML model pipelines in Python.

data-analysis eda feature-engineering machine-learning-algorithms ml-pipeline statistics web-scraping

Last synced: 30 May 2026

https://github.com/ankitpoddar07/sqlpizzas-saleproject

🍕 Pizza Sales Analysis with SQL

data-analysis database excel mysql powerbi ppt python

Last synced: 09 May 2026

https://github.com/satyam4229/prediction-of-cement-compressive-strength

A regression-based model that predicts the compressive strength of cement for a variety of mixtures of its components.

data-analysis data-science data-visualization jupyter-notebook kaggle python

Last synced: 13 Apr 2026

https://github.com/purposeachiever6/discovering_hidden_pattern

Discovering Hidden Patterns in Sequential and Numerical Data

data-analysis r statistical-analysis

Last synced: 28 Feb 2025

https://github.com/auliannee/customer-analysis-with-tableau

This repository contains the data source and the tableau workbook.

data-analysis data-visualization tableau

Last synced: 12 Mar 2026

https://github.com/aalekhpatel07/statcan

StatCAN dataset fetcher and cleaner.

census data-analysis data-science statcan

Last synced: 02 Apr 2025

https://github.com/deliprofesor/k-means-clustering-for-retail-data-analysis

This project uses K-Means clustering to segment wholesale customers based on their spending habits. The data is preprocessed, scaled, and clustered into four groups. The Elbow and Silhouette methods determine the optimal number of clusters, and results are visualized using boxplots and scatter plots to uncover spending patterns.

clustering-visualisation data-analysis elbow-method k-means k-means-clustering r silhouette-score

Last synced: 10 Apr 2025

https://github.com/codesaadumair/pandas_exercises_personal

Personalized enhancements to pandas exercises with comprehensive solutions and practical insights for mastering data analysis in Python.

data-analysis data-science pandas python

Last synced: 09 May 2026

https://github.com/jianxi-erin/bigdata-machinelearning-lab

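This project is a comprehensive big data and machine learning lab platform comprising two main tasks, each covering three key technical modules: big data processing, data analysis, and machine learning. Designed around real competitions, it provides complete data-processing simulation and modeling practice.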

data-analysis data-visualization hadoop machine-learning python spark sql

Last synced: 03 May 2026

https://github.com/vishal786-commits/target-businesscasestudy-sql

This project analyzes Target’s e-commerce transactions in Brazil between 2016 and 2018 using SQL. The goal was to explore customer behavior, order patterns, payments, delivery times, and freight costs to generate actionable business insights.

bigquery data-analysis sql

Last synced: 05 Oct 2025

https://github.com/josepablodmg/python--linear-regression---housing-exercise

A predictive analysis exploring the relationship between household characteristics and median income in California. Using linear regression, the project investigates whether blocks with fewer households correspond to higher median incomes.

california data-analysis data-science exploratory-data-analysis housing-data linear-regression machine-learning python regression scikit-learn statistics visualization

Last synced: 05 Oct 2025

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/subhamghimire/dataanavis

Learning Data analysis and visualization

data-analysis data-science data-visualization dataset

Last synced: 06 Oct 2025

https://github.com/eharshit/end-to-end-vendor-insights

End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.

analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale

Last synced: 07 Oct 2025

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 07 Oct 2025

https://github.com/npodlozhniy/podlozhnyy-module

One place for the most useful methods for work

data-analysis data-science pypi

Last synced: 21 Jan 2026

https://github.com/dcs-training/machinelearning

Introduction to Machine Learning with Python, delivered by the centre in the year 2022-23. See the README file.

data-analysis data-wrangling machine-learning python statistics

Last synced: 08 Oct 2025

https://github.com/kmranrg/bikeshare

A project based on data analysis.

data-analysis python

Last synced: 08 Oct 2025

https://github.com/dcs-training/exploratory-data-analysis-and-visualisation-with-observable-plot

This two-hour workshop will teach you how to follow an exploratory data analysis pipeline with Observable Plot, a new JavaScript library based on the Grammar of Graphics that offers a simple yet expressive interface for creating powerful graphics that are easily shareable on the web. See the README file.

d3 data-analysis data-visualisation javascript observable-notebook

Last synced: 17 May 2026

https://github.com/debjyotisaha/power-bi-projects-phase-2

Created interactive dashboards and reports using Power BI to visualize complex datasets. Demonstrated proficiency in data modelling, DAX calculations, and storytelling through data to provide actionable insights.

dashboards data-analysis data-modeling data-visualisation power-query powerbi

Last synced: 18 Jan 2026

https://github.com/marianamartiyns/api-logisticregression

Data analysis, modeling, and deployment of a logistic regression model for churn prediction, integrating a FastAPI backend and a Streamlit frontend.

data-analysis data-science fastapi logistic-regression pyhton streamlit

Last synced: 29 Apr 2026

https://github.com/mirwais-farahi/data-visualization-with-tableau-specialization

This Specialization teaches Tableau for data visualization and business intelligence. The series covers skills like assessing data quality, designing visualizations and dashboards, and combining data sources to create compelling, data-driven stories.

dashboard data-analysis geospatial map tableau visualization

Last synced: 16 Feb 2026

https://github.com/adithya2369/safa_public

AI-powered customer feedback analyzer that uses generative AI to transform customer reviews into actionable business insights. Upload review data, get instant summaries, satisfaction scores, detailed reports, and improvement suggestions—all in an easy-to-deploy Docker container.

data-analysis data-visualization docker-containerization full-stack-development generative-ai langchain langchain-groq web-development

Last synced: 10 Oct 2025

https://github.com/ninadpatil09/hospital_emergency_room_analysis

This comprehensive analysis delves into the performance and characteristics of the hospital's emergency room over the past year. By scrutinizing key metrics and patient demographics, this study aims to provide valuable insights for optimizing patient care, resource allocation, and overall operational efficiency.

data-analysis tableau-public visualization

Last synced: 15 Feb 2026