An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/bryanfks-dev/klempoken-analysis

Analysis and forecasting model for Klempoken MSMEs

big-data-analytics data-analysis data-forecast data-visualization

Last synced: 01 Apr 2025

https://github.com/shellynagar27/transportation-and-logistics-challenge

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

cleaning-data critical-thinking data-analysis data-visualization exploratory-data-analysis feature-engineering powerbi preprocessing-data problem-solving python

Last synced: 16 May 2026

https://github.com/dhanyasri20/credit-risk-prediction

Credit Risk Prediction using Python, SQL, and Flask. Trained ML models (Random Forest) to identify high-risk loan applicants with 86% accuracy, automated SQL reporting, and deployed a Flask web app for real-time predictions.

classification credit-risk data-analysis financial-data flask loan-prediction machine-learning python random-forest sql

Last synced: 28 Apr 2026

https://github.com/stoll-jonathan/sorting_algorithm_analyzer

C++ program which analyses the performance of different sorting algorithms on a dataset of random numbers

bubble-sort data-analysis insertion-sort merge-sort sorting-algorithms

Last synced: 01 Apr 2025

https://github.com/filip-kustura/python-covid-19-behaviors-analysis

Using Jupyter Notebook, this university project analyzes attitudes and behaviors related to the COVID-19 pandemic using a two-year survey from Imperial College London and YouGov research company. Utilizing Pandas, NumPy and Matplotlib, the data analysis focuses on three countries, exploring trends and insights throughout the pandemic.

covid-19 data-analysis data-visualization jupyter-notebook matplotlib numpy pandas python university-project

Last synced: 12 Apr 2026

https://github.com/dulajkavinda/pandas-exploring-data-ml

🐼 Exploring data with pandas library.

data-analysis machine-learning pandas python

Last synced: 09 May 2026

https://github.com/quantumudit/groceries-basket-analysis

This project performs market basket analysis using Power BI and Python to reveal associations between grocery items. It involves transforming raw transaction data into a processed dataset, creating interactive Power BI reports, and generating key insights through Python, enabling data-driven decision-making.

data-analysis data-visualization pandas powerbi python

Last synced: 12 Apr 2026

https://github.com/bpkaur/exploring-the-evolution-of-linux

This project explores the evolution of the Linux kernel by identifying the top 10 contributors and visualizing commits over the years.

data-analysis data-science datacamp ipynb-jupyter-notebook python3

Last synced: 21 Feb 2026

https://github.com/vasulab/knightshock

Shock tube experiment planning and data analysis package.

cantera data-analysis matplotlib numpy shock-tube

Last synced: 18 Jul 2025

https://github.com/yanny-alt/competitor-sales-analysis-in-power-bi

This project aims to analyze competitor sales for a fictional manufacturing company, Sintec, using Power BI. The focus is on integrating, cleaning, and modeling data from multiple sources to generate insightful reports on company and competitor performance.

data-analysis powerbi sales-analysis

Last synced: 07 Jan 2026

https://github.com/kseniatyschuk/excel-data-matcher

Compare and match Excel files via a simple Python GUI

automation data-analysis etl excel gui pandas python3 tkinter

Last synced: 23 Apr 2025

https://github.com/chinmayee4/vrinda_store_data_analysis

Analyzed data by creating an interactive dashboard using MS Excel

data-analysis data-cleaning data-visualization excel-dashboard pivot-tables power-query

Last synced: 07 Jan 2026

https://github.com/jbalooshie/election_analysis

A Python script built to analyze a specific election's results that can be repurposed to analyze the results of other elections. The script provides different breakdowns of the vote by candidate and county.

data-analysis data-science elections python

Last synced: 09 Apr 2025

https://github.com/jcm-ai/quantium-data-analytics-virtual-experience-program

This repository contains my proposed solutions to the assignments I was required to complete as part of the Quantium Data Analytics Virtual Experience Program. 📊📈📉👨‍💻

commercial-thinking communication-skills data-analysis data-validation data-visualisation data-wrangling jupyter-notebook matplotlib-pyplot numpy-library pandas-python presentation-skills programming python3 scipy-stats seaborn statistical-testing

Last synced: 16 May 2026

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/saigeethika05/global-connect

International Student Engagement Platform

data-analysis figma prototyping ui-design ux-design wireframes

Last synced: 04 Jul 2025

https://github.com/victorlcastro-dsa/pbl-datacamp

This repository features projects from DataCamp's Project-Based Learning (PBL) courses, showcasing practical applications of data analysis, machine learning, and visualization. Explore real-world datasets and interactive results that highlight the skills gained through hands-on learning.

data-analysis data-science data-visualization datacamp-projects hypothesis-testing machine-learning project-based-learning

Last synced: 30 Jun 2026

https://github.com/rachit1084/sql-practice-ankit-bansal

Personal SQL problem-solving practice based on Ankit Bansal's YouTube series, with logic-driven solutions for analyst prep.

analytics data-analysis data-analyst interview-preparation logical-reasoning postgresql sql sql-practice

Last synced: 04 Jul 2025

https://github.com/fbarffmann/citibike-covid-analysis

Analyzed NYC CitiBike usage during March 2020 to assess the impact of COVID-19 using Python and Tableau. Includes ridership breakdowns, user type trends, and an interactive dashboard.

citibike covid19 data-analysis data-visualization exploratory-data-analysis pandas python tableau transportation

Last synced: 12 Apr 2026

https://github.com/abhash-rai/analyzing-credit-card-eligibility

This work was performed as part of a BCU undergraduate course.

data-analysis data-visualization ggplot ggplot2 latex r

Last synced: 20 Jan 2026

https://github.com/soajala/shopify-sales-analysis-powerbi

End-to-end Power BI dashboard project analyzing Shopify sales data with real-time metrics, DAX, and business insights.

business-intelligence data-analysis data-visualization dax interactive-dashboard powerbi sales-analysis shopify

Last synced: 05 Sep 2025

https://github.com/wtmcgrew/sql-credit-risk-analysis

Credit Risk Analysis using SQL & Excel – Approval trends by FICO, DTI, PTI, LTV, and delinquencies.

case-study credit-risk data-analysis financial-analysis loan-applications portfolio-project sql sqlite underwriting

Last synced: 04 Jul 2025

https://github.com/smoeding/jmeterplugin-datasketches

A JMeter listener using DataSketches to estimate response time quantiles and histograms

data-analysis jmeter jmeter-listeners jmeter-plugin

Last synced: 06 Mar 2025

https://github.com/doughtnerd/pod-old

Read and write Excel data

data data-analysis excel poi-library workbook

Last synced: 21 Jan 2026

https://github.com/nsandoya/python_scrp_project

This tool was built specifically for the Dipaso e-commerce website. You can extract data from the site, analyze it, and see the frequency of keywords, brands, and categories, along with price distributions and other market trends, all in a set of friendly statistical tables and graphics (exported from a Jupyter notebook) :)

beautifulsoup4 data data-analysis jupyter-notebook pandas python3

Last synced: 28 Apr 2026

https://github.com/sumit0ubey/internship

This repository showcases the tasks and projects I completed during various internships. It includes work across diverse domains such as: Data Analysis: Exploratory data analysis, data visualization, and insights generation using Python and libraries like Pandas, Matplotlib, and Seaborn. Backend Development: Designing and implementing RESTful API

backend-development data-analysis python-developer

Last synced: 05 Sep 2025

https://github.com/fbarffmann/nosql-challenge

Analyzed 28,000+ UK restaurant records using MongoDB and PyMongo. Queried hygiene scores, location data, and customer ratings.

data-analysis data-cleaning database-analysis json mongodb nosql pymongo python restaurant-data

Last synced: 13 Apr 2026

https://github.com/junpenglao/jaefa

Just Another Eye-movement Filtering Algorithm

data-analysis eye-movement-data eye-tracking

Last synced: 12 Jan 2026

https://github.com/fbarffmann/sqlalchemy-challenge

Built a Flask API with SQLAlchemy to analyze and visualize Hawaii climate data. Automated data extraction and developed database queries for temperature and precipitation insights.

api climate-data data-analysis data-visualization flask orm python sql sqlalchemy sqlite

Last synced: 13 Apr 2026

https://github.com/nullthefirst/py-notebooks

Jupyter Notebooks holding Data Science projects

data-analysis data-science data-visualization datasets jupyter-notebooks python

Last synced: 26 Apr 2026

https://github.com/aimin-nur/visualisasi_bikestore

Data Analyst - Dashboard Bike Store

data-analysis sql visualization

Last synced: 29 Jan 2026

https://github.com/hassanislam463/data-cleaning-and-modelling-top-5-categories-analysis-forage

This project involves cleaning, merging, and analyzing datasets to identify the top 5 performing categories based on aggregate popularity scores. It includes cleaned datasets, a final merged dataset, visualizations, and a presentation summarizing the tasks and results. Tools used: Microsoft Excel, Python, and PowerPoint.

data-analysis data-visualization microsoft-excel

Last synced: 07 Jan 2026

https://github.com/zen204/renewable-energy-usage-v-electricity-access

Interactive data visualization project created for COSI 116A: Introduction to Information Visualization at Brandeis University (Fall 2024). The project showcases data-driven insights using advanced visualization techniques and user interactivity. Hosted on GitHub Pages.

d3js data-analysis data-visualization electricity github-pages html-css-javascript information-visualization interactive python renewable-energy tableau web-development

Last synced: 08 Feb 2026

https://github.com/ljadhav25/data-engineering-poc

This repository contains a beginner-level Data Engineering Proof of Concept (POC) project designed for practice. The objective is to provide hands-on experience with data engineering concepts, including data extraction, transformation, loading (ETL), and basic data analysis. This project is ideal for those looking to build foundational skills in da

data-analysis etl matplotlib numpy pandas python

Last synced: 13 Apr 2026

https://github.com/DCS-training/IntroToStatistics

This repository contains all the materials used in the Introduction to Statistics course. See the README file.

data-analysis r rmarkdown statistics

Last synced: 25 Apr 2025

https://github.com/cezlul/analyse-ventes-immobilier

ML solution for Parisian real estate analysis: automatic classification of apartments vs. commercial premises (K-means, 91%) and price prediction (regression, R²=0.98) on 26K transactions. Values a €169M portfolio with data-driven strategic recommendations.

data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python sklearn

Last synced: 13 Apr 2026

https://github.com/evan-dg31/data-science

Exploratory Data Analysis (EDA), Predictive Modeling (Supervised and Unsupervised), Regression, Classification, Clustering

classification clustering data-analysis data-science data-visualization machine-learning matplotlib numpy pandas python regression-analysis seaborn

Last synced: 13 Apr 2026

https://github.com/ray-chew/pycsam

pyCSAM is a robust approach for approximating geodesic subgrid-scale orographic spectra with applications to weather forecasting and broader data analysis

data-analysis gmted icon-model merit-dem orographic spectral-analysis topography weather-forecast

Last synced: 28 Feb 2025

https://github.com/ascender1729/leetcode_scraper

Extract topic tags from LeetCode problems to streamline interview preparation.

beautifulsoup coding-interview data-analysis graphql leetcode python scraper web-scraping

Last synced: 20 Jun 2026

https://github.com/lijesh010/covid-19_global_analytics_power_bi_project

This repository is a data visualization project that offers an in-depth analysis of the Covid-19 pandemic using Microsoft Power BI. This interactive dashboard provides valuable insights into key metrics related to Covid-19 cases, deaths, recoveries, and more, helping users understand the global impact of the pandemic.

dashboard data-analysis data-visualization powerbi report

Last synced: 08 Jan 2026

https://github.com/gintuvedula/crime-data-analysis-with-mysql-and-python

This project aims to analyze crime data using MySQL for database management and Python for data analysis and visualization. The objective is to uncover crime trends, hotspots, and patterns to support law enforcement and urban planning efforts.

data-analysis data-exploration database mysql python

Last synced: 05 May 2026

https://github.com/anthonytlei/graphsql

Lightweight SQL-to-GraphQL connector for querying GraphQL endpoints using SQL syntax.

connector data-analysis dbapi graphql graphsql python sql sqlalchemy superset

Last synced: 09 Apr 2026

https://github.com/vedanty3/supermarket-sales-data-analysis

This project uses data visualization techniques (with pandas and matplotlib) to explore different aspects of three months of supermarket sales data.

data-analysis data-science jupyter-notebook matplotlib numpy pandas python

Last synced: 08 May 2026

https://github.com/luminati-io/Indeed-dataset-samples

A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.

api data-analysis datasets indeed jobs web-scraping

Last synced: 09 Apr 2025

https://github.com/luminati-io/Shopee-dataset-samples

A sample dataset of over 1000 Shopee products, extracted using the Bright Data API, ideal for pricing optimization, gap analysis, and market strategy refinement.

api data-analysis data-mining datasets products shopee web-scraping

Last synced: 09 Apr 2025

https://github.com/deliprofesor/virtual-reality-in-education-impact-analysis-and-insights

This project examines the impact of Virtual Reality (VR) on education, focusing on its effects on student engagement, learning outcomes, and creativity. It uses data analysis techniques like descriptive statistics, correlation analysis, and clustering to assess VR's effectiveness in enhancing learning.

clustering data data-analysis data-science data-visualization exploratory-data-analysis hypothesis-testing machine-learning python regression-analysis virtual-reality

Last synced: 14 Jun 2025

https://github.com/shellynagar27/business-insights-360-project

A comprehensive dashboard that provides a better understanding of the business's market standing, key focus areas for optimization, underperforming customers, and year-wise financial insights, aiding inventory planning and performance tracking. It can also be used to answer any number of "why" questions, depending on the situation.

dashboard data-analysis data-visualization dax-languague dax-studio excel performance-optimization power-bi reporting sql storage-manager

Last synced: 27 Jan 2026

https://github.com/rupashi03/fitbit-user-eda-case-study

Performed Exploratory Data Analysis (EDA) on Fitbit users' data to uncover trends in activity and health metrics.

business-analysis case-study consumer-insights data-analysis exploratory-data-analysis health-data r user-behavior-analytics

Last synced: 25 Mar 2025

https://github.com/ravi-prakash1907/covid-19-china

A data-science research work to understand the growth rate of the novel Coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/chiragkumargohil/co2-emissions-data-analysis

A Python programme that analyses CO2 emission data from 1997 to 2010. The programme prints data, provides a brief summary of a given year, displays and compares Year vs. Emission graphs for chosen countries, and generates a separate data file for chosen countries. It was a self-paced project that Guru 99 provided.

co2-emission data-analysis matplotlib python

Last synced: 28 Aug 2025

https://github.com/shibbir24/a-data-driven-approach-to-food-security-and-supermarket-accessibility

A Data-Driven Approach to Food Security and Supermarket Accessibility

data-analysis matplotlib numpy pandas python3 seaborn

Last synced: 13 Apr 2026

https://github.com/pawlo77/smarty

End-to-End Data Science tool

data-analysis data-processing pandas pipeline

Last synced: 08 May 2026

https://github.com/diligencefrozen/dcinside-data

Analyzing the Dcinside Frozen Gallery Dataset. #디시

data-analysis dataset

Last synced: 30 May 2026

https://github.com/fbarffmann/vba-challenge

Built an Excel VBA script to automate stock market analysis across multiple years. Programmatically calculated and visualized key financial metrics, reducing manual reporting time and improving data accuracy.

automation data-analysis excel excel-vba financial-analysis reporting stock-market vba

Last synced: 04 Feb 2026

https://github.com/sadia-khan13/data-preprocessing

Welcome to the Data Preprocessing Repository! This repository is dedicated to showcasing comprehensive resources and implementations related to data preprocessing using Python and Jupyter Notebook.

artificial-intelligence data-analysis data-mining data-preprocessing data-science jupyter-notebook matplotlib numpy pandas python seaborn-python sklearn

Last synced: 11 Apr 2026

https://github.com/marina-gal/sql-business-questions

A collection of SQL queries designed to strengthen analytical problem-solving skills using the AdventureWorks2019 sample database. Tested and optimized in SQL Server Management Studio (SSMS).

adventureworks data-analysis data-analyst interview-preparation learning microsoft-sql-server practice sql sql-queries

Last synced: 30 May 2026

https://github.com/wsu-carbon-lab/ezfit

Fitting in python made dead simple

data-analysis experimental-physics fitting pandas-accessor

Last synced: 14 Jun 2025

https://github.com/balajimohan18/milk-production-time-series-forecasting-datascience-project

This project uses time series forecasting to predict future milk production. The data used in this project is monthly milk production data from January 1962 to December 1975. An ARIMA (autoregressive integrated moving average) model is used to forecast milk production. The model is evaluated using various metrics.

acf adf data-analysis data-cleaning data-science data-visualization eda exploratory-data-analysis machine-learning pacf seasonality time-series trends

Last synced: 30 May 2026

https://github.com/srinibas-masanta/zomato-customer-and-restaurant-analysis

This repository contains a comprehensive analysis of Zomato's platform, focusing on various aspects of customer behavior, restaurant performance, and market trends. The analysis leverages data-driven insights to answer key questions that can guide business strategies, enhance customer satisfaction, and optimize operational efficiency.

business-analytics data-analysis data-science data-visualization

Last synced: 02 Apr 2025

https://github.com/beyzabasarir/brazilian-e-commerce-analysis

Brazilian E-Commerce Dataset By Olist PostgreSQL Analysis

data-analysis data-visualization sql

Last synced: 08 Jan 2026

https://github.com/spacebakery/nba-trends-project

Data Science Foundations I | Exploratory Data Analysis in Python | Summarizing Relationship Between Two Features

categorical-variables data-analysis data-visualization matplotlib nba-dataset quantitative-variables scipy seaborn subset summary-statistics

Last synced: 11 Mar 2025

https://github.com/ankitpoddar07/sqlpizzas-saleproject

🍕 Pizza Sales Analysis with SQL

data-analysis database excel mysql powerbi ppt python

Last synced: 09 May 2026

https://github.com/satyam4229/prediction-of-cement-compressive-strength

Prediction of cement compressive strength is based on a regression model. It predicts the compressive strength of a given cement across a variety of mixtures of its components.

data-analysis data-science data-visualization jupyter-notebook kaggle python

Last synced: 13 Apr 2026

https://github.com/pratik-khose/realtime-sales-simulation

Power BI: Realtime Sales Simulation using SQL Server and Direct Query

data-analysis data-analytics data-visualization dax-query powerbi sql sql-server sqlserver

Last synced: 10 Jun 2026

https://github.com/juanmerino89/data-job-market-analysis-project

Complete analysis of the job market using open data, scraping, and visualizations. Project explained step by step on my YouTube channel.

career-insights data-analysis data-science job-data job-market jupyter-notebook machine-learning market-trends open-data portfolio-project python salary-analysis visualization web-scraping youtube-project

Last synced: 18 May 2026

https://github.com/satyam4229/prediction-of-different-diseases

Predicts different diseases in real time from the symptoms that indicate them. The dataset contains 132+ different symptoms, on which the model is trained to give the best disease prediction.

data-analysis data-science data-visualization jupyter-notebook kaggle python

Last synced: 13 Apr 2026

https://github.com/parthshah02/customer_churn_dashboard

This repository features a comprehensive project showcasing data analysis and an interactive dashboard using Python.

data-analysis matplotlib numpy pandas python

Last synced: 13 Apr 2026

https://github.com/nimomach/cafe-sales

This analysis focuses on evaluating the sales performance of a cafe by examining key metrics such as total revenue, sales by product category, peak sales times, and many more.

cafe data-analysis data-visualization sales

Last synced: 12 Mar 2026

https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer

Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics such as the classification report showcase the effectiveness of the optimized model.

breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm

Last synced: 05 Feb 2026

https://github.com/codesaadumair/pandas_exercises_personal

Personalized enhancements to pandas exercises with comprehensive solutions and practical insights for mastering data analysis in Python.

data-analysis data-science pandas python

Last synced: 09 May 2026

https://github.com/mahmoud2abdallah/improvado-marketing-homework

This Looker Studio dashboard provides a comprehensive analysis of marketing performance for August 2024, transforming raw data into actionable insights for data-driven decision making.

bigquery business-intelligence data-analysis looker-studio marketing

Last synced: 05 Oct 2025

https://github.com/joyceannie/sql-data-with-danny-case-studies

Case study solutions for #8WeekSQLChallenge at https://8weeksqlchallenge.com

8-week-sql-challenge 8weeksqlchallenge case-study data-analysis data-analytics postgresql sql

Last synced: 05 Oct 2025

https://github.com/manishkaa/google_data_analytics_capstone_case_study

This case study is a part of Google Data Analytics Capstone Project

bigquery data-analysis sql tableau

Last synced: 05 Oct 2025

https://github.com/davifeliciano/modern_physics_experiments

Collection of data analysis and visualization scripts developed in Python around some modern physics experiments

data-analysis data-visualization modern-physics physics physics-experiments

Last synced: 18 Jan 2026

https://github.com/dacq-trap/dacq

DacQ Score Server

data-analysis go nextjs

Last synced: 14 Jan 2026