An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/madhursinghbhadoriya/data_analysis_fifa-players

• Processed important information and characteristic traits in a Jupyter Notebook using NumPy, Matplotlib, Pandas, etc.

analysis data-analysis data-science graphs jupyter-notebook pandas python

Last synced: 07 May 2026

https://github.com/blladerunner/customer-churn-dashboard

Customer Churn Dashboard — SQL + Python analytics project exploring customer retention patterns, churn rate by demographics and services, and key insights for telecom business strategy.

business-intelligence churn-analysis customer-retention dashboard data-analysis data-analytics data-science pandas powerbi python sql sqlite telecom

Last synced: 08 May 2026

https://github.com/devexpress-examples/wpf-pivot-grid-group-date-time-values

This example shows how to group date-time values in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 08 May 2026

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/saroshfarhan/dublin_pedestrian_data_analysis

Pedestrian footfall data analysis for the city of Dublin

data-analysis data-visualization r-programming

Last synced: 07 Jan 2026

https://github.com/campagnucci/exercitando_pandas

Practical pandas exercises using open education data from São Paulo

data-analysis data-science education-data exercises pandas-tutorial

Last synced: 28 Jan 2026

https://github.com/limatix/limatix

Limatix datacollect and processtrak tools

data-analysis python scientific-workflows

Last synced: 23 Jan 2026

https://github.com/wisdom-osborn/data-analytics-course-online-

🔍 Data Analytics with Python: Hands-on Course Materials. Jupyter notebooks, projects, and datasets based on the freeCodeCamp Data Analysis with Python certification. Learn NumPy, Pandas, data cleaning, and visualization through real-world examples.

data data-analysis data-science data-visualization freecodecamp numpy pandas pandas-dataframe project python

Last synced: 19 Apr 2026

https://github.com/riju18/data-analysis-and-visualizaton

Complex data analysis for data science: clustering, data preparation, complex calculations, joins, cross-over, and more.

data-analysis data-mining data-science data-visualization powerbi tableau

Last synced: 04 Jan 2026

https://github.com/0290192029/apartment-price-predictor

Python project for predicting apartment rental prices using linear regression. Practical assignment on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn

Last synced: 08 May 2026

https://github.com/code-jl/nfl-kicker-predictor

A sophisticated Python application that provides real-time NFL kicker statistics and performance analysis with an intuitive graphical interface.

beautifulsoup data-analysis data-visualization espn football gui nfl prediction python real-time-analytics real-time-data sport-analytics sports-data statistics tkinter web-scraping

Last synced: 01 Jun 2026

https://github.com/sumit-sinha9/sales-analysis

Analyzing 12 months' worth of sales data

data-analysis pandas python visualization

Last synced: 08 May 2026

https://github.com/alunera-data/alunera-data

Hi, I’m Yvonne – building data solutions at the intersection of BI, SQL & Service Management

business-intelligence data-analysis data-engineering data-science github-profile portfolio rstats sql

Last synced: 28 Jan 2026

https://github.com/garcane/unicorn-companies-analysis

Tracking unicorn startups (valued at $1B+) provides valuable insights for investors and analysts to identify high-growth industries and emerging trends.

data-analysis exploratory-data-analysis financial-analysis investor postgresql sql

Last synced: 24 Jan 2026

https://github.com/diegopino/publibdata_codexhackathon

Public Library Data processing/analysis codex hackathon attempt

data-analysis data-visualization libraries public

Last synced: 24 Jan 2026

https://github.com/valentinoli/swiss-foodprint

Project in Applied Data Analysis, EPFL 2019

carbon-emissions data-analysis diet foodprint swiss switzerland

Last synced: 24 Jan 2026

https://github.com/aneeshmurali-n/global-superstore-sales-dashboard---power-bi-stunning-dark-theme

This Power BI dashboard provides a comprehensive view of sales data, enabling users to analyze sales trends, identify top-performing regions, and gain insights into customer behavior.

dark-theme dashboard data-analysis data-science data-visualization powerbi salesdashboard

Last synced: 28 Jan 2026

https://github.com/noorulhudaajmal/customer-segmentation-analysis

Customer segmentation and analysis of purchasing behaviour

cluster-analysis customer-segmentation data-analysis

Last synced: 07 Oct 2025

https://github.com/rahulchouhan1/sql-data-warehouse-project

Building a modern data warehouse with SQL Server, including ETL Processes, data modeling, and analytics.

data-analysis data-cleaning data-engineering data-science data-warehouse datascience etl etl-pipeline sql sql-query sql-server

Last synced: 24 Jan 2026

https://github.com/hdgiacon/power_bi_projects

Repository containing courses, dashboards, and projects related to data analysis and Power BI.

data-analysis data-engineering data-visualization microsoft-power-bi

Last synced: 24 Jan 2026

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 16 Mar 2026

https://github.com/hemangsharma/dataanalysis

This repo contains analyses of NASDAQ data, including a dashboard and a time series forecast

analysis data data-analysis data-visualization python

Last synced: 10 Mar 2026

https://github.com/nilayhangarge/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

data-acquisition data-analysis data-analytics data-binning data-cleaning data-engineering data-fundamentals data-insights data-integration data-preprocessing data-science data-wrangling numpy pandas python

Last synced: 12 Apr 2026

https://github.com/annnieglez/fraud-detection-eda

Fraud Detection - Exploratory Data Analysis (EDA). Analyzing financial transactions to detect fraud patterns using Python and Tableau. Libraries: Pandas, Seaborn and Matplotlib. Key Focus: Data cleaning, fraud trends, high-risk transactions, time-based patterns

data-analysis data-science data-visualization eda fraud-detection fraud-prevention matplotlib seaborn

Last synced: 28 Jan 2026

https://github.com/ehsan-behzadi/online-retail-data-analysis-and-preprocessing

This project analyzes and preprocesses the Online Retail dataset to uncover insights into customer purchasing behaviors, sales trends, and product performance. It includes data cleaning, exploration, and visualization, with the goal of enhancing understanding of online retail dynamics.

cohort-analysis data-analysis data-cleaning data-exploration duplicate-detection exploratory-data-analysis-eda feature-encoding feature-engineering handling-missing-values online-retail outlier-detection preprocessing trends-visualization visualization z-score-method

Last synced: 16 Apr 2026

https://github.com/kseniatyschuk/excel-data-matcher

Compare and match Excel files via a simple Python GUI

automation data-analysis etl excel gui pandas python3 tkinter

Last synced: 23 Apr 2025

https://github.com/tasosfotiadis/time-series-forecasting-for-bitcoin

This project forecasts Bitcoin’s daily closing price using time series models. Data from Jan 2021 to Mar 2022 is processed by converting timestamps, resampling, and handling missing values. LSTM and ARIMA models are evaluated on MAE, RMSE, and MAPE, with LSTM achieving better accuracy while ARIMA is faster in training and inference.

arima bitcoin data data-analysis data-science deep-learning forecasting jupyter-notebook neural-networks python time-series

Last synced: 06 May 2026

https://github.com/srimantapal205/dataengineerwireframedesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

data-analysis data-engineering dataflow dataflow-programming datapipeline dataprocessing development visualization

Last synced: 29 Jan 2026

https://github.com/andreicirciumaru/best-of-breed

CSV fundamentals screener: schema validation + market-cap weights

csv data-analysis finance pandas python screener

Last synced: 15 Apr 2026

https://github.com/angchekar28/sales-report-power-bi

A Power BI sales report analyzing country-wise and product-wise sales trends. Includes dashboards, decomposition trees, and key influencers analysis for business insights.

dashboard data-analysis data-cleaning data-visualization powerbi sales-report

Last synced: 16 Mar 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/wareflowx/excel-toolkit

A powerful command-line toolkit for Excel and CSV data manipulation, analysis, and transformation.

data-analysis data-wrangling excel pandas python uv

Last synced: 29 Jan 2026

https://github.com/sabaasif2501/netflix-data-analysis

Exploratory data analysis of Netflix content using Python and pandas. Content types, genres, countries, and release years.

data-analysis netflix pandas portfolio-project python

Last synced: 08 May 2026

https://github.com/satvikpraveen/numpymasterpro

A hands-on, production-ready toolkit to master NumPy — from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.

broadcasting data-analysis data-science data-source data-visualization jupyter-notebook kmeans-clustering linear-algebra machine-learning matrix-algebra numerical-computation numpy numpy-broadcasting numpy-examples numpy-tutorial open-source python scientific-computing standardization vectorization

Last synced: 08 May 2026

https://github.com/abhi227070/medical-insurance-predictor

This project implements a machine learning regression model to predict medical insurance charges based on user-provided details such as smoking status, number of children, gender, and age. The user-friendly interface allows individuals to estimate their average insurance price before purchasing medical insurance.

data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression-models

Last synced: 04 May 2026

https://github.com/smahala02/magnetism-lab

This repository contains Python scripts and data for analyzing inductance in toroidal coils to calculate the magnetic permeability of ferrite materials. The project helps classify materials as soft or hard magnets based on experimental data.

data-analysis inductance jupyter-notebook magnetism python toroids

Last synced: 29 Jan 2026

https://github.com/joannescode/regex_with_py

Learning by practicing with Regex (Python)

data-analysis python3 regex

Last synced: 30 Jan 2026

https://github.com/grooviter/tablesaw

Java dataframe and visualization library

data-analysis dataframe java visualization

Last synced: 28 Mar 2025

https://github.com/mfakhriazhar/us-companies-revenue-dashboard

This project is a data visualization dashboard built using Power BI that highlights lists of the largest companies in the United States by revenue. The goal is to provide an interactive overview of company performance across industries, focusing on revenue, employee metrics, and industry trends.

dashboard data-analysis data-visualization largest-companies-us powerbi revenue united-states

Last synced: 30 Jan 2026

https://github.com/borjamome/soho_cholera

Cholera deaths in the Soho District (London)

data-analysis data-visualization london r

Last synced: 04 Sep 2025

https://github.com/ljadhav25/decision-tree-random-forest-algorithm-data-science-

This repository contains an implementation of decision tree and random forest algorithms from scratch in Python. Decision trees and random forests are popular machine learning algorithms used for classification and regression tasks. The goal of this project is to provide a clear and understandable implementation of these algorithms

data-analysis data-science decision-trees machine-learning-algorithms matplotlib numpy pandas python random-forest-classifier

Last synced: 15 Apr 2026

https://github.com/aygp-dr/values-compass

Tools for exploring and analyzing Anthropic's Values-in-the-Wild dataset for AI ethics research

ai-ethics anthropic-claude data-analysis nlp values

Last synced: 25 Feb 2026

https://github.com/manishabarse/hr_data_analysis

Used Microsoft SQL Server Management Studio and Power BI

data-analysis powerbi sql ssms

Last synced: 30 Jan 2026

https://github.com/jaseel342/ecommerce_sales_dashboard

The E-commerce Sales Dashboard project offers a comprehensive view of e-commerce sales performance using interactive Power BI dashboards. It focuses on key metrics like YTD Sales, YTD Profit, YTD Profit Margin, and Quantity of Products sold, analyzing data by product categories, states, and regions.

data-analysis data-modelling dax-expression excel power-query powerbi visualization

Last synced: 07 Feb 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/sarveshdhond/top_25_cad_stocks

In this project I used Python, JupyterLab, and Pandas to import a dataset from the Yahoo stocks website: the top 25 most active Canadian stocks on 12 July 2024. The project demonstrates skills in Python, web scraping, and Pandas.

data-analysis pandas-dataframe python webscraping

Last synced: 01 Apr 2025

https://github.com/skivhisink/econometricnsu

Semester-long econometrics course for first-year master's students at the Faculty of Economics, NSU (Novosibirsk State University)

data-analysis econometrics economics education nsu r

Last synced: 09 Apr 2025

https://github.com/luminati-io/indeed-dataset-samples

A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.

api data-analysis datasets indeed jobs web-scraping

Last synced: 07 Feb 2026

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/jujulis18/olympicsmedalsdashboard

Olympic Dashboard – Paris 2024 is an interactive dashboard for exploring the performances of medal-winning athletes at the Paris 2024 Summer Olympic Games.

dashboard data-analysis data-visualization eda olympic python streamlit

Last synced: 31 Jan 2026

https://github.com/traore-07/fedex-sales-analysis

Analysis of the FedEx Sales Transaction

data-analysis data-visualization sales-analysis tabeau

Last synced: 31 Jan 2026

https://github.com/deepanshkhurana/udacityproject-prediciting-boston-housing-prices

This is a Udacity Project for the Machine Learning Nanodegree. Here, we are trying to predict Boston Housing Prices using sklearn.

data-analysis data-science machine-learning python scikit-learn udacity

Last synced: 08 May 2026

https://github.com/shafaq-aslam/pandas-lab

A comprehensive collection of Jupyter notebooks exploring Pandas, from Series and DataFrames to data cleaning, aggregation, merging, and visualization. A complete hands-on guide for mastering data manipulation and analysis with Python.

analytics data-analysis data-cleaning data-science data-visualization dataframe jupyter-notebook machine-learning pandas pandas-dataframe pandas-library pandas-series python python3 series

Last synced: 15 Apr 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/tanaybhadula/twitter-trends-dashboard

An interactive dashboard that visualizes data on current Twitter trends by country and globally. It collects data for over 60 countries using the Python Tweepy library, processes it, and visualizes it as bar charts and pie charts using the Plotly Dash framework.

dash dashboard data-analysis data-visualization plotly python trends twitter

Last synced: 31 May 2026

https://github.com/apoorvalal/misc_stata_ados

Misc Utility programs in Stata.

data-analysis stata stata-command

Last synced: 04 Feb 2026

https://github.com/steviecurran/gbt-scripts

IDL scripts for the reduction of Green Bank Telescope data

data-analysis data-compression data-visualization radio-astronomy spectroscopy

Last synced: 31 Jan 2026

https://github.com/noodleslove/house-of-representatives-analysis-ii

In this project, we want to estimate if a transaction will have capital gains exceeding $200 using the provided dataset.

coursework data-analysis data-science eda feature-engineering pandas python3

Last synced: 12 Apr 2026

https://github.com/nikitalpopov/evotor_champ

Solution for the Evotor data challenge

data-analysis data-science python scikit-learn

Last synced: 15 Apr 2026

https://github.com/salma-mamdoh/investigating-netflix-movies-and-guest-stars-in-the-office

My Project to learn the Basics of Analysis & Visualization on DataCamp

data-analysis data-visualization datacamp matplotlib pandas python

Last synced: 11 Apr 2026

https://github.com/axsk/geekgraph

parse, cluster and visualize boardgamegeek.com user profiles

data-analysis scraper

Last synced: 01 Feb 2026

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

Analyzing a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 01 Feb 2026

https://github.com/quocduyenanhnguyen/roi-modeling-and-analysis-of-sports-dataset

This project contains my ROI model for retirement savings with an accompanying PowerPoint presentation, as well as my data analysis and visualization of a Sports Ticket Sales dataset, concluded with a group-written PDF report

data-analysis data-visualization microsoft-excel rate-of-return-modeling sports-ticket-sales-dataset

Last synced: 08 Feb 2026

https://github.com/vevdokimovm/python-course-notebooks

Python course practice scripts, Jupyter notebooks and deep learning exercises from Grokking Deep Learning

data-analysis deep-learning jupyter python

Last synced: 27 Jun 2026

https://github.com/rissh/titanicsurvivalpredictionusingml

Predicting Titanic passenger survival through machine learning. This project includes data preprocessing, exploratory data analysis, feature engineering, and model training using Python. 🚢

data data-analysis data-science data-visualization dataanalysis jupiter-notebook machine-learning machine-learning-algorithms machinelearning matplotlib numpy pandas prediction prediction-model python python3 seaborn tenserflow tflearn titanic

Last synced: 01 Feb 2026

https://github.com/keneandita/exploratory-data-analysis-eda-

Explore EDA on 5 datasets: Titanic 🚢, Heart Disease ❤️, Wine Quality 🍷, Car Price 🚗, and NBA Players 🏀. Includes data cleaning, preprocessing, and visualizations to uncover insights. Perfect for beginners to learn data analysis with Pandas, Matplotlib, and Seaborn! 🎨📈

data-analysis data-visualization eda matplotlib pandas python seaborn sklearn

Last synced: 15 Apr 2026

https://github.com/yash-3-bit/online-sales-analysis

Project: merging the datasets for different months and performing data cleaning, analysis, and visualization

data-analysis data-visualization pandas-library

Last synced: 27 Mar 2025

https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges

Covid-19 analysis and impact on United States Colleges and States using SQL and Tableau.

covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau

Last synced: 04 Sep 2025

https://github.com/vishnu-vamshii/data-science-jobs-salaries

Created an interactive dashboard to analyze data science job salaries by world region, experience level, average salary in USD, and type of employment, along with a geographical visual.

data-analysis data-science data-visualization tableau tableau-dashboard

Last synced: 01 Feb 2026

https://github.com/jesuserro/ab-testing-ui-redesign-vanguard

A/B testing analysis to evaluate the impact of a user interface redesign at Vanguard.

a-b-testing data-analysis eda exploratory-data-analysis testing ui-design ux-design

Last synced: 08 Jul 2025

https://github.com/yeuner/file-analysis-sql-demo

Streamlit-based application that leverages pandas, sqlite3, and file handling libraries (OpenPyXL and PyArrow) to practice SQL queries, analyze datasets, and export results. A personal project to enhance Python and SQL skills.

data-analysis dataset pandas sql sqlite streamlit vizualization

Last synced: 15 Apr 2026

https://github.com/filip-kustura/statistics-olympics-analysis

A group seminar analyzing the relationship between citizens' average height and a country's Olympic success. The project involved data collection, descriptive statistics and statistical testing. Created and presented as part of the mandatory undergraduate Statistics course in spring 2021.

correlation-analysis data-analysis data-visualization descriptive-statistics group-project hypothesis-testing olympic-games r-programming research sports-analytics statistical-testing statistics university-project

Last synced: 05 Jan 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 lessons in 9 parts. Advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/mostafa-ghorab/global-happiness-analysis

An analysis of global happiness rankings based on various factors like GDP, family support, health, and freedom from the World Happiness Report (2015-2017). This project provides data visualizations and statistical insights into how these factors influence happiness scores in different regions.

business-analysis data-analysis data-visualization matplotlib numpy pandas python seaborn

Last synced: 12 Apr 2026

https://github.com/pranav016/exploratory-data-analysis-of-sp500-dataset

This is a data analysis I performed on the S&P 500 dataset, answering a few questions through data visualization techniques.

data-analysis

Last synced: 30 Oct 2025

https://github.com/suhail25/hotel-booking-analysis

Analyzed hotel booking cancellations and summarized insights for the hotel manager to increase profit by 30%. Demonstrated data exploration, cleaning, and analysis using Python and its libraries: pandas, seaborn, and matplotlib. Documented the results in a PDF report: reducing cancellations by 30% and releasing discounts for 10 days a month.

data-analysis ipynb-notebook matplotlib pandas python seaborn

Last synced: 08 Feb 2026

https://github.com/mrgeislinger/bike-data-exploration

Data exploration of bike-related data

bicycle bike data-analysis data-science

Last synced: 08 Feb 2026

https://github.com/shellynagar27/transportation-and-logistics-challenge

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

cleaning-data critical-thinking data-analysis data-visualization exploratory-data-analysis feature-engineering powerbi preprocessing-data problem-solving python

Last synced: 16 May 2026

https://github.com/sroman0/data-analytics

Data Analytics Exercises is a collection of comprehensive university-level exercises aimed at enhancing skills in data analytics. The repository includes practical notebooks covering data manipulation, exploratory data analysis (EDA), statistical analysis, data visualization, and machine learning fundamentals.

data-analysis data-analytics data-science data-visualization education exercises exploratory-data-analysis hands-on-practice jupyter-notebook machine-learning python statistics

Last synced: 15 Apr 2026