An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/vyjayanthipolapragada/marketing_statistical_analysis

Statistical analysis of customer data and their impact on the sales of products based on marketing campaigns

customer-data data-analysis dataframes marketing matplotlib numpy pandas python seaborn statistical-analysis

Last synced: 11 Apr 2026

https://github.com/tolumie/exploratory-data-analytics-projects

Exploratory Data Analytics – A collection of projects covering data exploration, feature engineering, hypothesis testing, and predictive modeling across diverse datasets, including insurance, real estate, laptops, cars, COVID-19, and the Olympics.

data-analysis data-visualization data-wrangling exploratory-data-analysis-eda feature-engineering hypothesis-testing machine-learning matplotlib numpy pandas predictive-modeling python seaborn statistical-analysis

Last synced: 11 Apr 2026

https://github.com/ilovenooodles/probstat-water-potability

Major Assignment for Probability and Statistics 1

csv data-analysis jupyter-notebooks python

Last synced: 03 May 2026

https://github.com/chardyb/prob-and-stats-bmi6106

A repository for Spring 2025 BMI 6106: Statistics and Probability. This repository contains coursework, code examples, and projects exploring statistical methods and probabilistic models in biomedical informatics.

biomedical-informatics data-analysis data-science probability r statistical-modeling

Last synced: 02 Sep 2025

https://github.com/debjyotisaha/power-bi-projects-phase-1

Portfolio projects related to data visualisation in Power BI

data-analysis data-visualization dax-expression powerbi powerquery

Last synced: 18 Jan 2026

https://github.com/ladaegorova18/data_analysis

Learning the basics of data analysis in Python

analytics data-analysis data-visualization steam-games

Last synced: 16 May 2025

https://github.com/chitranjan806/predicting-on-time-premium-deposits

A predictive analysis project to predict the success rate of on-time premium deposits by policyholders.

analytics-vidhya analytics-vidhya-competition catboostregressor data-analysis data-science linear-regression logistic-regression python3

Last synced: 16 May 2026

https://github.com/nimomach/skateboarding-in-olympics

Skateboarding made its Olympic debut at the 2020 Summer Olympics. This dashboard focuses on "Skateboarding in the Olympics" and presents a comprehensive overview of the sport's performance, popularity, and key metrics during the Olympic Games.

data-analysis data-visualization olympics paris skateboarding tokyo

Last synced: 10 Mar 2026

https://github.com/shru924/ecommerce_customer_behavior_analysis

A machine learning project that analyzes and segments e-commerce customers based on behavior patterns using Python, Random Forest, and data visualization.

customer-segmentation data-analysis jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/shellynagar27/mobile-sales-analysis

Analyzed 2024 mobile sales data to uncover product trends, customer behavior, and regional insights using Power BI dashboards and structured data modeling.

cleaning-data data-analysis data-visualization dax eda figma modelling powerbi powerquery storytelling wireframe

Last synced: 16 May 2025

https://github.com/rosanafss/r-journey

Diving into the wonderful sea of DATA

data-analysis r

Last synced: 19 Nov 2025

https://github.com/salma-mamdoh/investigating-netflix-movies-and-guest-stars-in-the-office

My Project to learn the Basics of Analysis & Visualization on DataCamp

data-analysis data-visualization datacamp matplotlib pandas python

Last synced: 11 Apr 2026

https://github.com/rijul007/market-basket-analysis-using-r

Market Basket Analysis using association rules, leveraging R’s powerful tools for data-driven retail strategies.

data-analysis data-science r

Last synced: 02 Apr 2025

https://github.com/dcs-training/pca-2023

PCA workshop. In this repo, you are going to find the code and files we are going to use for the practical part of the workshop, together with the ppt associated with this training

data-analysis data-visualisation data-wrangling r statistics

Last synced: 20 Jun 2026

https://github.com/annnieglez/computer-vision-parking-lot

This project leverages computer vision techniques to analyze parking lot occupancy. The goal is to detect available parking spaces in real-time using image and video input.

computer-vision data-analysis data-science data-visualization google-colab image-classification image-processing machine-learning python transfer-learning

Last synced: 15 May 2026

https://github.com/anderson-andre-p/wine-data-analysis

This repository contains a data analysis project that focuses on a series of wine data. The project was completed using Python libraries such as NumPy, Pandas, Seaborn, and Matplotlib. The goal of this project was to gain insights into the characteristics of the wines and to practice data analysis skills.

data-analysis data-science data-science-portfolio pandas-dataframe wine-dataset

Last synced: 15 Mar 2025

https://github.com/anderson-andre-p/exploratory-data-analysis.roller-coaster

This repository contains an exploratory data analysis (EDA) project focused on roller coasters. The project involved organizing, cleaning, and visualizing the data to gain insights into roller coasters' characteristics and performance.

data-analysis eda exploratory-data-analysis exploratory-data-visualizations notebook

Last synced: 15 Mar 2025

https://github.com/erickchacon/day2day

Functions that can be useful in day-to-day data analysis. It includes functions to find paths for projects, summarize databases inside a folder, and so on.

data-analysis exploratory-data-analysis simulation spatial-analysis

Last synced: 02 Sep 2025

https://github.com/shafaq-aslam/data-analytics-dairy

A comprehensive repository for Data Analytics learning and projects. It includes MySQL, Python, Power BI, Tableau, and Excel. The goal is to analyze data, generate insights, and create compelling visualizations for real-world datasets.

data-analysis data-visualization excel excel-based-data-analysis powerbi python-scripts sql sql-queries sql-queries-for-data-manipulation sql-query-for-data-visualization tableau

Last synced: 20 Jan 2026

https://github.com/kathisnehith/realestate-sales-analysis

Investigating real estate sales trends to understand market dynamics and inform investment decisions.

data-analysis excel realestate sales sql stastical-analysis-tools tableau

Last synced: 12 Feb 2026

https://github.com/hfzdzakii/dicoding-solvinghrproblem

This repo is a master submission for my Dicoding Final Project. The Employee Attrition & Performance Dataset was used for the submission. Feel free to explore; I hope my work gives you some insight!

data-analysis data-visualization

Last synced: 16 May 2025

https://github.com/virajbhutada/credit-card-transaction-analysis-sql

This project provides a structured database schema and SQL scripts to analyze credit card data. It includes tools for managing and analyzing transaction data, helping to identify spending patterns and trends. The project features visual schema diagrams and supporting documentation for easy understanding.

creditcard customer data-analysis data-cleaning data-modeling database database-management insights performance-optimization postgresql query-language schema-design schema-diagram scripts sql transactions trends

Last synced: 15 May 2026

https://github.com/kzon94/torn-market-analyzer

Streamlit app that parses Torn Add Listing text, matches items with a custom dictionary, fetches market data via the public API, and generates KPIs and price recommendations using a modular Python analytics pipeline.

data-analysis data-engineering fuzzy-matching market-analytics numpy pandas python streamlit torn-city torn-city-api

Last synced: 11 Apr 2026

https://github.com/vriv06/btk-trials-data-analysis

Data analysis of Bioteksa plant nutrition trials to measure nutrient efficacy, resistance against biotic and abiotic factors, etc.

agriculture-research confluence crops data-analysis quarto r

Last synced: 23 Mar 2025

https://github.com/iliyasalve/tiktok_claim_classification_model

Develop a predictive model for classifying videos with claims to reduce the backlog of user reports and optimize the content moderation process.

data-analysis machine-learning python regression-models tiktok

Last synced: 21 May 2026

https://github.com/mborrillo/ranking-ciudades-espana

End-to-end multi-criteria analysis system that evaluates quality of life in 50 Spanish cities using official data

business-intelligence data-analysis multi-criteria-decision-analysis pandas python3 quality-of-life ranking-system scikit-learn scoring-models

Last synced: 13 Jan 2026

https://github.com/bagusperdanay7/fcc-da-mean-variance-standard-deviation-calculator

One of the Data Analysis with Python (freeCodeCamp) tasks: a Mean-Variance-Standard Deviation Calculator.

data-analysis freecodecamp-project numpy python

Last synced: 06 May 2026

https://github.com/deeksha-dhawan/pizza-outlet-analysis-using-sql

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

coding-challenge data-analysis data-science microsoft my portfolio-project programming project projects sql sql-analysis sql-project sqlproject sqlserver

Last synced: 23 Mar 2025

https://github.com/ezmiller/esd-viz

Visualization of European Social Survey (http://www.europeansocialsurvey.org/data/)

clojure data-analysis visualization

Last synced: 28 May 2026

https://github.com/madhursinghbhadoriya/data_analysis_sales_insights_using_tableau

• Performed Data Cleaning using MySQL. • Data analysis and ETL in Tableau. • Created an Interactive Dashboard with significant information about the Sales Insights, Profit and Revenue Analysis.

data-analysis data-visualization dataanalysis etl mysql tableau-dashboards tableau-desktop

Last synced: 09 Apr 2025

https://github.com/82luli02/sakila_dvd_rental_database_analysis

Analysis of the Sakila DVD Rental database using SQL

data data-analysis data-science data-visualization sql

Last synced: 10 Mar 2026

https://github.com/fatihilhan42/eda-spacex-launches-falcon9-and-falcon-heavy

In this project, we analyze the spaceflight data of SpaceX's Falcon 9 rocket.

data-analysis data-science data-visualization eda elonmusk spacex

Last synced: 23 Mar 2025

https://github.com/wittyicon29/kritika-iit-b-2023

Selection task for the summer projects of Kritika IIT-B

data data-analysis data-science

Last synced: 15 Mar 2025

https://github.com/manisharora96/instagram-reach-analysis

This project provides a detailed approach to analyzing Instagram reach and engagement metrics. By leveraging the code and tools shared here, you can gain valuable insights into your Instagram content's performance and optimize your strategy to grow your audience effectively

data-analysis data-visualization instagram-reach python-tools

Last synced: 23 Mar 2025

https://github.com/kernelshreyak/kaggle-notebooks

Collection of my Kaggle notebooks for data analysis and machine learning on a variety of datasets

data-analysis data-science data-visualization kaggle kaggle-competition machine-learning

Last synced: 27 Apr 2026

https://github.com/nishumehta/retail-sales-analysis

Retail sales performance analysis using Python and Power BI.

data-analysis ipynb-notebook jupyter-notebook powerbi python

Last synced: 15 May 2026

https://github.com/prakashjha1/whatsapp-chat-analyzer

WhatsApp Analyzer analyzes WhatsApp group activity. It tracks conversations and measures how much time we are spending, or rather “wasting”, on WhatsApp.

data-analysis data-science natural-language-processing pandas pyhton regular-expression

Last synced: 15 May 2026

https://github.com/omkar2503/credit-risk-dashboard

A SQL-based Credit Risk Scoring System visualized using Metabase

credit-risk dashboard data-analysis data-analytics metabase postgresql sql

Last synced: 01 Jul 2025

https://github.com/nikhil-donthusaram/heartdiseaseprediction

Heart Disease Prediction App is a machine learning web application that predicts the likelihood of heart disease based on user medical inputs. Built using a Decision Tree Classifier and deployed with Streamlit for an interactive, user-friendly interface.

data-analysis descision-tree joblib jupyter-notebook machine-learning matplotlib numpy pandas python3 seaborn sklearn streamlit vscode

Last synced: 11 Apr 2026

https://github.com/steviecurran/dashboards

Compilation of Links to the dashboards in the other repositories

dashboard data-analysis data-science data-visualization pandas powerbi python-dash tableau

Last synced: 21 Feb 2026

https://github.com/felpzreiz/stockdata_pipeline

This project develops a data pipeline that consumes financial information from a US stock market API (StockData.org) for analysis and processing, using Python and libraries such as pandas, matplotlib, and pyarrow.

api data-analysis data-science jupyter-notebook pandas python

Last synced: 19 Apr 2026

https://github.com/walid0912/rfm_analysis

RFM Analysis is employed to comprehend and categorize customers according to their purchasing patterns. RFM, an acronym for recency, frequency, and monetary value, comprises three essential metrics that offer insights into customer involvement, allegiance, and significance to a business.

data-analysis data-visualization python rfm-analysis

Last synced: 02 Sep 2025
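
As a rough illustration of the three RFM metrics described in the entry above, the following minimal sketch computes recency, frequency, and monetary value per customer. It is not taken from the listed repository: the `purchases` records, the `rfm_table` helper, and the reference date are all hypothetical, and only the Python standard library is used.

```python
from datetime import date

# Hypothetical purchase records: (customer_id, purchase_date, amount).
purchases = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 20), 60.0),
    ("bob", date(2023, 11, 2), 15.0),
    ("carol", date(2024, 3, 28), 200.0),
    ("carol", date(2024, 2, 14), 120.0),
    ("carol", date(2024, 1, 30), 80.0),
]


def rfm_table(records, as_of):
    """Return {customer: (recency_days, frequency, monetary)}.

    recency_days: days between the customer's latest purchase and `as_of`
    frequency:    number of purchases
    monetary:     total amount spent
    """
    table = {}
    for customer, day, amount in records:
        last, count, total = table.get(customer, (None, 0, 0.0))
        # Keep the most recent purchase date seen so far.
        if last is None or day > last:
            last = day
        table[customer] = (last, count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


# The reference date is an assumption chosen for the example.
rfm = rfm_table(purchases, as_of=date(2024, 4, 1))
for customer, (recency, frequency, monetary) in sorted(rfm.items()):
    print(customer, recency, frequency, monetary)
```

A lower recency and a higher frequency or monetary value would typically point to a more engaged customer; how the three values are binned into segments is a separate design choice not shown here.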

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/vara-co/solar-eclipse-2024

Group Project on the 2024 Solar Eclipse's Path over the US with an interactive map and a couple of visualizations on the data gathered.

data-analysis data-visualizations html-css-javascript interactive-map javascript map solar-eclipse

Last synced: 15 May 2026

https://github.com/annnieglez/nlp-stock-market-and-news

This project focuses on detecting fake news from news headlines using advanced Natural Language Processing (NLP) techniques. It combines sentiment analysis with news headlines embeddings, generated from Hugging Face transformer models, to train a binary classification model that distinguishes between real and fake news.

classification-model data-analysis embeddings machine-learning machine-learning-models nlp nlp-deep-learning nlp-machine-learning python scraping-websites sentiment-analysis

Last synced: 25 Apr 2026

https://github.com/k31ner/inmopipeline

Comprehensive real-estate data analysis and predictive modeling project, covering collection, transformation, visualization, and machine learning using Python and modern data engineering and data science tools.

data-analysis data-engineering data-science fastapi python streamlit

Last synced: 08 May 2026

https://github.com/anas436/data-science-projects

Explore my diverse collection of projects showcasing machine learning, data analysis, and more. Organized by project, each directory contains code, datasets, documentation, and resources. Dive in to discover insights and techniques in data science. Reach out for collaborations and feedback.

data-analysis data-science machine-learning

Last synced: 27 Mar 2025

https://github.com/abdullahashfaqvirk/powerbi-dashboards

A collection of Microsoft Power BI dashboards and reports designed to address business challenges and support data-driven decision-making.

dashboards data-analysis data-driven data-science microsoft powerbi reports visualization

Last synced: 10 Mar 2026

https://github.com/kailenroa/dashboad-excel-huisprijzen

This project focuses on developing a dashboard powered by Funda to visualize house pricing in the Netherlands. The dashboard simplifies the home-buying process by allowing users to compare prices, energy labels, number of rooms, and square meters across different provinces, all in one interactive platform.

dashboard data-analysis excel house-prices

Last synced: 05 Jan 2026

https://github.com/abelarduu/power_bi_analyst

Power BI project for financial data reporting, with intuitive navigation and interactive features. It offers a complete user experience, combining sophisticated presentation and effective functionality for data analysis.

dashboard data-analysis data-analytics modelagem-de-dados powerbi tratamento-de-dados

Last synced: 08 Sep 2025

https://github.com/devexpress-examples/wpf-pivot-grid-define-custom-cell-template-to-performing-data-editing

This example shows how to edit a cell with the cell editing template in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 02 May 2026

https://github.com/devexpress-examples/wpf-pivot-grid-connect-to-an-olap-datasource

This example shows how to specify connection settings to the server and create fields that relate to specific measures and dimensions of the cube for the Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf xpf

Last synced: 06 May 2026

https://github.com/adilshamim8/eda-on-health-and-sleep-data

Exploratory Data Analysis (EDA) on health and sleep data, uncovering patterns and insights using Python and visualization tools.

data-analysis data-visualization eda health healthcare sleep sleep-analysis

Last synced: 15 Mar 2025

https://github.com/shahriarha/sql

Structured query language

data-analysis mysql mysql-database sql

Last synced: 02 Sep 2025

https://github.com/mnoalett/cscrawler

BSc degree thesis - crawler for www.couchsurfing.org

bsc-thesis couchsurfing crawler data-analysis database python

Last synced: 02 May 2026

https://github.com/alphatwirl/qtwirl

qtwirl (quick-twirl), one-function interface to AlphaTwirl

alphatwirl data-analysis data-frame pandas r root-cern

Last synced: 11 Apr 2026

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 11 Apr 2026

https://github.com/syed-amjad-ali/-bank-churn-ml

Predicting bank customer churn using machine learning. This project includes exploratory data analysis (EDA), feature engineering, classification models (Logistic Regression, Random Forest), and customer segmentation using K-Means clustering.

classification data-analysis data-science eda jupyter-notebook k-means-clustering machine-learning ml python segmentation

Last synced: 09 Mar 2025

https://github.com/ginalamp/covid_dashboard_twitternews

Corona Dashboard & report based on Twitter media outlet news.

dashboard data-analysis data-visualization twitter

Last synced: 28 Jan 2026

https://github.com/tszon/data-science-projects

Included are all the noteworthy Data Science projects from my learning journey with DataCamp.

data-analysis data-science exploratory-data-analysis feature-engineering machine-learning modelling preprocessing-data scikit-learn supervised-learning

Last synced: 15 Mar 2025

https://github.com/aalkiyumi/project-3-docker-container-for-data-processing-script

This Dockerized Python application analyzes two text files (IF.txt and AlwaysRememberUsThisWay.txt). It counts total words, identifies the largest file, and finds the top three most frequent words in each. Results are saved to an output file and printed to the console.

cs5165 data-analysis data-engineering data-science docker introduction-to-cloud-computing statistical-analysis text-processing uc uc2026 university-of-cincinnati

Last synced: 17 May 2026
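
The kind of text analysis described in the entry above (total word count, largest input, top three words) can be sketched in a few lines of Python. This is an illustrative sketch only, not the repository's code: the sample strings stand in for the two text files, the `analyze` helper is hypothetical, and "largest" is assumed here to mean the highest word count.

```python
import re
from collections import Counter


def analyze(text, top_n=3):
    """Return (total_words, most common words) for one text."""
    # Lowercase and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return len(words), Counter(words).most_common(top_n)


# Hypothetical stand-ins for the contents of the two input files.
texts = {
    "a.txt": "the cat and the hat and the bat",
    "b.txt": "one fish two fish red fish blue fish two",
}

results = {name: analyze(body) for name, body in texts.items()}

# Assumption: the "largest" input is the one with the most words.
largest = max(results, key=lambda name: results[name][0])

for name, (total, top) in results.items():
    print(name, total, top)
print("largest:", largest)
```

Reading real files would only require replacing the `texts` dictionary with the contents loaded from disk.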

https://github.com/thbaylson/datascience

All of my past data science assignments put into a single notebook. Most of this comes from my Machine Learning course.

data-analysis data-science data-visualization decision-tree jupyter-notebook k-nearest-neighbors linear-regression machine-learning neural-network pandas-library python3 scikit-learn

Last synced: 09 May 2026

https://github.com/lit26/novel-corona-virus-2019

Data Analysis for Novel Corona Virus 2019

analysis coronavirus-case data-analysis sir-model

Last synced: 10 Jun 2025

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 14 Mar 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/mikeesto/ausvotes19

A collection of 67,284 public tweets published on the night of the 2019 Australian election

australia data-analysis data-visualization elections open-data twitter

Last synced: 06 Apr 2025

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/lfariello/atmospheric_reentry

Matlab code for the determination of the reentry trajectory, deceleration profiles, and heat flux of the ARD capsule during orbital reentry into Earth's atmosphere.

data-analysis heat-flux-prediction heat-transfer hypersonic hypersonic-capsule matlab-programming trajectory-prediction

Last synced: 23 Mar 2025

https://github.com/tolumie/rfm-marketing-analysis

This project focuses on RFM (Recency, Frequency, and Monetary) Analysis, a powerful customer segmentation technique used in marketing and business analytics. The analysis helps businesses identify their most valuable customers, potential loyalists, at-risk customers, and churned users.

business-analytics customer-behavior-analysis customer-loyalty customer-retention customer-segmentation-analysis data-analysis data-driven-decisions ecommerce marketing-analytics python

Last synced: 18 May 2026

https://github.com/hemangsharma/job-tracker

A comprehensive Streamlit application for tracking and analyzing job applications.

data-analysis python streamlit-dashboard streamlit-webapp

Last synced: 15 Mar 2025

https://github.com/manel15279/datamining-project

A university project that aims to explore various data mining techniques like Data Exploration, Association Rule Mining, Supervised and Unsupervised Learning, applied to real-world datasets, focusing on soil fertility analysis and COVID-19 cases evolution over time.

covid-19 data-analysis data-mining data-visualization datascience gradio machine-learning python soil-properties

Last synced: 10 Jun 2025

https://github.com/b-varun-reddy/fairwai-bias-detection

Submission for the FairwAI Hospitality Intern Challenge. This project analyzes bias signals in Yelp hospitality reviews using open-source data, Python, and fairness-focused keyword detection.

bias-detection data-analysis ethical-ai fairness hospitality machine-learning natural-language-processing python social-impact yelp-dataset

Last synced: 19 Apr 2025

https://github.com/lunafrost-lab/berry-donut

Exploring berry combinations to produce Donut in Pokémon Legends: Z-A: Mega Dimensions.

data-analysis data-filtering parquet pokemon winforms

Last synced: 13 Jan 2026

https://github.com/atharvapathak/rsvp_movies_case_study

SQL queries performed on IMDb database to provide recommendations to RSVP Movies based on insights.

data-analysis data-cleaning data-science imdb-dataset rsvp-movies sql

Last synced: 28 Jan 2026

https://github.com/gabrieladados/analise-ecommerce

SQL Analysis for E-commerce: Growth Strategies to Boost Sales

bigquery data-analysis ecommerce sql

Last synced: 31 Mar 2025

https://github.com/wojtekdomino/titanic-eda

Exploratory Data Analysis (EDA) of Titanic dataset using Pandas, Matplotlib, and Seaborn.

data-analysis eda matplotlib pandas python seaborn

Last synced: 10 Jun 2025