An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/hemangsharma/hotel-revenue-booking-analysis

This project provides a comprehensive revenue and reservation analysis for Highfield Hotel using historical data exported from booking systems and internal revenue reports. The goal is to derive actionable insights to improve room profitability, understand booking patterns, and support data-driven decision-making.

analysis data-analysis data-visualization hotel

Last synced: 10 Aug 2025

https://github.com/dcostachar/telco-customer-churn-dashboard

An interactive Tableau dashboard using the Telco Customer Churn dataset to analyze key drivers of customer churn and develop data-driven retention strategies for the telecommunications industry.

business-intelligence customer-churn-analysis data-analysis data-visualization marketing-analytics tableau

Last synced: 09 Mar 2026

https://github.com/myles/notebooks

Some of my random Jupyter Notebooks.

data-analysis data-science jupyter-notebooks

Last synced: 18 Jan 2026

https://github.com/surbhi242singh/pizza_sales_project

Used SQL to analyze pizza sales data

data-analysis mysql pizza-sales sql

Last synced: 07 Oct 2025

https://github.com/marcus-v-freitas/media_movel_covid19

Time series study using moving averages of Covid-19 cases and deaths in the municipality of São Paulo

covid-19 data-analysis data-science government-data jupyter-notebook moving-average plotly python sao-paulo temporal-series

Last synced: 17 Jan 2026

https://github.com/prajjwol09/sql_retail_analysis_project

This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.

data data-analysis datacleaning dataexploration pgadmin4 sql

Last synced: 15 Feb 2026

https://github.com/brunomontezano/digital-interventions-for-depression

📱 "Digital interventions for depressive symptoms: a randomized clinical trial" code

academia clinical-trials cognitive-behavioral-therapy data-analysis digital-health open-science smartphone-app

Last synced: 03 Oct 2025

https://github.com/donmaruko/python-eda-toolkit

CLI-run EDA with 30 commands utilizing text-related functions, statistical calculations, data visualization, and data manipulation.

data data-analysis data-science data-visualization matplotlib pandas scipy seaborn statistical-analysis statistics wordcloud

Last synced: 06 May 2026

https://github.com/iamrajmani/sentimental-analysis

Sentimental Analysis - Final Year College Project

data-analysis data-visualization machine-learning python pytorch

Last synced: 06 May 2026

https://github.com/abhigyan126/prompt2query

A Python desktop application for streamlined data analysis, enabling users to generate and execute Pandas and SQL queries with ease. Focuses on reducing analysis time through an intuitive interface and efficient workflows.

data-analysis data-science data-visualization database gemini generative-ai ide llm pandas pandas-interface python sql-interface

Last synced: 13 Feb 2026

https://github.com/kmranrg/bikeshare

A project based on data analysis

data-analysis python

Last synced: 08 Oct 2025

https://github.com/sorebit/pdrpy-pd-2

Data analysis of various stackexchange.com archives.

data-analysis stackexchange time-travel university-project

Last synced: 08 Oct 2025

https://github.com/tyriek-cloud/power-bi-nyc-housing-financial-report

This report provides a comprehensive analysis of various NYC housing and financial data.

dashboard data-analysis data-visualization financial-analysis powerbi statistics

Last synced: 21 Jan 2026

https://github.com/amish5ingh/cricket-data-analytics-ipl

Data analysis and visualization of IPL 2022 matches using Python, Pandas, Matplotlib, and Seaborn. Includes insights on match outcomes, player performances, toss trends, and venue stats with 12+ charts.

data-analysis data-visualization ipl-data-analysis ipl-data-visualization jupiter-notebook matplotlib-pyplot numpy pandas python seaborn

Last synced: 09 May 2026

https://github.com/yashpaneliya/bank-loan-default-analysis

Analyze and understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default.

data-analysis loan-default-analysis matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/marianamartiyns/api-logisticregression

Data analysis, modeling, and deployment of a logistic regression model for churn prediction, integrating a FastAPI backend and a Streamlit frontend.

data-analysis data-science fastapi logistic-regression pyhton streamlit

Last synced: 29 Apr 2026

https://github.com/busradeveci/odev2-branching

This project is prepared for Artificial Intelligence and Technology Academy Git GitHub Assignment 2. Using the “Wine Reviews” dataset from Kaggle, it converts wine ratings into star ratings and analyzes them.

data-analysis kaggle-dataset python wine-reviews-dataset

Last synced: 03 Oct 2025

https://github.com/priyanshubiswas-tech/priyanshubiswas-tech

SWE-Data Engineer @ EDN | Kubeflow-MLOps | Kubernetes | Databricks | AWS EMR-Lambda-Glue, Eventbridge, SQS-SNS | OCI Multi-Cloud Architect Professional | GCP GA4 | Gen AI | IEEE Brand Amb. | Ex-Chair, PES | Ex-Sec, SB

apache-spark aws data-analysis data-engineering data-visualization dbt hadoop kubernetes python3 sql

Last synced: 21 Jan 2026

https://github.com/atiqisrak/py

This repository houses the code and resources for the **100 Days of Python Challenge** – an intensive learning journey designed to propel you from a beginner to a confident Python programmer in just 100 days.

data-analysis data-science machine-learning python3

Last synced: 10 Oct 2025

https://github.com/ibromeat/road-accident-risk

Exploratory Data Analysis of road accident risk predictions — visualizing model stability and distribution of predicted probabilities.

data-analysis jupyter-notebook matplotlib python traffic-data visualization

Last synced: 18 May 2026

https://github.com/anandu-jpg/coffee-shop-sales-analysis

This project analyzes coffee shop sales data to identify trends, patterns, and insights that can help improve operations, boost revenue, and enhance the customer experience.

business-intelligence data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas phyton

Last synced: 18 May 2026

https://github.com/ankitwalimbe/sentiment-analysis

Sentiment analysis of Amazon Fashion reviews using VADER and a baseline ML model (TF-IDF + SGDClassifier). Includes visualizations, reproducible notebook, and recruiter-ready documentation.

data-analysis machine-learning matplotlib nlp pandas python seaborn sentiment-analysis sklearn

Last synced: 06 May 2026

https://github.com/sharvesh1401/battsense

BattSense is a machine learning project focused on predicting the State of Health (SOH) of lithium-ion batteries using operational parameters such as voltage, current, temperature, and capacity. The model enables accurate, data-driven diagnostics for battery performance monitoring in electric vehicles and portable devices.

battery-diagnostics battery-health battery-health-prediction battery-soh data-analysis electric-vehicles energy-storage machine-learning predictive-maintenance python regression scikit-learn

Last synced: 07 May 2026

https://github.com/its-ekanshi/sql-analytics-project

Designed relational tables with primary and foreign keys, populated with sample data for real-world testing. Implemented advanced SQL techniques such as CTEs, window functions, aggregates, and filters to extract valuable insights.

business-intelligence data-analysis exploratory-data-analysis microsoft-sql-server sql sql-queries

Last synced: 10 Oct 2025

https://github.com/mikma03/datascience_python_datacamp

DataScience with Python. Code and examples. Python libraries, including pandas, NumPy, Matplotlib, and many more.

data-analysis data-science datacamp datascience numpy pandas python

Last synced: 06 May 2026

https://github.com/frankelavsky/security-dash-challenge

I had two 8-hour days to create a visualization dashboard for three datasets. Tab one: Voronoi overlay on a line graph. Tab two: a data partitioning method that keeps in-memory usage low. Tab three: "Failed" vs "Successful" attempts shown as positive/negative bar charts over time. I used d3.js, require, the MVC pattern, and vanilla JS.

client-side complexity css3 d3 d3js dashboard data-analysis data-structures-algorithms data-visualization frontend-app html5 interactive-visualizations javascript modular network-analysis network-monitoring network-security security single-page-app visualization

Last synced: 14 Apr 2026

https://github.com/scarlet-enlight/ml_project

Comparison of different classifiers (KNN, Naive Bayes, Decision Tree) on Sleep Health and Lifestyle Dataset

data-analysis machine-learning

Last synced: 13 Mar 2026

https://github.com/priyanshubiswas-tech/deloitte-daikibo-telemetry-analysis-task-1

Tableau dashboard analyzing Daikibo telemetry data. Tracks downtime by factory/device with interactive filters. Deloitte task solution with JSON processing.

data-analysis data-visualization deloitte json tableau tableau-public

Last synced: 11 Oct 2025

https://github.com/jiwookseo/natural_language_analysis

API sample for Google Natural Language and ECOS (the Bank of Korea's Economic Statistics System)

data-analysis google-natural-language-api text-analysis

Last synced: 11 Oct 2025

https://github.com/mouadtaoussi/capmpingi-employee-reviews

Analysis of Capmpingi employee reviews using Python/Pandas and Power BI

data-analysis data-science kaggle pandas powerbi python python3

Last synced: 14 Apr 2026

https://github.com/dhruvil-26/tableau-projects

This repository contains Tableau visualization projects focused on data analysis across different domains. Projects include: 1. IPL Visualization - Insights into IPL match, team, and player statistics. 2. EV Analysis - Visualizations exploring the adoption of electric vehicles. 3. Road Accident Analysis - Analysis of road accident patterns.

analysis data data-analysis data-analytics electric-vehicles ipl road-accident-analysis tableau tableau-public

Last synced: 19 Jan 2026

https://github.com/harryrlk/data_analysis_showcase

This repository showcases my data analysis and visualization projects using Excel, Python, R, and Tableau. Some projects are under NDA, so key figures and specific numbers are not included, but brief overviews and methodologies are provided. Feel free to explore and contact me for further details.

data-analysis data-science data-visualization excel portfolio python r tableau

Last synced: 06 May 2026

https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

api data data-analysis data-visualization matplotlib numpy pandas python sakila-db seaborn

Last synced: 09 Apr 2026

https://github.com/kishorep26/school-recommendation-system

Intelligent school recommendation system that matches students with suitable educational institutions based on preferences and performance metrics

bootstrap data-analysis decision-support edtech education education-technology flask matching-algorithm python recommendation-system school-finder school-search student-portal web-application

Last synced: 06 May 2026

https://github.com/sebastianofazzino/ibm-data-science-professional-certificate

In this repository I've stored exercises and projects I've been working on while attending the IBM Data Science Professional Certificate, using Python and its libraries.

data-analysis data-mining data-science data-structures data-visualization database machine-learning matplotlib numpy pandas python regression seaborn sql

Last synced: 09 Apr 2026

https://github.com/abeltavares/postql

Python library and command-line interface (CLI) tool for interacting with PostgreSQL databases, providing simplified database management, query execution, and result export functionalities.

cli command-line-interface data-analysis data-engineering data-export data-management data-processing data-visualization database database-administration database-tools etl oop postgres postgresql psycopg2 python sql sqlalchemy wrapper

Last synced: 19 Jan 2026

https://github.com/treasarose/us_candy_distribution_analysis_project

This project focuses on advanced data analysis and optimization using SQL. It includes queries for analyzing sales, product margins, and shipping efficiency for a US candy distributor.

data-analysis entity-relationship mssql optimization query sql-server sqlproject us-candy-distributor

Last synced: 12 Oct 2025

https://github.com/jeffbrennan/analysis-templates

Templates of commonly used graphics/functions/settings to help focus on the bigger picture

data-analysis r rmd

Last synced: 12 Oct 2025

https://github.com/theashishmavii/job-trends-analyzer-automation

End-to-end automation: job scraping, data analysis, and trends reporting for job seekers and researchers.

automation beautifulsoup data-analysis open-source pandas python selenium webscraping

Last synced: 07 Aug 2025

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/PanosChatzi/Healthcare_and_Bioinformatics_Analyses

This repo contains the final assignments of the Data Analyst bootcamp by Workearly. Python and SQL were used to complete the assignments.

data-analysis data-cleaning data-visualisation jupyter matplotlib pandas python seaborn

Last synced: 05 Aug 2025

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 09 Apr 2026

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/0xnu/england-house-prices

Predict house prices for the next five years across all English local authorities.

data-analysis england england-house-prices housing-market housing-market-analysis predictive-modeling regression

Last synced: 03 Aug 2025

https://github.com/prasannnnn/real-time-share-price-scraping-and-analysis

The Stock Sentiment Analyzer is a web-based application built with Streamlit, BeautifulSoup, and Pandas to help users analyze the sentiment of a stock (BUY, SELL, or HOLD) based on its financial data. The tool extracts key financial metrics like Market Cap, Stock P/E, Dividend Yield, ROCE, ROE, and the 52-week High/Low from Screener.in.

beautifulsoup4 data-analysis python sentiment-analysis streamlit streamlit-dashboard webscraping

Last synced: 03 Aug 2025

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/thedevreda/jadaerospace

A real-life project showing how to improve aircraft parts sales and help salespeople focus more on effective products at JadAero

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/yamslam/contentsunderpressure_processing

A repository for data processing and analysis for Contents Under Pressure.

data-analysis data-processing data-visualization game-based-learning judgments process-safety

Last synced: 07 Sep 2025

https://github.com/quesocosteno03/data-analysis-projects

This repository serves as a collection of all my projects.

data-analysis jupyter-notebook powerbi

Last synced: 02 Aug 2025

https://github.com/josepablodmg/python--linear-regression-advertising

A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.

advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization

Last synced: 06 May 2026

https://github.com/samkazan/business-analysis-tableau

Business Analysis on Global/Superstore data using Tableau.

analysis data-analysis tableau visualization

Last synced: 08 Feb 2026

https://github.com/jotstolu/netflix-sql-data-analysis-project

This project explores the Netflix dataset using SQL queries to uncover trends, patterns, and business insights that could help stakeholders understand content distribution, viewer preferences, and platform optimization

data-analysis sql sql-server tsql

Last synced: 02 Aug 2025

https://github.com/dlozeve/topological-persistence

Topological persistence diagram (barcode) of a triangulation

data-analysis persistence topology

Last synced: 02 Aug 2025

https://github.com/ankitpoddar07/excel-project_back-office

📊 Coffee Sales Analytics – Back Office Excel Project

data-analysis ms-excel

Last synced: 05 Feb 2026

https://github.com/ashwin331133/powerbi-data_professional_survey_breakdown

This project analyzes survey data from individuals interested in transitioning to the data field. The survey aims to understand their backgrounds, motivations, and the challenges they face. Using Power BI for data visualization, the project provides insights into the demographics and preferences of these aspirants.

data-analysis data-visualization powerbi

Last synced: 03 Jan 2026

https://github.com/analyst-lochan/flight-delay-and-cancellation-dataset-2019-2023-

This project demonstrates a complete data analytics pipeline starting from raw real-world flight data to professional visual dashboards using SQL Server and Power BI. It showcases data import, cleaning, optimization, transformation, and dynamic DAX-based visual reporting.

airline-performance business-intelligence data-analysis data-cleaning data-modeling data-visualization dax etl flight-data kaggle-dataset portfolio-project powerbi powerbi-dashboard sql sql-server

Last synced: 09 Sep 2025

https://github.com/saisurajmatta/e-commerce-sales-advanced-data-analysis

Excel-based e-commerce analytics for FNP, a gift company. It covers data extraction, modeling, and visualization, providing actionable insights on revenue, customer behavior, and operations. Key skills include Excel, Power Query, Power Pivot, and DAX. The analysis culminates in data-driven business recommendations.

data-analysis data-visualization dax excel power-pivot power-query

Last synced: 22 Jan 2026

https://github.com/jasoncobra3/finops-copilot

An end-to-end AI-powered FinOps platform that ingests cloud billing data, analyzes cost trends, answers natural-language questions using a RAG pipeline (LangChain + FAISS + sentence-transformers + Groq), and provides actionable cost optimization recommendations. Includes a FastAPI backend and Streamlit dashboard UI - fully containerized with Docker

ai-assistant cloud-cost-optimization cloud-enginee cost-analytics data-analysis devops docker faiss faiss-vector-database fastapi finops groq langchain llm pandas rag rag-pipeline sentence-transformers sqlite3 streamlit

Last synced: 13 Apr 2026

https://github.com/sanjayankur31/20181206-neurofedora

Slides for my NeuroFedora seminar at the UH Biocomputation group's weekly seminar

computational-neuroscience data-analysis neurofedora neuroimaging neuroscience open-science

Last synced: 19 Feb 2026

https://github.com/fbarffmann/home_sales

Analyzed 25,000+ home sales using PySpark and SparkSQL. Identified pricing trends by year built, home features, and view rating. Optimized query run-time by 70% using caching.

aws big-data data-analysis home-sales parquet pyspark python spark spark-sql sql

Last synced: 06 May 2026

https://github.com/aishwaryahastak/ipl_analysis

Analysis of IPL dataset using PySpark

data-analysis mllib pyspark

Last synced: 16 Oct 2025

https://github.com/preetesh21/spotme

This repository uses the web-based API provided by Spotify to retrieve data and then analyse it.

api data-analysis

Last synced: 18 Jun 2026

https://github.com/supertetelman/coursera-exdata-09

This repo contains several R scripts that were used to analyze, plot, and clean data from various datasets. These projects were part of the Coursera course, Exploratory Data Analysis. The end results of the analysis are included.

big-data course coursera data-analysis r

Last synced: 16 Oct 2025

https://github.com/aygp-dr/claude-log-stream

Advanced analytics engine for Claude Code logs with real-time processing capabilities

claude-api clojure data-analysis monitoring

Last synced: 24 Sep 2025

https://github.com/pauliorandall/airline-passenger-satisfaction-r

Analysing the Airline Passenger Satisfaction dataset from Maven Analytics

data-analysis data-analytics r

Last synced: 01 Aug 2025

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

The project demonstrates the use of machine learning models based on decision trees and random forests to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results. Intended for learning the basics of machine learning and data analysis.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/pizofreude/da-with-r

Data analysis with R, a data-centric programming language

data-analysis r

Last synced: 17 Oct 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings empower businesses to optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 20 Feb 2026

https://github.com/farrelfaricaf/exploratorydataanalyst---titanic

This project analyzes the Titanic dataset using exploratory data analysis (EDA) and visualization techniques to identify survival patterns. The goal is to understand how demographic factors like gender and age influenced survival rates during the 1912 disaster.

data data-analysis data-science data-visualization eda python titanic-dataset

Last synced: 31 Jul 2025

https://github.com/ayeshathoi/simulation-sessional-412

Simulation of SSQS, Inventory System, Transient State, PERT, Monte Carlo Alo etc.

data-analysis excel inventory-system monte-carlo python simulation ssqs triangle-distributions

Last synced: 31 Jul 2025

https://github.com/Kaushik-Puttaswamy/Airline-Passenger-Referral-Prediction-Using-Machine-Learning

This project uses a machine learning model to predict if passengers referred by existing customers will book a flight, helping airlines target likely customers. Key factors like service ratings and value for money drive predictions, achieving over 90% accuracy.

airline-marketing customer-referral-prediction customer-satisfaction data-analysis feature-engineering hyperparameter-tuning machine-learning model-evaluation predictive-analytics

Last synced: 20 Oct 2025

https://github.com/teamtigers/echartify

A web application built with .NET Core 2.2 that reads the national election dataset of Bangladesh as fast as possible and then represents it with different statistical charts.

bangladesh chartjs code-first-migration cross-platform data-analysis data-structures data-visualization dotnet-core election-analysis election-data entity-framework-core materializecss mvc npoi razor-pages

Last synced: 16 Apr 2026

https://github.com/mothraa/etl-marketanalysis-webscraping-poo

OC project 2 refactoring (OOP version not yet completed)

data-analysis etl poo python web-scraping

Last synced: 20 Oct 2025

https://github.com/sanveed-adnan/supermarket-sales-sql-project

SQL-based data analysis project on supermarket sales performance using SQLite and Power BI.

business-intelligence data-analysis data-science data-science-projects data-visualization power-bi sales-data sql sqlite

Last synced: 08 Nov 2025

https://github.com/nathadriele/transaction_fraud_prevention_pipeline

A solution for detecting and preventing fraud in financial transactions, combining machine learning, business rules, and advanced statistical analysis. The system offers an interactive dashboard for real-time monitoring, data analysis, and fraud alert management.

data-analysis data-visualization docker fraud-prevention machine-learning matplotlib numpy pandas pipeline pytest python scikit-learn scipy seaborn streamlit tensorflow transaction xgboost

Last synced: 10 Apr 2026

https://github.com/badranalyst/exploratory-data-analysis-on-salaries-dataset

Performing EDA on a dataset related to salaries, exploring relationships between factors like job titles, industries, and locations. Insights are visualized with plots to identify trends and disparities in salary data.

data-analysis dataset eda exploratory-data-analysis pandas python

Last synced: 07 May 2026