An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/kaz-yos/distributed

Comparison of Privacy-Protecting Analytic and Data-sharing Methods: a Simulation Study (Pharmacoepidemiol Drug Saf 2018)

data-analysis epidemiology statistics

Last synced: 15 Jun 2026

https://github.com/harshmule1/school-data-analysis-

School Data Analysis Using SQL

data-analysis mssql sql

Last synced: 04 Apr 2026

https://github.com/ensinho/data-analysis

My repository for data analysis studies in Python.

csv data-analysis graphs python python-documentation

Last synced: 15 Jun 2026

https://github.com/vyjayanthipolapragada/early_sepsis_detection_ml

An end-to-end project leveraging clinical datasets (PhysioNet, MIMIC-IV, MIMIC-IV-ED) to develop and compare ML and LSTM-based models for early sepsis prediction.

data-analysis data-visualization deep-learning healthcare jupyter-notebook keras-tensorflow lstm-neural-networks machine-learning neural-network python

Last synced: 28 Apr 2026

https://github.com/mengyaohuang/data-manipulation-and-analysis

Data processing implementation with tools in Python

data-analysis nlp-machine-learning pandas-dataframe python

Last synced: 27 Apr 2026

https://github.com/alexandrelamarre/fission

Data analytics & Structured streaming optimized for the Edge

data-analysis data-engineering rust structured-data unstructured-data

Last synced: 08 Jun 2026

https://github.com/gonzalo123/pivot.pandas

Data Analysis with Python. Pivot tables with Pandas

data-analysis jupyter-notebook pandas pivot-tables python

Last synced: 05 May 2026

https://github.com/com-480-data-visualization/project-2023-the-vizards

Lausanne Transportation: a data visualization of the Lausanne Transportation network. Developed by the Vizards team as part of the EPFL Data Visualization course project (COM-480).

buses data-analysis data-science data-visualization epfl lausanne map metro public-transport public-transportation switzerland webgl

Last synced: 01 May 2026

https://github.com/pheithar/socialdata_madridcentral

Social data and visualization course at DTU - 2022. Effectiveness of Madrid Central

data-analysis data-visualization jupyer-notebook madrid python

Last synced: 28 Apr 2026

https://github.com/snehilk1312/data_science

This repository contains the data science work I have done recently, along with visualization, cleaning, models, statistics, courses, and datasets. :=)

data-analysis data-science glove natural-language-processing nlp nltk statistics word2vec

Last synced: 02 Apr 2026

https://github.com/scarblase/portfolioprojects

A collection of data analysis and business intelligence projects using SQL, Python, and visualization tools to uncover insights from real-world datasets. 🚀📊

csv data-analysis data-engineering data-mining data-science data-visualization matplotlib matplotlib-pyplot pandas python python3 seaborn sql

Last synced: 06 May 2026

https://github.com/garcane/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 13 Feb 2026

https://github.com/monish-nallagondalla/diamondpriceprediction

Diamond Price Prediction is an end-to-end machine learning project that predicts diamond prices based on attributes like carat, cut, color, clarity, and dimensions. It features a Flask web application for real-time predictions and utilizes models such as Linear Regression, Lasso, and Ridge.

data-analysis data-science flask jupyter-notebooks machine-learning predictive-modeling python

Last synced: 06 May 2026

https://github.com/affec-ds/netflix-recommender-system

Content-based recommender system for Netflix titles. Includes filters by title, genre, and content type (movies or series) with an interactive interface in Jupyter Notebook.

content-based-recommendation data-analysis eda ipywidgets jupyter-notebook machine-learning movies netflix portfolio-project python recommender-system

Last synced: 28 Apr 2026

https://github.com/eslamdyab21/imdb-data-analysis

This data set contains information about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue

data-analysis pandas python udacity-data-analyst-nanodegree

Last synced: 06 May 2026

https://github.com/chetanmalviya513/Firm-Financial-Transaction-Analysis

📊 Financial Analysis & Forecasting: processed large-scale financial data using Python for trend analysis and insights. Developed interactive Tableau dashboards to improve forecasting accuracy and reduce costs by 25%.

data-analysis financial-data forecasting insights msexcel pandas python reporting tableau-dashboards

Last synced: 15 Jun 2026

https://github.com/thevinh-ha-1710/diabetes-predictive-model

This project aims to train a predictive model to diagnose diabetes in female patients.

data-analysis data-science data-visualization model-training-and-evaluation python

Last synced: 13 Feb 2026

https://github.com/allanotieno254/powerbi-chocolate-sales-analysis-dax-calculations-80-

This Power BI project analyzes chocolate sales performance using advanced DAX calculations and interactive visualizations. The report provides insights into monthly revenue, top-selling products, sales trends, and market performance.

business-intelligence data-analysis dax powerbi powerbi-dashboards powershell-module sales-analysis visualization

Last synced: 13 Feb 2026

https://github.com/sayantanidalui/indian-government-budget-analysis

A complete end-to-end data analysis project using Python, SQL, and Power BI based on a Kaggle dataset. Built to explore trends, allocations, and insights from India’s Union Budget (2021–24) for practice purposes.

data-analysis mysql pandas powerbi storytelling

Last synced: 07 May 2026

https://github.com/com-480-data-visualization/project-2023-choo-choo-data-darlings

This repository contains the source code for our data visualization project, an interactive platform designed to explore the intricate Swiss transportation network. Developed by the Choo Choo Data Darlings team at EPFL, the project provides an in-depth view into the vast array of Swiss transportation operations, including trains, buses, and trams.

boats buses data-analysis data-science data-visualisation data-visualization epfl metro public-transport public-transportation switzerland trains trams

Last synced: 01 May 2026

https://github.com/elcaiseri/udacity-advanced-data-analysis

UDACITY - Advanced-Data-Analysis Track Project

data-analysis python

Last synced: 05 May 2026

https://github.com/dsrodrigovieira/houserocketsales

This repository contains a project developed to practice data analysis skills using Python.

data-analysis data-visualization heroku kaggle-dataset python

Last synced: 29 Apr 2026

https://github.com/wrighang/shipping-data-analysis

Independent Project: Transit time trends analysis following a major shipping process change.

data-analysis matplotlib numpy pandas python

Last synced: 18 Apr 2026

https://github.com/sivas-2/coffee-sales-visualization

This repository contains data visualization scripts and notebooks analyzing coffee sales data from a vending machine, sourced from Kaggle. The visualizations explore sales trends, customer preferences, and product popularity over time.

data data-analysis data-science data-visualization python visualization

Last synced: 07 May 2026

https://github.com/fybex/chatgpt-conversations-analysis

Analysis of 89,000 ChatGPT conversations to understand interaction patterns and response behaviors.

chatgpt conversation-analysis data-analysis data-visualization language-analysis prompt-patterns sentiment-analysis

Last synced: 02 May 2026

https://github.com/hebaqaisar/movie-recommender-system

AI Recommender System - Recommends similar movies based on directors, tags, name, type, actors, genre, etc.

artificial-intelligence data-analysis data-mining data-science jupyter-notebook machine-learning machine-learning-algorithms ml movies-rate pycharm python

Last synced: 17 Apr 2026

https://github.com/pradipece/weather_forecast_data_analysis

Using decision trees and random forest algorithms to solve real-world data analysis problems. "sklearn_decision_trees_random_forests"

data-analysis data-science data-visualization git github python python3

Last synced: 19 Apr 2026

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 01 Mar 2025

https://github.com/albertomorini/policesviolence

Repository for the project of the course Data Science (Fondamenti di Scienza dei Dati) at UniUD.

data-analysis data-science data-visualization r

Last synced: 31 May 2026

https://github.com/Zen204/airbnb-availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 02 Apr 2025

https://github.com/robinmillford/analytics_for_fashion_supply_management

This Streamlit dashboard provides a comprehensive analysis of supply chain data, focusing on key metrics such as production volumes, stock levels, order quantities, revenue, manufacturing costs, lead times, shipping costs, transportation routes, risk factors, and sustainability factors

dashboard data-analysis data-visualization streamlit supply-chain-management

Last synced: 07 Sep 2025

https://github.com/aymane-maghouti/sentiment-analysis-for-jumia-reviews-and-smartphone-price-prediction-system

The project focuses on customer sentiment analysis for Jumia, aiding informed online decisions. It collects and analyzes product comments to determine sentiments and implements a decision-making algorithm. Additionally, it includes a product price prediction system using regression techniques.

beutifulsoup data-analysis data-cleaning data-collection data-preprocessing data-scraping data-visualization eda falsk machine-learning python web-application

Last synced: 18 Apr 2026

https://github.com/saeun-park/lg-aimers-4th

Development of a model to predict B2B sales opportunity creation based on MQL data

b2b data-analysis data-science machine-learning mql

Last synced: 08 Apr 2025

https://github.com/mafda/seattle_airbnb_data_analysis

This repository contains a comprehensive analysis of the Seattle Airbnb dataset, conducted using the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology.

crisp-dm data-analysis data-science jupyter-notebook pandas-python seattle-data

Last synced: 29 May 2026

https://github.com/solrikk/pictrace-web

PicTraceV2 is a highly efficient image matching platform that leverages computer vision using OpenCV, deep learning with TensorFlow and the ResNet50 model, asynchronous processing with aiohttp, and Selenium for browser automation. PicTraceV2 allows users to upload images directly or provide URLs, quickly scanning a vast database to find image

automation computer-vision data-analysis data-extraction deep-learning image-processing image-search machine-learning natural-language-processing opencv openpyxl pandas python selenium tensorflow web-scraping yandex yandex-api

Last synced: 12 Apr 2026

https://github.com/bishtrishu/pizza_sales_data_analysis_sql

This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.

cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database

Last synced: 14 Apr 2026

https://github.com/md-emon-hasan/1-simple-stock-price-ml-app

A simple machine learning application for stock prices, demonstrating data preprocessing, model training, and deployment using scikit-learn.

data-analysis data-science eda ml-app streamlit-webapp time-series time-series-analysis webapp

Last synced: 31 May 2026

https://github.com/chen0040/pyspark-advanced-algorithms

Samples of Advanced Algorithms and Data Analysis implemented in pyspark

advanced-algorithms data-analysis map-reduce pyspark

Last synced: 12 Jan 2026

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear regression, support vector machine, k-means, k-nearest neighbors, and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 18 Jan 2026

https://github.com/jossimmar/ensa-ss25

Repository for managing consumption data of the Major Customers (Clientes Mayores) of ENSA, part of Grupo Distriluz.

data-analysis electrical-engineering python sqlite

Last synced: 30 Mar 2025

https://github.com/phillbertnevinemmanuel/automotivesalesdataanalysis

This marks my inaugural venture into personal data analysis, employing SQL and Python for Correlation Analysis. I've sourced the dataset from Kaggle, specifically focusing on automotive sales. You can find the dataset linked on my website below. I'm excited to share that I've independently managed the majority of tasks involved in this project.

data-analysis dataset microsoft-sql-server python python-lambda sql ssms tsql

Last synced: 14 Mar 2026

https://github.com/as16082023/coffee-bean-sales-analysis

Analyzing coffee bean sales data to optimize consumer targeting, product offerings, and strategic marketing in the coffee industry.

coffee-bean-sales dashboard data-analysis data-visualization ms-excel

Last synced: 22 Jan 2026

https://github.com/lightbridge-ks/zoominterface

A Shiny app for analyzing Zoom report files.

data-analysis r shiny-apps zoom-class zoom-meetings

Last synced: 01 Jun 2026

https://github.com/vi/rendercsv

Tool to convert CSV table to a picture.

animation csv csv2pic csv2png data-analysis picture png table table-renderer visualization

Last synced: 01 Apr 2025

https://github.com/sathyasris27/statistical-analysis-on-rehoming-time-for-different-dog-breeds-in-animal-shelter

Given a collection of records documenting stray, unwanted, or neglected dogs sent to animal shelters to be rehomed, this project analyses their rehoming patterns based on their breeds.

data-analysis r statistical-analysis statistical-inference statistical-models

Last synced: 05 Jun 2026

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, asthma, etc.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 04 Mar 2025

https://github.com/gallillio/unsupervised_clustering_music_recommendation_system

Music Recommendation System built on unsupervised machine learning clustering methods: K-Means, Fuzzy C-Means, DBSCAN, Gaussian Mixture Model, BIRCH, and Agglomerative Clustering

affinity-propagation agglomerative-clustering birch-clustering data-analysis data-visualization dbscan-clustering fuzzy-cmeans-clustering gaussian-mixture-models k-means-clustering pca unsupervised-machine-learning

Last synced: 19 Oct 2025

https://github.com/prajakta1321/kaggle-ai-report-2023

A report describing trends in the emergence of AI over the years!

data-analysis data-visualization python3

Last synced: 28 Jun 2025

https://github.com/ssreeramj/youtube_channels_analysis

This web app gives a detailed analysis of the videos uploaded to a particular YouTube channel.

data-analysis heroku pandas python streamlit youtube

Last synced: 29 Apr 2026

https://github.com/jpotter80/notebook-examples

This repository demonstrates a systematic approach to cleaning and standardizing e-commerce product data using DuckDB. The notebook serves as a detailed walkthrough of our data cleaning methodology, showcasing how we handle common data quality challenges in e-commerce datasets.

data-analysis data-cleaning jupyter-notebook

Last synced: 12 Jun 2025

https://github.com/nafisalawalidris/buybuy-e-commerce-company

The BuyBuy E-commerce Company repository is a comprehensive hub for the company's e-commerce platform. It includes source code, documentation, and data analysis insights, providing a data-driven approach to improve customer experience, drive revenue, and inform decision-making.

buybuy cleaning-data company customer-experience data data-analysis decision-making documentation e-commerce excel insights postgresql repository revenue source-code sql

Last synced: 16 Mar 2025

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist of profanity in pt-BR for data analysis, filters, or text considered "avoidable"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 25 Mar 2025

https://github.com/nafisalawalidris/tools-for-data-science

It covers popular languages (Python, R, SQL) and libraries (NumPy, Pandas) used in the field. The author shares their objectives of teaching data analysis, web development, and critical thinking skills. The repository also includes code examples, explanations of arithmetic expressions, and contact information for the author.

arithmetic-expressions data-analysis data-science data-visualization languages libraries matplotlib numpy pandas programming python r sql tools web-development

Last synced: 11 Apr 2026

https://github.com/nafisalawalidris/building-a-clustering-model-for-customer-segmentation

Customer Segmentation Using Clustering: This repo applies clustering algorithms to a customer transaction dataset, grouping similar customers together based on their purchasing behavior. Targeted marketing strategies can be developed by analyzing distinct customer segments.

clustering customer-segmentation data-analysis data-visualization k-means machine-learning marketing-analytics unsupervised-learning

Last synced: 16 Mar 2025

https://github.com/moscarde/pyproductivity

Application uptime tracker that monitors active windows, automatically generating daily usage reports.

daily-report data-analysis python tracker

Last synced: 19 Oct 2025

https://github.com/prangonghose/analysis_of_bangladesh_economic_complexity

In this project, our team briefly analyzes the export economy of Bangladesh over the past three decades.

data-analysis data-science data-visualization inequalipy matplotlib pandas plotly

Last synced: 22 May 2026

https://github.com/realorangeone/docker-cyberchef

A containerized deployment of CyberChef, with additional protections

cyberchef data-analysis data-manipulation docker encoding

Last synced: 24 Aug 2025

https://github.com/devexpress-examples/aspnet-pivot-grid-custom-aggregates

This example shows how to aggregate data by the field's first value.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 06 Jul 2025

https://github.com/nafisalawalidris/sales-performance-dashboard

Sales Performance Dashboard: Analyze and visualize sales data using Power BI. Gain insights into trends, customer segments, product performance, and geographic distribution. Make data-driven decisions to optimize sales strategies and maximize revenue.

analytics-revenue dashboard-power-bi data data-analysis intelligence-sales optimization performance sales visualization-business

Last synced: 03 Feb 2026

https://github.com/rayyan9477/household-transactions-analysis-and-clustering

This project involves analyzing household transaction data to gain insights into spending patterns and behaviors. The analysis includes data cleaning, exploratory data analysis (EDA), clustering using K-Means, and visualization of customer segments.

customer-segmentation data-analysis data-cleaning data-science exploratory-data-analysis kmeans-clustering machine-learning

Last synced: 27 Feb 2025

https://github.com/vitia-fritelle/analise_dieese

Analysis based on data extracted from the site https://www.dieese.org.br/analisecestabasica/salarioMinimo.html

data-analysis economic-data

Last synced: 09 Apr 2025

https://github.com/ahmednasef3/udemy-courses-full-eda

Exploratory Data Analysis of the factors that can affect promotions and earnings of Udemy courses, and of how to make a well-selling course on Udemy.

data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib pandas seaborn udemy-course-project

Last synced: 01 May 2026

https://github.com/beallio/wherewolf

Wherewolf is a production-grade, local SQL workbench designed for data engineers and analysts to query local files (CSV, Parquet, JSON) with ease. Built with Streamlit, it provides a unified interface to execute SQL against either DuckDB or PySpark engines without requiring complex setup.

big-data data-analysis data-engineering etl parquet performance pyspark python spark-sql sql uv

Last synced: 28 Apr 2026

https://github.com/viztruth/google-play-store-data-analysis

This repository contains all the materials of my final project 'Google Play store Data Analysis' for the 'Telling Stories with Data' course at PES University.

data-analysis data-visualization

Last synced: 21 Aug 2025

https://github.com/akankshaaa013/30-day-machine-learning-deep-learning

To practically Learn, Explore, and Share my Insights on the Libraries and Tools that power Machine Learning.

data-analysis machine-learning python

Last synced: 15 Mar 2025

https://github.com/garcane/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 01 Mar 2026

https://github.com/renanmoliveir/analise_de_dados_bikestore_power-bi_atualizan-o

Data analysis project on the Bike Store database with Power BI.

data-analysis dax-languague powerbi query

Last synced: 15 Mar 2026

https://github.com/ayaanjawaid/google_playstore_data_analysis

This project provides an in-depth analysis of Google Play Store apps and user reviews, focusing on understanding app performance, user sentiment, and key trends in app categories. Using Python, I performed data cleaning, feature engineering, and exploratory data analysis (EDA) on app data and reviews.

data-analysis eda html numpy pandas-dataframe plotly python vizualisation

Last synced: 24 Feb 2026

https://github.com/emredurukn/data-analysis

Example notebooks for analyzing data

data-analysis data-visualization python

Last synced: 12 May 2026

https://github.com/5ekastanx/data-analysis

Extracting data by parsing (for example, in a hacking-like fashion) with Python, using a variety of functions and methods

data-analysis html python

Last synced: 14 Mar 2025

https://github.com/lmuffato/analise-de-diarias-prefeituas-do-es

This code is part of a project to discover and combat corruption schemes by processing and cross-referencing open data published by various municipalities of Espirito Santo through the transparency portal. It joins and analyzes several tables imported from CSV.

data-analysis personal-project r rstudio

Last synced: 12 Jun 2025

https://github.com/jatin-mehra119/bike-rentals-dataset

This repository focuses on optimizing bike rental availability during peak hours and days using machine learning techniques. Leveraging publicly available data from the UCI Machine Learning Repository, it includes scripts for data preprocessing, model training, and visualization, along with detailed observations and results.

data-analysis data-science ensemble-model pandas scikitlearn-machine-learning

Last synced: 15 Apr 2026

https://github.com/gattiharishkumar/blinkit-sales-analysis-dashboard

This project presents a comprehensive sales analysis dashboard for Blinkit, an Indian last-minute delivery app. The dashboard was created using Power BI and provides a detailed overview of the company's sales performance across various outlets and product categories.

dashboard data-analysis data-transformation data-visualization ms-excel-data-analytics power-query powerbi powerbi-visuals

Last synced: 19 Mar 2026

https://github.com/cano1998/eda-survival-of-the-titanic

This project focuses on Exploratory Data Analysis (EDA) to identify the key determinants that influenced survival during the infamous Titanic accident.

data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis jupyter-notebook titanic-survival-exploration

Last synced: 21 Jun 2026

https://github.com/gappeah/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 25 Feb 2025

https://github.com/ndohvich/ibm-data-science-professional-certificate

Kickstart your career in data science & ML. Build data science skills, learn Python & SQL, analyze & visualize data, build machine learning models. No degree or prior experience required.

coursera dash data-analysis data-science html5 ibm ibm-professional-certificate javascript machine-learnng python sql

Last synced: 16 Nov 2025

https://github.com/soumya-thoutam/revenue-and-demand-forecasting-analysis

End-to-end data analysis project with SQL and Power BI for revenue and demand forecasting in bike-sharing.

data-analysis data-visualization powerbi sql sql-server

Last synced: 16 Mar 2025