An open API service indexing awesome lists of open source software.

Data visualization

Data visualization is the visual depiction of data through the use of graphs, plots, and informational graphics. Its practitioners use statistics and data science to convey the meaning behind data in ethical and accurate ways.

https://github.com/r-mahesh45/fraud-detection-and-sales-analysis-using-random-forest

This project uses Random Forest to classify fraud risk based on taxable income and analyze key factors driving high sales for a cloth manufacturing company.

classification data-visualization extract-transform-load python3 random-forest

Last synced: 30 Apr 2026

https://github.com/angchekar28/air-quality-index-analysis

This project analyzes Air Quality Index (AQI) data to identify pollution trends, seasonal variations, and the impact of different pollutants. It includes data visualization, correlation analysis, and insights into air quality variations over time.

data-analysis data-science data-visualization exploratory-data-analysis jupyter-notebook machine-learning python

Last synced: 30 Apr 2026

https://github.com/prince-pastakiya/human-resources-tableau-project

👥 Interactive Tableau dashboard for HR analytics — includes workforce overview, demographics, income analysis, and detailed employee records with full filtering.

chatgpt data-analysis data-visualization human-resources numpy python python-faker tableau-dashboards tableau-public

Last synced: 18 Apr 2026

https://github.com/devprnvk/realestateml

This Python program analyzes a dataset (HousePricePrediction.xlsx) containing information about house prices. It utilizes pandas for data manipulation, matplotlib for plotting, and seaborn for visualizing correlations and distributions.

data-science data-visualization datasets houses npm plotting prediction-model seaborn

Last synced: 30 Apr 2026

https://github.com/tashi-2004/global-ecommerce-retail-trends-analysis

The Global E-commerce & Retail Analysis project involves data preprocessing, dimensionality reduction with PCA, CLV calculation, and What-If analysis. Key insights include effective PCA for data reduction, detailed CLV analysis across segments, and the impact of pricing strategies on sales.

boxplot clv-analysis data-science data-visualization dataintegration deep-learning dimensionality-reduction ecommerce heatmap machine-learning normalization outlier-detection outlier-removal pca-analysis preprocessing python scatter-plot whatif-analysis

Last synced: 30 Apr 2026

https://github.com/dina-hosny/import-preprocess-and-visualize-a-dataset-project

A simple project to practice importing a dataset, data cleaning and preparation processes, and visualize the results to answer some given questions.

data-cleaning data-engineering data-science data-visualization jupyter-notebook matplotlib numpy pandas python

Last synced: 30 Apr 2026

https://github.com/fernandogomesfg/sabores-aromas-analytics

Sabores & Aromas project: an interactive dashboard built in Power BI, focused on sales insights, team performance, and profitability analysis to optimize strategic decisions.

analise-de-dados data-science data-visualization dataanalytics powerbi storytelling-with-data vendas

Last synced: 13 Feb 2026

https://github.com/mxagar/eda_fe_summary

An 80/20 guide for Data Processing: Data Cleaning, Exploratory Data Analysis, Feature Engineering, Feature Selection.

data-analysis data-cleaning data-modeling data-science data-visualization eda exploratory-data-analysis feature-engineering feature-selection machine-learning pandas

Last synced: 30 Apr 2026

https://github.com/fernandesotero/project-data-exploration

Student Performance Prediction with Data Science

data-visualization jupyter-notebook python

Last synced: 30 Apr 2026

https://github.com/cagandemirmr/airbnb_available_houses

In this repo, I create a dashboard using Tableau. In the process, I use SQL and Python.

dashboard data-visualization dataprocessing python sql tableau

Last synced: 30 Apr 2026

https://github.com/srinibas-masanta/ibm-applied-data-science-capstone

This repository contains the work completed for the Applied Data Science Capstone Project offered by IBM on Coursera. The capstone project is the final course in the IBM Data Science Professional Certificate series and serves as an opportunity to apply the skills and knowledge gained throughout the series to a real-world data science problem.

capstone-project data-analysis data-science data-visualization machine-learning python web-scraping

Last synced: 30 Apr 2026

https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project

This is a mini-project repository from my IBM certification, involving stock analysis and plotting for Tesla and GameStop.

analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping

Last synced: 09 May 2026

https://github.com/priyam-hub/covid-19-data-analysis

Explores COVID-19 case numbers and deaths from the 2019/2020 coronavirus outbreak using pandas in a Jupyter notebook.

analysis data data-visualization jupyter-notebook machine-learning python

Last synced: 08 Jun 2026

https://github.com/prishabhanot/skin_cancer_classification_model

Classifies 7 types of skin cancer lesions using a deep learning CNN model. Processes and balances the dataset, trains the model, and evaluates its accuracy with visualizations.

cnn confusion-matrix data-visualization keras machine-learning medical-imaging python tensorflow

Last synced: 09 May 2026

https://github.com/tolumie/aviva-insurance-statistics-hypothesis-abtesting-modelling

This project explores the impact of demographic and lifestyle factors on insurance charges using statistical hypothesis testing (ANOVA, chi-square, t-tests) and predictive modeling (Elastic Net, Random Forest, Gradient Boosting). The analysis is deployed using Streamlit.

anova chi-square-test data-visualization eda gradient-boosting hypothesis-testing insurance-dataset machine-learning predictive-modeling python random-forest statistical-analysis streamlit

Last synced: 30 Apr 2026

https://github.com/rayxiang03/indeed-job-scraping

Python toolkit for scraping Indeed job listings, preprocessing data, and generating visualizations for market analysis.

cloudscraper data-visualization indeed job-analysis nlp pandas python web-scraping

Last synced: 30 Apr 2026

https://github.com/nahiyanhkhan/data-processing-and-visualization

Loan Data Processing using Python's numpy and pandas libraries. For data visualization, matplotlib and seaborn are used.

data-analysis-python data-visualization matplotlib numpy pandas seaborn

Last synced: 09 May 2026

https://github.com/mitchellharrison/mitchellharrison.github.io

Welcome to my slice of the internet, where I share the knowledge that Duke gave me, so you don't have to spend the mortgage-sized amount to access it. Built with R, Python, Quarto, and love.

ai algorithms-and-data-structures blog data-analysis data-science data-visualization educational machine-learning portfolio portfolio-website quarto r r-language statistics tutorials

Last synced: 30 Apr 2026

https://github.com/gerhynes/d3-births-pie-chart

A D3 pie chart showing UN birth data grouped by month and quarter. Built for The Advanced Web Developer Bootcamp.

d3 data-visualization javascript

Last synced: 30 Apr 2026

https://github.com/beyzabasarir/northwind-traders-analysis

Northwind dataset analysis using PostgreSQL, Python, and Power BI. Focused on sales, customers, shipping, and performance insights.

dashboard data-analysis data-visualization jupyter-notebook matplotlib numpy pandas postgresql powerbi python seaborn

Last synced: 10 Apr 2026

https://github.com/ddeepanshu-997/datascience-e-commerce-shopping-details-

In this project, I apply data preprocessing techniques to clean the dataset, derive insights to find the most popular shopping picks, and use data visualization libraries to surface trends and other aspects, as a small contribution to the field of data science.

cleaning-data data data-science data-visualization dataframe datapreprocessing dataset libraries matplotlib-pyplot numpy pandas plots python visualization

Last synced: 30 Apr 2026

https://github.com/souravsuvarna/whatsapp-chat-analyzer-and-visualizer-web-application

The WhatsApp chat analyzer and visualizer uses NLP algorithms to analyze chat data, tracking usage patterns and presenting insights through visually appealing charts and graphs. It helps users understand communication patterns and behaviors on WhatsApp.

data-analysis data-science data-visualization python python3 streamlit

Last synced: 18 Apr 2026

https://github.com/miguelmedinacastro/trabalho-dados-r

Final project for the Exploratory Data Analysis course.

data data-science data-science-projects data-visualization database r rstudio

Last synced: 01 May 2026

https://github.com/drisskhattabi6/text-to-sql

Chat with DB: a powerful web application that transforms natural-language questions into executable SQL queries against a PostgreSQL or MySQL database and visualizes the results, using LangChain (Ollama and ChromaDB), LangGraph, and Streamlit.

ai-agent chat-with-db chromadb data-visualization gemini langchain langgraph mysql ollama openai postgresql streamlit text-to-sql text2sql txt2sql

Last synced: 09 Apr 2026

https://github.com/gitchaell/computer-scrapping

A tool that extracts data from the websites of companies that sell computers in Trujillo, Peru, exports it to an XLSX file according to a relational data model, and displays it on a Power BI dashboard.

data-analysis data-structures data-visualization database dbdiagram export-excel powerbi scrapper-script scrapping xlsx

Last synced: 01 May 2026

https://github.com/falakrana/data-analysis-visualization

This repository showcases data analysis and visualization projects using Python and Tableau. It includes exploratory data analysis, interactive dashboards, and insightful visual stories derived from real-world datasets.

data-analysis data-visualization python tableau-public

Last synced: 01 May 2026

https://github.com/kivanc57/explaratory_analysis

Exploratory and Descriptive Data Analysis on Indonesian data using R. This project involves reading data, feature analysis, correlation analysis, logistic regression, PCA, MDS, and clustering. Visualizations include boxplots, scatter plots, corrgrams, and dendrograms. Comprehensive report available in report.docx.

clustering data-science data-visualization descriptive-statistics explanatory-data-analysis mds pca plot r

Last synced: 08 Jun 2026

https://github.com/cdeweyx/bryce-harper-2016-analysis

Notebook analyzing Bryce Harper's disappointing 2016 campaign in historical context through data analytics.

data-analysis data-visualization python

Last synced: 01 May 2026

https://github.com/caesaredia/la-cafe-market-analysis

A data-driven feasibility study exploring the potential of launching a robot-staffed café in Los Angeles, based on real F&B business data.

business-intelligence cafe data-analysis data-visualization food-industry franchise los-angeles market-research pandas python

Last synced: 01 May 2026

https://github.com/rsn601kri/demand-pulse

Transforming Retail Supply Chains with AI-Powered Demand Forecasting & Generative Insights

aws data-visualization fastapi forecasting genai hackathon nextjs react walmart

Last synced: 09 Apr 2026

https://github.com/fbarffmann/project1

Analyzed factors influencing movie profitability using Python. Cleaned and visualized film industry data to uncover trends in budgets, sales, genres, and ratings.

box-office-analysis data-analysis data-visualization matplotlib movie-industry pandas python regression seaborn

Last synced: 01 May 2026

https://github.com/darkdk123/handwashing-discovery-analysis

A guided bootcamp project analyzing the original data behind Dr. Ignaz Semmelweis's discovery of the benefits of handwashing at Vienna General Hospital in the 1840s.

data-analysis data-science data-visualization matplotlib-pyplot numpy pandas plotly-python python seaborn-plots

Last synced: 09 Apr 2026

https://github.com/anandvai/ai_rag_chatbot_multi_pdf_support

RAG (Retrieval-Augmented Generation) Chatbot built with Streamlit and LangChain, powered by Groq's blazing-fast LLaMA3-8B. It allows you to upload multiple PDFs, ask questions, and get precise, context-aware answers in a conversational format.

ai data data-science data-visualization data-visualizations dataengineering fastapi langchain langgraph python sql streamlit

Last synced: 01 May 2026

https://github.com/dineshram0212/youtube-analysis

This YouTube Analysis Package provides tools for analyzing YouTube video data, including metrics on views, likes, comments, and engagement trends. Ideal for gaining insights into video performance and audience interaction patterns.

data data-visualization pandas python webscraping youtube-api-v3

Last synced: 19 Jun 2026

https://github.com/jfaccioli/leaflet-earthquake

Geo-mapping earthquakes with Leaflet, JavaScript, and GeoJSON

data-visualization geojson javascript json leaflet

Last synced: 01 May 2026

https://github.com/gabrieldiem/iss_locator

A small Python script that plots the location of the ISS (International Space Station) on a world map at a given time.

data-visualization pandas plotly python script

Last synced: 01 May 2026

https://github.com/samia35-2973/world-university-ranking-2023-prediction

This repository builds models for predicting the World University Rankings 2023. The World University Rankings 2023 dataset includes 1,799 universities across 104 countries and regions, making it the largest and most diverse university ranking to date. A clean dataset is generated through data preprocessing.

data-cleaning data-preprocessing data-visualization decision-trees machine-learning machine-learning-algorithms model-training prediction world-university-rankings world-university-rankings-2023

Last synced: 01 May 2026

https://github.com/kushshriv/onlinejobpostings-infographic

The Python Data Cleaning Code and Input Dataset For My Telling Stories With Data Project

data-visualization pandas python

Last synced: 01 May 2026

https://github.com/alvitachen/breathe-retrospective

A dashboard that visualizes AQI and PM job openings across target cities.

aqi beginner charts data-visualization personal-project react vite

Last synced: 10 Apr 2026

https://github.com/saagpatel/signal-noise

An interactive essay teaching Bayesian reasoning through direct manipulation of live visualizations

bayesian d3 data-visualization education interactive-essay nextjs statistics typescript

Last synced: 28 Jun 2026

https://github.com/martindambrosio/ba-tree-census-analysis

Analysis and visualization of Buenos Aires urban trees using Python and Tableau, including interactive maps to explore species distribution and characteristics.

data-visualization folium-maps pandas python tableau

Last synced: 01 May 2026

https://github.com/guptakushal03/whatsapp-chat-analyser

The WhatsApp Chat Analyzer is a Python-based tool built with Streamlit for analyzing WhatsApp chat data. It provides insights such as total messages, word count, media shared, links shared, monthly activity timeline, most active users, activity maps, and word clouds.

chat-analysis data-analysis data-visualization python streamlit text-processing whatsapp word-cloud

Last synced: 01 May 2026

https://github.com/samuelson777/iris-flower-classification

Iris Flower Classification: A machine learning project that classifies iris flowers into three species based on sepal and petal dimensions. Includes data exploration, visualization, and model evaluation using Python and scikit-learn.

classification data-science data-visualization iris-dataset jupyter-notebook machine-learning python scikit-learn

Last synced: 09 May 2026

https://github.com/marcellobb/cars-eda

🚘 Applies exploratory data analysis to a car dataset

data-visualization jupyter-notebook

Last synced: 19 Aug 2025

https://github.com/njaffe/eda_example_2025

Sample end-to-end data analysis walkthrough using Python and Scikit-learn.

data-science data-visualization jupyter-notebooks machine-learning python regression scikit-learn

Last synced: 09 May 2026

https://github.com/vaxdata22/redfin-analytics-etl-using-amazon-emr-by-airflow-on-ec2

This is an end-to-end AWS Cloud ETL project. The data pipeline uses an Amazon EMR cluster managed by Apache Airflow running on an AWS EC2 instance. It demonstrates how to build an orchestration that performs data transformation using Amazon EMR as well as automatic data ingestion into Snowflake via Snowpipe. It also features Power BI.

amazon-emr-cluster apache-airflow apache-spark aws-ec2 aws-s3 business-intelligence dags data-visualization etl-pipeline google-colab-notebook orchestration power-bi pyspark redfin snowflake snowpipe sqs-queue

Last synced: 02 May 2026

https://github.com/subhadipsinha722133/credit-card-fraud-dection

Web application for detecting fraudulent credit card transactions using machine learning

data-visualization fraud-detection machine-learning matplotlib numpy pandas seborn sklearn streamlit

Last synced: 10 Apr 2026

https://github.com/quocduyenanhnguyen/airlines_web_scrapping

I scraped airline data from a Wiki page with Python, did some data cleaning with Google Sheets and SQL, then visualized the data with Tableau.

airlines csv-files data-cleaning data-visualization mysql python3 tableau tableau-dashboards tableau-public webscraping

Last synced: 15 May 2026

https://github.com/rbreeze/dashboard

My personal health dashboard, with daily stats on food and sleep. It has undergone several redesigns since 2015.

css dashboard data data-visualization design front-end google-sheets google-sheets-api health html javascript personal-health-record personal-website running static static-site visualization

Last synced: 02 May 2026

https://github.com/shridhar1504/milk-production-time-series-forecasting-datascience-project

This project uses time series forecasting to predict future milk production. The data used in this project is monthly milk production data from January 1962 to December 1975. The ARIMA (autoregressive integrated moving average) model is used to forecast milk production. The model is evaluated using various metrics.

adf arima-model augmented-dickey-fuller-test data-analysis data-analytics data-science data-visualization eda exploratory-data-analysis machine-learning machine-learning-algorithms python python3 residuals sarimax seasonality time-series time-series-forecasting trends

Last synced: 02 May 2026

https://github.com/teja-1403/ignosis-tech-ml-assignment

Analysis of transaction data to identify the most profitable products and key customer segments, providing insights for targeted marketing strategies.

customer-segmentation data-analysis data-visualization machine-learning marketing-strategy python

Last synced: 02 May 2026

https://github.com/youssef-saaed/zc-dashboard

The ZC Dashboard is a comprehensive data visualization tool designed to provide insights into the academic landscape of Zewail City.

amcharts dashboard data-visualization flask sqlite3

Last synced: 09 May 2026

https://github.com/gerhynes/d3-mobile-subscription-literacy-scatterplot

A D3 scatterplot showing mobile phone subscriptions against literacy rates. Built for The Advanced Web Developer Bootcamp.

d3 data-visualization javascript

Last synced: 02 May 2026

https://github.com/kimaruthagna/segmente

A journey through understanding customer segmentation using Python, with the general goal of encouraging data-driven decision-making.

clustering crosstab customer-segmentation data-science data-visualization knn-classification lifetime-value pandas rfm-analysis seaborn

Last synced: 02 May 2026

https://github.com/harshindcoder/online_retail_data_clustering_project

This marketing analytics project uses RFM (Recency, Frequency, Monetary) features for customer classification, inspired by the online retail mining paper. The RFM model helps segment customers, identify high-value ones, and optimize marketing strategies.

customer-segmentation data-analysis data-visualization market-analytics

Last synced: 17 Aug 2025

https://github.com/amishidesai04/emergency-calls-data-analysis-project

Welcome to the Emergency Calls Data Analysis project repository. This project is dedicated to extracting, processing, and visualizing data from the "Emergency – 911 Calls, Montgomery County" dataset, sourced from Kaggle. The main objective is to analyze trends in emergency calls in Montgomery County, Pennsylvania, spanning multiple years.

analysis data-analysis data-extraction data-processing data-science data-visualization numpy pandas python seaborn

Last synced: 02 May 2026

https://github.com/rorrell/employmentdata

A Jupyter Notebook where I use group by to analyze the average unemployment rate by year

data-analysis data-visualization jupyter-notebook python3

Last synced: 02 May 2026

https://github.com/rfonod/narrative-visualization

Explores the relationships between countries' GDP, population, and cumulative Olympic medals. Features a narrative visualization of changes over time, critically examining the modern Olympic Games' original vision.

css d3 d3-visualization d3js data-visualization html javascript visualization

Last synced: 09 May 2026

https://github.com/carlosagalicia/sars-cov-2-sequence-analysis

The project explores genetic similarities and differences between SARS-CoV-2 variants, focusing on their distribution across Asian, Hispanic, European, and African populations.

biostrings data-science data-visualization ggplot2 r

Last synced: 20 Jun 2026

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to provide English learners with data that allows them to make a data-driven guess when they encounter words whose stress placement they aren't sure of.

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 20 Jun 2026

https://github.com/holy-angel-university/global-cost-index-analysis

This analysis explores the cost of living across various countries, aiming to provide insights into economic disparities and living standards on a global scale. Utilizing a dataset that includes indices for overall cost of living, groceries, restaurant prices, and rent, we investigate the top and least expensive countries worldwide.

data-science data-visualization exploratory-data-analysis jupyter-notebook python3

Last synced: 02 May 2026

https://github.com/debjyotisaha/web-application-projects-streamlit-phase-2

This repository showcases interactive web applications built using the Streamlit framework.

dashboard data-visualization python streamlit

Last synced: 02 May 2026

https://github.com/benzerinsio/breastcancer-eda

📊 Exploratory Data Analysis (EDA) - Breast Cancer | Exploration of clinical features to identify patterns and relationships in breast cancer diagnosis.

analise-de-dados analise-exploratoria analise-exploratoria-de-dados data-analysis data-visualization diagnosis eda exploratory-data-analysis health-care medical-data python seaborn

Last synced: 02 May 2026

https://github.com/s1dewalker/electric-future

Visual Analysis: Future of Automotive Industry

data data-visualization machine-learning python3 regression-analysis tableau

Last synced: 02 May 2026

https://github.com/peter-gy/autovistype

Probing vision-language model alignment with human expert visual grouping over a stratified sample of the VIS30K dataset.

data-visualization google-genai langchain llm-benchmarking marimo meta-llama mistral multi-label-classification openai polars qwen uv vis30k vision-language-model visual-stimuli visualization-categorization vlm

Last synced: 02 May 2026

https://github.com/gerhynes/d3-notes-app

A simple notes app built to practice D3 selection methods. Built for The Advanced Web Developer Bootcamp.

d3 data-visualization javascript

Last synced: 03 May 2026

https://github.com/gkar90/gdp-vs-life-expectancy

Statistical analysis on GDP vs Life Expectancy

data-science data-visualization statistical-analysis

Last synced: 09 Jun 2026

https://github.com/nitheshgoutham/phonepe-pulse-data

Phonepe Pulse Data Visualization and Exploration: A User-Friendly Tool Using Streamlit and Plotly

data-science data-visualization plotly python sql streamlit

Last synced: 09 Apr 2026

https://github.com/isinghabhishek/data_analysis_with_python

Introduction to Data Analysis covering the basics of Python, Numpy, Pandas, Data Visualization, and Exploratory Data Analysis.

data-visualization exploratory-data-analysis numpy pandas python

Last synced: 03 May 2026

https://github.com/vincenzopalazzo/visualsars2chart

Visual analytics of COVID-19 (SARS 2) data with Python and Tableau

covd-19 covid-2019 covid19 data-visualization datacleaning dataset python3

Last synced: 03 May 2026

https://github.com/monteirooscar98/tarifas-publicas-sp-dieese

Data extraction via web scraping from the Dieese website and analysis of public tariffs in the municipality of São Paulo.

data-analysis data-visualization python webscraping

Last synced: 03 May 2026

https://github.com/farrelfaricaf/exploratorydataanalyst---titanic

This project analyzes the Titanic dataset using exploratory data analysis (EDA) and visualization techniques to identify survival patterns. The goal is to understand how demographic factors like gender and age influenced survival rates during the 1912 disaster.

data data-analysis data-science data-visualization eda python titanic-dataset

Last synced: 31 Jul 2025

https://github.com/miteshgupta07/covid-19-report-dashboard-using-streamlit

A Streamlit dashboard for COVID-19 reporting that provides real-time updates, visualizations, and analysis of global and local COVID-19 data to track the pandemic's progress and impact.

data-visualization python streamlit

Last synced: 03 May 2026

https://github.com/robwiederstein/kytc_loc

Plot Kentucky licensing locations

data-visualization ggmap leaflet r xml2

Last synced: 31 Jul 2025

https://github.com/james-julius/latent-space-explorer

A flythrough 3D map of meaning — type any concept and watch it land near related ideas. In-browser embeddings (no install, no key), pre-seeded scenes, and a multi-model knowledge explorer.

3d-visualization ai claude data-visualization embeddings gemini latent-space llm machine-learning nextjs openai react-three-fiber semantic-search text-embeddings threejs transformers-js typescript umap vector-search webgpu

Last synced: 09 Jun 2026

https://github.com/nix7amcm/fcc-data-viz-cert-projects

These are my projects for the freeCodeCamp Data Visualization certification.

d3 d3-visualization d3js data-visualization data-viz freecodecamp freecodecamp-project html-css-javascript

Last synced: 03 May 2026