An open API service indexing awesome lists of open source software.

Data visualization

Data visualization is the visual depiction of data through the use of graphs, plots, and informational graphics. Its practitioners use statistics and data science to convey the meaning behind data in ethical and accurate ways.

https://github.com/vaxdata22/redfin-analytics-etl-using-amazon-emr-by-airflow-on-ec2

This is an end-to-end AWS Cloud ETL project. The data pipeline uses an Amazon EMR cluster managed by Apache Airflow running on an AWS EC2 instance. It demonstrates how to build orchestration that performs data transformation with Amazon EMR and automatically ingests the data into Snowflake via Snowpipe. It also features Power BI.

amazon-emr-cluster apache-airflow apache-spark aws-ec2 aws-s3 business-intelligence dags data-visualization etl-pipeline google-colab-notebook orchestration power-bi pyspark redfin snowflake snowpipe sqs-queue

Last synced: 02 May 2026

https://github.com/harshindcoder/salifort_motors_project

This people analytics project analyzes factors influencing employee turnover and predicts whether an employee is likely to leave. It aims to uncover patterns behind departures, helping Salifort improve retention, workplace culture, and professional growth strategies.

data-analysis data-science data-visualization hr-analytics machine-learning tree-models

Last synced: 02 May 2026

https://github.com/rbreeze/dashboard

My personal health dashboard, with daily stats on food and sleep. It has undergone several redesigns since 2015.

css dashboard data data-visualization design front-end google-sheets google-sheets-api health html javascript personal-health-record personal-website running static static-site visualization

Last synced: 02 May 2026

https://github.com/shridhar1504/milk-production-time-series-forecasting-datascience-project

This project uses time series forecasting to predict future milk production. The data is monthly milk production from January 1962 to December 1975. An ARIMA (autoregressive integrated moving average) model is used to forecast milk production, and the model is evaluated using various metrics.

adf arima-model augmented-dickey-fuller-test data-analysis data-analytics data-science data-visualization eda exploratory-data-analysis machine-learning machine-learning-algorithms python python3 residuals sarimax seasonality time-series time-series-forecasting trends

Last synced: 02 May 2026

https://github.com/gerhynes/d3-mobile-subscription-literacy-scatterplot

A D3 scatterplot showing mobile phone subscriptions against literacy rates. Built for The Advanced Web Developer Bootcamp.

d3 data-visualization javascript

Last synced: 02 May 2026

https://github.com/hafs96/prediction_consommation-de-carburant

In this project, the goal is to develop a model that predicts whether a car has high or low fuel consumption based on its technical characteristics.

analysis data data-visualization machine-learning testing training

Last synced: 09 Jun 2026

https://github.com/kimaruthagna/segmente

A journey through understanding customer segmentation using Python, with the general goal of encouraging data-driven decision-making.

clustering crosstab customer-segmentation data-science data-visualization knn-classification lifetime-value pandas rfm-analysis seaborn

Last synced: 02 May 2026

https://github.com/ronitjariwala/prodigy_ds_04

Prodigy InfoTech Data Science Internship Task-4

data-analysis data-science data-visualization python

Last synced: 02 May 2026

https://github.com/bhawnagoyal18/ai-doctor-a-symptom-checker-disease-predictor

AI Doctor is an intelligent healthcare application that utilizes machine learning (ML) and Python to predict potential diseases based on user-input symptoms. The project integrates data from multiple medical datasets and provides an interactive web-based UI for an intuitive user experience.

data-analysis data-engineering data-visualization dataset flask html5 machine-learning python sql stacking statistics

Last synced: 02 May 2026

https://github.com/fatihilhan42/spotify-songs-recommendations-system_with_python

We developed a song recommendation system for users based on the data in our Spotify song dataset. The dataset and other applications are given in the description. Have a nice day.

data-analysis data-science data-visualization jupyter-notebook python recommendation-engine recommendation-system

Last synced: 02 May 2026

https://github.com/debjyotisaha/web-application-projects-streamlit-phase-2

This repository showcases interactive web applications built using the Streamlit framework.

dashboard data-visualization python streamlit

Last synced: 02 May 2026

https://github.com/benzerinsio/breastcancer-eda

📊 Exploratory Data Analysis (EDA) - Breast Cancer | Exploring clinical features to identify patterns and relationships in breast cancer diagnosis.

analise-de-dados analise-exploratoria analise-exploratoria-de-dados data-analysis data-visualization diagnosis eda exploratory-data-analysis health-care medical-data python seaborn

Last synced: 02 May 2026

https://github.com/peter-gy/autovistype

Probing vision-language model alignment with human expert visual grouping over a stratified sample of the VIS30K dataset.

data-visualization google-genai langchain llm-benchmarking marimo meta-llama mistral multi-label-classification openai polars qwen uv vis30k vision-language-model visual-stimuli visualization-categorization vlm

Last synced: 02 May 2026

https://github.com/femincan/d3-treemap-diagram

My solution for the Visualize Data with a Treemap Diagram project on FCC.

css3 d3js data-visualization html5 javascript

Last synced: 02 May 2026

https://github.com/gkar90/gdp-vs-life-expectancy

Statistical analysis on GDP vs Life Expectancy

data-science data-visualization statistical-analysis

Last synced: 09 Jun 2026

https://github.com/rezowanrahat/netflix_analysis

Data analysis of Netflix content using Python, Pandas, and Seaborn

data-analysis data-visualization netflix pandas python

Last synced: 07 May 2026

https://github.com/haroontrailblazer/user_behavioral_analysis

Social Media User Engagement Analysis Using Power BI

data-analysis data-science data-visualization database powerbi

Last synced: 29 Mar 2025

https://github.com/arction/lcjs-example-0017-largelinechartxy

Example visualization of large line chart (several million data points)

data-visualization lightningchart-js line-chart template

Last synced: 12 Mar 2025

https://github.com/jessicaevelin/datascience

Repository with activities, exercises, and projects completed during my Data Science studies, based on courses, books, videos, and online content.

data-science data-visualization exercises jupyter machine-learning pandas projects python study

Last synced: 21 Jun 2025

https://github.com/callmemaverick/game-of-thrones-investigating-episodes

Data Science project to analyze the duration of Game of Thrones episodes

data-science data-visualization matplotlib pandas-python python

Last synced: 19 Dec 2025

https://github.com/alpkanoz/ibm_data_science_professional_certificate

This repository contains projects and training materials completed during the IBM Data Science Professional Certificate course.

classification clustering data-analysis data-science data-visualization dataframe ibm ibm-watson machine-learning mathplotlib pandas predictive-modeling python scikit-learn

Last synced: 07 Mar 2026

https://github.com/hirudikaanupama/student-score-prediction-linear-regression

Student scores are predicted and analyzed from selected features using only the linear regression machine learning algorithm. This project covers all methods of linear regression theory.

cross-validation data-cleaning data-visualization hyperparameter-tuning jupiter-notebook lasso-regression linear-regression machine-learning-algorithms multiple-linear-regression prediction-model python regularization ridge-regression student-score-prediction

Last synced: 26 Apr 2026

https://github.com/kate8382/frontend-module

Frontend module for a web application with user authentication, real-time dashboard, and data management

authentication dashboards data-visualization frontend

Last synced: 21 Jun 2025

https://github.com/anuj7411/bankofbaroda-candlestick-dashboard

An interactive stock market visualization project using Python, Pandas, and Plotly to analyze Bank of Baroda price movement through a candlestick dashboard.

candlestick-chart dashboard data-visualization financial-data jupyter-notebook pandas plotly python stock-market time-series

Last synced: 17 May 2026

https://github.com/bayunova28/healthcare_analytics

This repository contains a data analytics project from the healthcare industry.

data-analytics data-engineering data-visualization healthcare pyspark sql

Last synced: 21 Jun 2025

https://github.com/hirudikaanupama/email-spam-detection-logistic-regression

This model can predict whether an email is spam or not. The logistic regression machine learning algorithm is used to train this model.

accuracy-score classification classification-report confusionmatrix data-visualization logistic-regression machine-learning roc-curve

Last synced: 11 Sep 2025

https://github.com/dmytrori/himalayan_expeditions

Himalayan expedition stats, 1905–2020

alpinism data-analysis data-visualization pandas-python

Last synced: 21 Jun 2025

https://github.com/pjaiswalusf/heart-failure-prediction

A machine learning project predicting heart failure risk using Random Forest and XGBoost. It involves data cleaning, feature engineering, and EDA before training. The best model is saved using Joblib. Key techniques: outlier detection, feature scaling, and optimization.

data-processing data-visualization feature-engineering machine-learning model-training optimization random-forest-classifier saving-model xgboost

Last synced: 07 Mar 2026

https://github.com/usk2003/vnrvjiet-lab-work

This repository contains my lab work for the B.Tech CSE-AIML program (2022-2026) under the R22 regulation at VNR Vignana Jyothi Institute of Engineering and Technology. It includes various subjects like Machine Learning, OS, Data Structures, C Programming, and more, showcasing my practical learning and implementations.

c-programming compiler-design computer-networks data-engineering data-structures data-visualization dbms engineering-drawing java machine-learning operating-system python software-engineering

Last synced: 11 Apr 2026

https://github.com/rajesh9943/decoding-sales-patterns-strategic-insights-from-data

This project identifies the key drivers of sales and uncovers patterns for strategic decision-making. It analyzes purchasing behavior by area, town, and commodity type, and tracks customer choices over time. Descriptive statistics and time series analysis were used to reveal key sales trends across a four-year period.

data-analytics data-processing data-visualization data-wrangling reporting sales-analysis sales-growth

Last synced: 11 Jul 2025

https://github.com/samaalharbi2/virtual-work-experience---data-analysis-at-stc

Virtual Work Experience in Data Analysis at STC

analysis data data-visualization misk stc

Last synced: 20 Jun 2025

https://github.com/hariprasath-v/machinehack-music_genre_classification_weekend_hackathon_edition_2

Predict the genre of songs from tunable audio track features such as energy, tempo, key, mode, valence, and others.

data-visualization exploratory-data-analysis machine-learning

Last synced: 17 Apr 2026

https://github.com/omerdduran/riskfactor-heart

This ML project predicts heart disease using logistic regression on the Cleveland Heart Disease UCI dataset, featuring advanced preprocessing and medical feature engineering, achieving 82.1% accuracy with strong cross-validation.

cardiovascular-health data-science data-visualization heart-disease-prediction logistic-regression machine-learning medical-ai scikit-learn

Last synced: 14 May 2026

https://github.com/tashi-2004/apache-hadoop-spark-hive-cyberanalytics

This project utilizes Apache Hadoop, Hive, and PySpark to process and analyze the UNSW-NB15 dataset, enabling advanced query analysis, machine learning modeling, and visualization. The project demonstrates efficient data ingestion, processing, and predictive analytics for network security insights.

ai apache-hadoop apache-hive big-data-analytics big-data-processing data-analysis data-engineering data-science data-security data-visualization hdfs machine-learning network-analysis network-security pyspark python3 threat-detection unsw-nb15-dataset

Last synced: 02 May 2026

https://github.com/rmrt1n/chess_analysis_project

Web scraping and analysing the games of Hikaru Nakamura

chess data-analytics data-visualization eda rvest tidyverse web-scraping

Last synced: 15 Jan 2026

https://github.com/hemangsharma/bookingdataanalysisreport

The report helps understand key trends and insights around customer bookings, pricing, and other related attributes.

analysis data data-analysis data-analytics data-visualization streamlit streamlit-dashboard

Last synced: 14 May 2026

https://github.com/jbalooshie/stock-analysis

A VBA script that performs basic stock analysis. Created while participating in a Data Analytics Bootcamp.

data-science data-visualization excel microsoft vba vba-excel vba-macros vba-script

Last synced: 20 Jan 2026

https://github.com/saisurajmatta/cryptocurrency-market-analyzer-python-project

Cryptocurrency Market Analyzer: a Python script that uses the CoinMarketCap API to fetch, analyze, and visualize real-time trends of the top 15 cryptocurrencies over different time intervals.

data-analytics data-visualization matplotlib pandas python seaborn

Last synced: 05 May 2026

https://github.com/kgelli/news-sentiment-analysis-pipeline-with-microsoft-fabric

End-to-end news sentiment analysis pipeline built with Microsoft Fabric, analyzing Bing News API data with sentiment analysis, visualization in Power BI, and real-time alerts via Teams

azure bing-api data-activator data-engineering data-pipeline data-visualization fabric microsoft-fabric one-lake-synapse power-bi sentiment-analysis

Last synced: 10 May 2026

https://github.com/theshefer/covid-map

Interactive map showing COVID data, implemented in R

big-data data-visualization r r-studio

Last synced: 20 Jun 2025

https://github.com/toluwaa-o/stears-lite-overview

Central overview repository for the Stears Lite project — documentation, resources, and links to frontend and backend repositories.

africa charts data data-aggregation data-visualization documentation fastapi nextjs project-overview

Last synced: 14 May 2026

https://github.com/sayamalt/credit-card-approval-prediction

A machine learning model that predicts, with up to 100% accuracy, whether a given applicant's credit card application will be approved, based on several demographic features such as applicant age, total income, marital status, total years of work experience, etc.

binary-classification cicd-deployment cross-validation data-exploration-and-preprocessing data-visualization exploratory-data-analysis feature-engineering hyperparameter-optimization machine-learning model-deployment model-retraining model-selection model-testing model-training-and-evaluation

Last synced: 09 Nov 2025

https://github.com/zulfachafidz/green_horizon_forecasting_peak_organic_avocado_sales_with_the_prophet_algorithm

The Green Horizon Project leverages the Prophet algorithm to predict peak sales of organic avocados, supporting the campaign "APEAM GO ORGANIC." Using Python and Looker Studio, this analysis aims to provide deep insight into sales trends and potential, forming the basis of smarter marketing strategies.

algorithm algorithms analytics data data-analysis data-engineering data-mining data-science data-visualization forecasting machine-learning machine-learning-algorithms prophet-model python python-script

Last synced: 17 May 2026

https://github.com/aran203/cricanalytics

ADSC Fall 24 Project for cricket analytics with hawkeye data

data-engineering data-visualization python streamlit

Last synced: 14 May 2026

https://github.com/casperkristiansson/finance-tracker

A project that solved a problem of mine: tracking my finances. This finance tracking application gives overviews of expenses and income, giving its users an easy way to explore their data.

dashboard data-visualization finance-management firebase-auth react

Last synced: 29 Dec 2025

https://github.com/ahmetzamanis/clusteringcountry

Non-hierarchical k-medoids clustering on a dataset of country statistics.

clustering data-science data-visualization k-medoids machine-learning r rmarkdown unsupervised-learning

Last synced: 16 Dec 2025

https://github.com/benmar2406/rent-in-germany

Interactive visualizations and maps on rent prices and income in Germany, built with Svelte.

charts d3 d3-visualization d3js data-analysis data-visualization gis gis-data infographic infographics map mapbox mapbox-gl mapbox-gl-js mapboxgl svelte

Last synced: 26 Mar 2025

https://github.com/madrury/hot-sauce

Simulation of a Hot Sauce Spiciness Dataset

data-analysis data-science data-visualization dataset machine-learning

Last synced: 16 May 2026

https://github.com/aglowraph/gromacs-xvg-plot-script

A Python script for automating the plotting of .xvg files from GROMACS simulations, with dynamic labeling, time unit detection, and colorful visualization. This script reads, plots, and saves each .xvg file in the same directory, making data analysis more efficient.

automation computational-chemistry data-visualization gromacs matplotlib molecular-dynamics numpy python scientific-computing xvg-plotting

Last synced: 18 May 2026

https://github.com/whisplnspace/insightgenie

InsightGenie is an AI-powered data analyst that lets you upload files, ask questions, and get insights with visualizations

data-analysis data-science data-visualization deployment gemini-api huggingface nlp

Last synced: 19 Jun 2025

https://github.com/as16082023/manufacturing-downtime-analysis

For the Maven Analytics data challenge, this project analyzed manufacturing downtime for a soda production company using Excel, identifying key issues and root causes of delays. Insights were shared through tables, charts, and a concise report with actionable recommendations.

advanced-excel data-visualization excel

Last synced: 20 Jan 2026

https://github.com/lucasfloresc/final_project

This is the final project of the Ironhack Bootcamp. In this project I applied all the methods and techniques learned in the Bootcamp, such as web scraping and API extraction, data cleaning and processing with Python, Python logic, machine learning, and data visualization. Everything is displayed in Streamlit for a more user-friendly interface.

data-analysis data-visualization machine-learning python streamlit webscraping

Last synced: 08 May 2026

https://github.com/naomiwolfe/golden-isles-dashboard2

Interactive tourism analytics dashboard for Georgia's Golden Isles

analytics chartjs dashboard data-visualization georgia golden-isles tailwindcss tourism

Last synced: 05 Oct 2025

https://github.com/rafinha0rafinha/web-analyzer-backend

(Legacy) This is the backend for Mazaoro SARLU's lead magnet "Web Analyzer". This project analyzes websites using Google Lighthouse and returns a detailed report consumed by the frontend.

azure-app-service azure-devops chartjs cicd data-analysis data-science data-visualization express flask hacktoberfest lighthouse numpy sentiment-analysis vader-sentiment-analyzer

Last synced: 10 Apr 2026

https://github.com/mkk-1817/cvip-ds-exploratory_data_analysis-terrorism

This repository explores global terrorism trends by analyzing the Global Terrorism Database to uncover temporal patterns, identify top terrorist groups, examine attack types, and gain insights into geographical and success/failure dynamics.

coderscave data-analysis data-science data-visualization eda exploratory-data-analysis python terrorism-analysis

Last synced: 19 Jun 2025

https://github.com/kaczmarj/car-safety-shiny

An R Shiny app -- final project for BMI 530

cars data-visualization nhtsa shiny visualization

Last synced: 02 Feb 2026

https://github.com/saravanansuriya/streamlit

Streamlit Tutorial for machine learning and data science.

data-visualization python-script streamlit-webapp

Last synced: 18 May 2026

https://github.com/sharinas/mapped_travel_locations

A web-based Python mapping project of specific places around the world, with interactive pop-ups and color-coded markers. The project uses folium, pandas, Python, and a .csv file to store data.

csv data-visualization folium mapping pandas pipenv python

Last synced: 18 May 2026

https://github.com/tanyakuznetsova/music_mental_health

Harnessing music's power for better mental health: genre recommendations and data-driven analysis of listeners' trends

data-visualization decision-tree decision-tree-classifier exploratory-data-analysis k-means-clustering pca-analysis recommendation-system recommender-system surprise-python

Last synced: 11 Jul 2025

https://github.com/ianjure/simple-corr

A simple data correlation visualizer built in Streamlit.

data-visualization streamlit

Last synced: 18 May 2026

https://github.com/mah-22/room-occupancy-prediction-using-environmental-sensor-data

This project uses environmental sensor data to predict room occupancy, providing valuable insights for efficient energy management and space utilization in buildings. By analyzing factors like temperature, humidity, and light levels, the model aims to accurately forecast when rooms will be occupied, optimizing resources and enhancing overall buildi

classification data-science data-visualization exploratory-data-analysis machine-learning numpy pandas python seaborn time-series

Last synced: 07 May 2026

https://github.com/yash22222/olympic-games-analytics-using-apache-spark

The "Olympic Games Analytics Using Apache Spark Databricks" project explores data from the Olympic Games (1896-2016) to identify trends and insights. Using Apache Spark for big data processing and Databricks for visualization, the project analyzes key factors like top-performing countries and athlete attributes, showcasing real-world analytics.

apache apache-kafka apache-spark big-data-analytics csv data data-analytics data-visualization databricks excel mysql olympics regions

Last synced: 03 May 2026

https://github.com/marco210210/football-analytics

Football Analytics is a project that collects, analyzes, and visualizes performance data for football teams and players during the Serie A 2017/18 season, using database structures and machine learning models to provide insights into match events and player actions.

data-analytics data-preprocessing data-visualization football-analytics football-performance-analysis machine-learning mongodb mplsoccer python sports-data

Last synced: 27 Feb 2026

https://github.com/mae776569/weratedogs-wrangling

Wrangling WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations

data-analysis data-science data-visualization tweets twitter-api

Last synced: 25 Jan 2026

https://github.com/sadratehranian/pem-fuel-cell

The methodology section details the use of Python for data processing and analysis, employing statistical and machine learning-based anomaly detection techniques to identify potential issues in fuel cell stacks. It emphasizes data preprocessing, feature engineering, exploratory data analysis (EDA), and anomaly detection.

anomaly-detection data-analysis data-science data-visualization exploratory-data-analysis feature-engineering fuel-cell machine-learning preprocessing python statistical-analysis visual-studio-code

Last synced: 26 Mar 2025

https://github.com/codeslash21/explore_weather_trend

Weather trend analysis for the world and the city of Delhi (India) over the last 150 years.

data-visualization nanodegree-project python3 sql

Last synced: 03 Jul 2026

https://github.com/yaph/gh-browser-cloud

A word cloud based on browser mentions in GitHub commit messages.

big-query data-processing data-visualization github webbrowser wordcloud

Last synced: 16 May 2026

https://github.com/nafisrayan/crypto-trading-platform

This React Crypto Exchange Template is designed to provide a solid foundation for building a comprehensive cryptocurrency exchange platform. With its sleek and modern design, this template is perfect for anyone looking to create a user-friendly and intuitive trading experience.

crypto dashboard data-analysis data-visualization react template

Last synced: 16 May 2026

https://github.com/willmeyers/usgs-groundwater-trends

Visualized USGS groundwater level trends

data-visualization

Last synced: 30 Oct 2025

https://github.com/rafay99-epic/metricmate

Metric Mate is a modern, Python-based GUI tool for visualizing and analyzing gaming performance metrics with a sleek Tokyo Night theme.

data-visualization python python-gui-tkinter python-script

Last synced: 11 May 2025

https://github.com/benzerinsio/onlineretail-tableau

📊 A basic interactive dashboard created in Tableau to explore an online store's sales, with visualizations of revenue by region and trends over time.

data-visualization eda sales-analysis tableau visualizacao-de-dados

Last synced: 09 Feb 2026

https://github.com/luka-j/csw5-eda

Materials for CS Week 5 lecture on exploratory data analysis

data-visualization r shiny tidyverse

Last synced: 26 Apr 2026

https://github.com/danitilahun/exploratory-data-analysis-projects

This repository contains a collection of my personal Exploratory Data Analysis (EDA) projects. Each project involves exploring various datasets to gain insights, uncover patterns, and visualize trends.

data-analysis data-science data-visualization exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/shihjen/startup_grant_dashboard

Dashboard for Monitoring Research Laboratory Expenses

dashboard data-visualization python streamlit

Last synced: 07 May 2026

https://github.com/arction/lcjs-example-0009-severalaxisxy

A demo application showcasing the use of multiple axes in LightningChart JS.

axis chart data-visualization lcjs lightningchart-js

Last synced: 12 Mar 2025

https://github.com/bretsw/eme6356-su26-module5

Slide deck for EME6356, Module 5: Data Visualization (Summer 2026)

analytics data-analytics data-visualization slides

Last synced: 02 Jul 2026

https://github.com/mfakhriazhar/ecom-qtt-prediction

In e-commerce, understanding seasonal sales trends and best-selling products is critical to business strategy. However, companies often struggle with predicting sales, determining factors that influence sales (discounts, product categories, locations), and optimizing stock and marketing.

data-analysis data-science data-visualization e-commerce-project eda machine-learning python

Last synced: 19 May 2026