An open API service indexing awesome lists of open source software.

Data visualization

Data visualization is the visual depiction of data through the use of graphs, plots, and informational graphics. Its practitioners use statistics and data science to convey the meaning behind data in ethical and accurate ways.

https://github.com/aplgr/grovegrid

Interactive growth maps for rows of trees. CSV in; single HTML out.

alpinejs cli csv data-visualization echarts forestry go heatmap

Last synced: 25 May 2026

https://github.com/niteshchawla/logistics-nn-regression

A case study on India's largest marketplace for intra-city logistics. The dataset contains the features needed to train a regression model that estimates delivery time.

adam-optimizer data-visualization encoding exploratory-data-analysis feature-engineering hidden-layers hyperparameter-tuning keras-tensorflow kerastuner metrics neural-network numpy pandas regression relu scaling sequential-models

Last synced: 10 Apr 2026

https://github.com/atulkp018/investwisesharpe

This project analyzes stock performance using the Sharpe Ratio to evaluate risk-adjusted returns. It includes data visualization, EDA, and Sharpe Ratio computation for Facebook and Amazon stocks, compared against the S&P 500 benchmark.

data-visualization investment-analysis matpoltlib pandas python stock-analysis

Last synced: 10 Apr 2026
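
For readers unfamiliar with the metric named in the investwisesharpe entry above, here is a minimal sketch of a Sharpe Ratio computed against a benchmark. It is not taken from that repository: the function names and the sample returns are hypothetical, and the benchmark return is used in place of a risk-free rate, which is an assumption based on the entry's mention of the S&P 500.

```python
from statistics import mean, stdev


def sharpe_ratio(asset_returns, benchmark_returns):
    """Mean excess return divided by the sample std dev of excess returns.

    Both arguments are equal-length sequences of periodic returns
    (for example daily percentage changes expressed as fractions).
    """
    if len(asset_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    excess = [a - b for a, b in zip(asset_returns, benchmark_returns)]
    # stdev() needs at least two points; identical excess returns give a
    # zero denominator and raise ZeroDivisionError.
    return mean(excess) / stdev(excess)


def annualize(periodic_sharpe, periods_per_year=252):
    """Scale a per-period ratio by sqrt(periods); 252 assumes trading days."""
    return periodic_sharpe * periods_per_year ** 0.5


# Hypothetical daily returns, for illustration only.
stock = [0.02, 0.01, 0.03]
benchmark = [0.01, 0.01, 0.01]
daily = sharpe_ratio(stock, benchmark)
yearly = annualize(daily)
```

A higher value means more excess return per unit of volatility; comparing two stocks against the same benchmark over the same period keeps the ratios comparable.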

https://github.com/an4pdm/data_analysis_escolar

Database project using data provided by the "Portal de Dados Abertos" (Open Data Portal), built to practice my SQL skills.

analise-de-dados data-visualization database mysql project-repository sql study

Last synced: 10 Apr 2026

https://github.com/swethajoseph/urological-cancer-referral-forecast

Analysing and forecasting urological cancer referral patterns for NHS Scotland, aiming to improve management and operational efficiency.

data-visualization datacleaning excel forcasting statistical-analysis tableau time-series-analysis

Last synced: 04 Jan 2026

https://github.com/0xarchit/covid-data-dashboard

This repo contains files for the Data Visualization Covid Data Dashboard assignment.

covid-19 covid19-data dashboard data-visualization streamlit

Last synced: 10 Apr 2026

https://github.com/jaymax01/dvd-rental-data-analysis

Data analysis of a DVD rental database

data-visualization postgresql sql

Last synced: 22 Jul 2025

https://github.com/azaz9026/data_cleaning

Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.

data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/samruddhi3012/insurance-price-forecasting

This is a Machine Learning project where I performed EDA and forecasted the insurance pricing using Linear Regression and XGBoost Regressor.

bayesian-optimization data-visualization exploratory-data-analysis linear-regression machine-learning pipeline statistical-analysis xgboost-regression

Last synced: 31 Aug 2025

https://github.com/sanand0/booksviz

LLM-generated visual insights from the GoodReads 100K dataset

data-visualization llm

Last synced: 20 Jan 2026

https://github.com/brianyu28/old-sheets-flying

Data analysis and graphics tool for The Harvard Crimson's Data and Design Teams

data-visualization harvard-university journalism

Last synced: 15 May 2025

https://github.com/ts-kontakt/interpareto

Python utility for creating interactive Pareto charts from pandas.DataFrame objects

data-visualization html-export pandas plotly

Last synced: 01 Sep 2025

https://github.com/anuppm9917/ecommerce-store-power-bi-projects

A Power BI dashboard for e-commerce sales analysis that provides insights for business growth.

csv-files dashboard data-visualization dataanalysis powerbi

Last synced: 04 Jan 2026

https://github.com/metalbolicx/tipviz

Show data only by hovering an element in your D3.js project.

d3 d3-js d3js data-visualization tooltip

Last synced: 20 Jan 2026

https://github.com/yash-rewalia/airbnb_eda_pandas

The goal of the project is to gather and analyze detailed information on the listings in order to provide insights about the host and price of a property in a particular area, according to your preferences, room type, and number of reviews.

data data-cleaning data-insights data-preprocessing data-visualization matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/teilomillet/mentors

Text JSONL dataset visualiser and modifier.

data-visualization datasets llm-training

Last synced: 10 Feb 2026

https://github.com/maettuu/project-beatblend

Repository for the Master's Project 2024 on Visualizing and Explaining Sequential Song Recommendations through Data Humanism

audio-features aws ci-cd content-based-recommendation data-visualization discogs-api docker fastapi full-stack jwt-token masters-project postgres python recommendation-system redis rest-api spotify-api visual-data vuejs websocket

Last synced: 11 Apr 2026

https://github.com/mjanez/spain-cultural-pulse

Interactive web app to explore contemporary Spanish culture, values, politics & social norms with beautiful data visualizations (Next.js + Leaflet + Recharts + D3). Based on 2024 nationwide survey (3k respondents).

csic culture d3js data-visualization i18n nextjs norpol open-data politics social-norms sociology spain spain-culture spain-politics survey-data tailwindcss

Last synced: 13 Jan 2026

https://github.com/shubham200137/icc-women-s-t20-world-cup-data-analytics

Created a Power BI report to identify top 11 players for a T20 cricket team by scraping data from espncricinfo with Python, cleaning and transforming the data with pandas, and evaluating various player performance metrics.

beautifulsoup4 data-analysis data-visualization numpy-python pandas-python powerbi web-scraping

Last synced: 25 Feb 2025

https://github.com/lotfiferaga/google-play-store-sentiment-analysis

Perform sentiment analysis on Google Play Store reviews using Python. Analyze user feedback to determine the overall sentiment (positive, negative, or neutral) towards various apps. Gain insights to aid developers and businesses in understanding user satisfaction levels and improving their products.

data-analysis data-visualization googleplayservices python reviewsanalysis-nlp

Last synced: 26 Feb 2025

https://github.com/weybsonalves/prevendo-o-atrito-de-clientes

Project in which I walk through the stages of the data science life cycle in order to predict customer attrition for a bank's credit card service.

data-analysis data-science data-visualization machine-learning python

Last synced: 06 May 2026

https://github.com/apsinghanalytics/hranalytics_myersbriggspersonalityinsights

An Excel analytics study exploring the correlation between personality traits and key HR-relevant parameters, including tenure and performance.

data-analysis data-visualization excel pivot-tables

Last synced: 30 Jan 2026

https://github.com/ultra-bugs/pyside6-datatable-widget

A PySide6 DataTable widget with jQuery DataTable-like functionality

data-visualization desktop-app desktop-application gui pyside6 qt qt6 table

Last synced: 30 Jun 2025

https://github.com/arslanr369/eda-journey

Exploratory data analysis (EDA) and visualization projects focusing on diverse datasets, including Bitcoin price trends and Indian restaurant reviews. Each notebook aims to provide insights and showcase data storytelling through visual exploration.

bitcoin data-science data-visualization eda

Last synced: 14 Mar 2025

https://github.com/vidushibhadana/eda-on-nyc-taxi-data

Conducting an Exploratory Data Analysis (EDA) on New York City taxi data and visualizing it through countplots, distribution plots (displot), and histograms using Python and its libraries.

data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/christs8920/process-mining-py

A process mining project that analyzes an event log and discovers its process model.

data-science data-visualization datavisualization pm4py process-mining processmining python

Last synced: 26 May 2026

https://github.com/usk2003/vnrvjiet-lab-work

This repository contains my lab work for the B.Tech CSE-AIML program (2022-2026) under the R22 regulation at VNR Vignana Jyothi Institute of Engineering and Technology. It includes various subjects like Machine Learning, OS, Data Structures, C Programming, and more, showcasing my practical learning and implementations.

c-programming compiler-design computer-networks data-engineering data-structures data-visualization dbms engineering-drawing java machine-learning operating-system python software-engineering

Last synced: 11 Apr 2026

https://github.com/aphp/jupyter-eds-notebooks

jupyter-eds-notebooks provides Docker images with preconfigured Jupyter environments for clinical and health data analysis, tailored for AP‑HP Datalabs and the HELIX platform.

data-analysis data-science data-visualization healthcare lab

Last synced: 13 Jan 2026

https://github.com/naomiwolfe/golden-isles-dashboard2

Interactive tourism analytics dashboard for Georgia's Golden Isles

analytics chartjs dashboard data-visualization georgia golden-isles tailwindcss tourism

Last synced: 05 Oct 2025

https://github.com/ved-coder-king/wheat_ai_project

This project, Smart Wheat Farming AI System, was developed as part of the coursework for the Artificial Intelligence program at Esprit School of Engineering.

agriculture data-analysis data-visualization deep-learning image-classification machine-learning object-detection python wheat

Last synced: 15 Apr 2025

https://github.com/cr2007/d3js-us-electric-vehicle-dashboard

D3.js Interactive Visualization Dashboard - US Electric Vehicle Population

d3-js dashboard data-visualization electric-vehicles interactive-visualizations tesla

Last synced: 15 Mar 2025

https://github.com/samanhur/data_visualization_pcc

First experiences in data visualization with python

data-analysis data-science data-visualization python3

Last synced: 23 Mar 2025

https://github.com/abhay-sinha-0/carpricepredictionproject

A machine learning project that predicts the selling price of a car based on its features such as year, mileage, fuel type, transmission, and more. This model can assist individuals and dealerships in estimating fair market prices for used cars.

artificial-intelligence data-analysis data-science data-visualization exploratory-data-analysis machine-learning-algorithms matplotlib-pyplot mysql-database numpy-library pandas-library python skit-learn sklearn-library

Last synced: 15 May 2025

https://github.com/felinjob/ibm-applied-data-science-capstone

This project, part of the IBM Data Science Professional Certificate specialization, predicts the landing success of SpaceX's Falcon 9. Using data from the SpaceX API and web scraping, the project includes data analysis and machine learning to generate insights about the launches.

data-analysis data-science data-visualization ibm jupyter-notebook machine-learning numpy pandas python scikit-learn seaborn sql

Last synced: 11 Apr 2026

https://github.com/miteshgupta07/streamlit-machine-learning-app

A Streamlit application for interactive exploratory data analysis (EDA) and data visualization, offering dynamic tools to analyze and visualize machine learning datasets.

data-visualization python streamlit

Last synced: 27 Apr 2026

https://github.com/raghavendranhp/phonepe-pulse-data-visualization-and-exploration

This code clones PhonePe data from GitHub. After processing the data, it is displayed in an appealing manner to gain insights from PhonePe's information. This can be used to increase productivity, profits, and focus specifically on business development.

data-visualization githubclone mysql mysqlconnector pandas plotly plotly-dash python sqlalchemy streamlit visualization

Last synced: 11 Apr 2026

https://github.com/amanyadav-07/customer-churn-prediction

Machine Learning project to predict customer churn using Logistic Regression, Random Forest, and XGBoost. Includes data preprocessing, feature engineering, SMOTE balancing, model training, evaluation, and business insights.

accuracy-metrics data-analysis data-visualization logistic-regression machine-learning matplotlib numpy pandas python3 random-forest-classifier seaborn sklearn xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/victorowinoke/custmer-segmentation-using-rfm-python-

Customer Segmentation using the Recency, Frequency and Monetary Values

customer-segmentation data data-visualization python3 science time-series-analysis

Last synced: 26 May 2026
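
As background for the RFM entry above, the following is a minimal sketch of how Recency, Frequency and Monetary values are commonly derived per customer. It does not reproduce that repository's code; the `rfm_table` name, the tuple layout, and the sample transactions are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    as_of: the reference date used to measure recency.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Keep the most recent purchase date seen for this customer.
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((as_of - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# Hypothetical transactions, for illustration only.
sample = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 11), 5.0),
    ("b", date(2024, 1, 5), 20.0),
]
table = rfm_table(sample, as_of=date(2024, 1, 21))
```

Segmentation then typically ranks or bins customers on each of the three values; that step is left out here because the binning scheme varies by project.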

https://github.com/allanreda/telco-customer-churn-predictor-app

A web-based machine learning application that predicts customer churn using a logistic regression model. Built with Scikit-Learn for model training, Gradio for the user interface, and deployed on Google Cloud App Engine. The app allows users to input customer data and receive predictions on churn risk to support business decision-making.

app-engine data-visualization deployment google-cloud gradio hyperparameter-tuning logistic-regression machine-learning numpy pandas scikit-learn

Last synced: 16 Apr 2026

https://github.com/27ahmad/heart-disease-diagnostic-eda

This project conducts Exploratory Data Analysis on a dataset related to heart diagnostic disease, aiming to derive valuable insights from the analysis.

data-analysis data-visualization pandas python

Last synced: 06 May 2026

https://github.com/andersoncrs/analisis_exploratorio_de_datos-eda-_rendimiento_estudiantil

This exploratory data analysis (EDA) of the student performance dataset aims to identify and understand the factors that influence students' academic performance. Through data cleaning, transformation, and visualization, it seeks to discover significant patterns and relationships.

data-analysis data-exploration data-exploration-and-preprocessing data-visualization seaborn

Last synced: 30 Mar 2025

https://github.com/abhinavbammidi1401/covid-19_analytics

A comprehensive notebook of statistical models for analyzing and visualizing Covid-19 data.

analytics covid-19 data-analysis-python data-analytics data-science data-visualization jupyter-notebook predictive-modeling python

Last synced: 19 May 2026

https://github.com/lucertgvby/phat

Graphical PowerShell application designed to help investigators, security analysts, and IT professionals examine email headers for signs of phishing or spoofing. The tool parses headers from .eml and .msg files, highlights important fields, and provides insights into SPF, DKIM, and DMARC results.

data-visualization dimensionality-reduction distributed-computing hashcracking led-matrix-displays mqtt off-chain-compute phala phat raspberry-pi-library single-cell srp-phat unsupervised-learning visualization

Last synced: 21 May 2026

https://github.com/ayaankhan98/covid-19-analysis

Covid-19 Analysis. This repository is part of AMURoboHack 1.0. Here we tried to visualize world Covid-19 data. Data visualization gives an easy way to understand large amounts of data. We plotted the data on a world map so that users can easily get the stats for a country by hovering the mouse pointer over it, and we added zooming on the map to make the interface more attractive and user-friendly.

covid-19 d3js data-visualization topojson

Last synced: 30 Mar 2025

https://github.com/trimoyee-g/phishing-site-predictor

A phishing site prediction model using scikit-learn's Random Forest Classifier, achieving high accuracy and gaining insights into website characteristics.

data-visualization machine-learning python random-forest-classifier scikit-learn

Last synced: 11 Apr 2026

https://github.com/albertofaraujo/pbi_dashboard_prouni

Analyzes data on the quantitative breakdown of PROUNI scholarships granted in 2021.

data-visualization dax-studio power-query powerbi

Last synced: 03 Feb 2026

https://github.com/mitgar14/etl-workshop-1

Workshop #1 (Data Engineer) for the ETL course using Pandas, Matplotlib, SQLAlchemy and Power BI for the creation of the dashboard.

data-engineer data-visualization etl pandas postgresql powerbi python sqlalchemy

Last synced: 11 Apr 2026

https://github.com/sanad343/complete-data-analyst

Data analysis is the process of turning raw data into useful information for decision-making.

data data-visualization datamanipulation eda excel exploratory-data-analysis powerbi python-3 sql tableau

Last synced: 30 Jun 2025

https://github.com/asuquoaa/bar_chart_visualization_with_confidence_intervals_and_interactive_slider

This project visualizes probabilistic data using bar charts with 95% confidence intervals, allowing users to explore deviations from a Value of Interest (V of I) interactively.

data-visualization interactive-visualizations statistics

Last synced: 01 Sep 2025
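
To illustrate the 95% confidence intervals mentioned in the entry above, here is a minimal sketch that computes the interval for a sample mean and checks where a value of interest falls relative to it. It is an assumption-laden illustration, not that project's method: it uses the normal approximation (z = 1.96), and the function names and sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci95(sample):
    """Return (low, high) of the 95% CI for the mean, normal approximation."""
    m = mean(sample)
    # Standard error of the mean; stdev() requires at least two values.
    sem = stdev(sample) / sqrt(len(sample))
    half_width = 1.96 * sem
    return m - half_width, m + half_width


def position(value_of_interest, interval):
    """Classify a value as below, inside, or above the interval."""
    low, high = interval
    if value_of_interest < low:
        return "below"
    if value_of_interest > high:
        return "above"
    return "inside"


# Hypothetical sample, for illustration only.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
ci = mean_ci95(data)
where = position(3.0, ci)
```

In an interactive chart, a classification like this could drive each bar's colour as the user moves the slider; for small samples a t-distribution critical value would be more appropriate than 1.96.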

https://github.com/csoren66/customer-personality-analysis

Predict how different customer segments will respond to a particular product or service.

data-analysis data-visualization python

Last synced: 03 Mar 2025

https://github.com/haonamnguyen/costumer-shopping-trends-analysis

This project analyzes a synthetic dataset of customer shopping behavior to see key trends and insights. Using SQL and Tableau, the analysis focuses on customer demographics, purchase patterns, and preferences, including age distribution, payment methods, shipping types, and top product categories.

data-analysis data-visualization sql tableau

Last synced: 05 Jan 2026

https://github.com/khushi-sabarad/adinsights_dashboard

AdInsights Dashboard: An interactive web dashboard built with Python (Flask, Pandas, Plotly) to visualize and analyze digital advertising performance. Allows filtering by gender, ad type, and location for detailed insights

ad-performance advertising dashboard data-analysis data-visualization flask pandas plotly python web-application

Last synced: 01 May 2026

https://github.com/Gregoritsch3/Exercise_Pandas_1

A Pandas exercise demonstrating the loading, cleaning and reorganization of data in .xlsx or .csv files, descriptive statistics, data visualization, statistical approximation of data with the normal distribution, etc. Libraries include Pandas, NumPy, SciPy, SymPy, Matplotlib,

data-cleaning data-visualization descriptive-statistics matplotlib numpy pandas scipy sympy

Last synced: 01 May 2025

https://github.com/tsopermon/comparison-ml-algorithms

This repository compares the performance of Adaline, Logistic Regression, and Perceptron models on binary classification tasks using linearly, non-linearly, and marginally separable datasets from the Iris dataset. It includes MATLAB implementations, 10-fold cross-validation, and visualizations of decision boundaries and MSE histories.

adaline binary-classification classification-accuracy cross-validation data-visualization decision-boundaries iris-dataset logistic-regression machine-learning matlab mse neural-networks perceptron

Last synced: 15 Mar 2025

https://github.com/dmarks84/ind_project_obesity-multi-class-classification--kaggle

Independent Project - Kaggle Competition -- I worked on the obesity classification data set as part of a Kaggle Competition of the same name, scoring (for accuracy) above 0.9

classification correlation-analysis cross-validation data-modeling data-visualization dataframes eda gridsearchcv matplotlib multiclass-classification numpy pandas python seaborn sklearn statistics supervised-ml

Last synced: 11 Apr 2026

https://github.com/upes-open/open-cryptocurrency-analysis

A web app to visualise and predict the cryptocurrency’s impact by using Web scraping, data exploration, EDA and Data Visualization.

analysis cryptocurrency data-analysis data-science data-visualization jupyter-notebook streamlite visualization

Last synced: 15 Apr 2025

https://github.com/spear97/montecarlo-python

This was a project for my Programming Language Concepts class where we were assigned to create a Monte Carlo simulation using Python.

data-science data-visualization matplotlib-figures matplotlib-python montecarlo pandas-library pandas-python python python-3

Last synced: 23 Mar 2025

https://github.com/yaser-123/energy-consumption-dashboard

A Power BI dashboard to analyze energy consumption for water, gas, and electricity across cities and buildings. Features include interactive charts, drill-down insights, and dynamic filters for easy monitoring and optimization.

dashboard data-analysis data-analytics data-visualization energy-consumption energy-efficiency powerbi

Last synced: 05 Jan 2026

https://github.com/pekiiipy/credit-card-fraud-detection

🔍 Detect credit card fraud efficiently using advanced machine learning techniques, achieving high accuracy rates on a large dataset of transactions.

adasyn anomaly-detection class-imbalance credit-card-fraud data-visualization fraud fraud-detection frauddetection kaggle keras logistic-regression plotly-python postgresql random-forest scikit-learn tensorflow tree-model xgboost

Last synced: 11 Apr 2026

https://github.com/karo23361/toy-store-kpi-power-bi

PowerBI Portfolio Project

csv data data-visualization powerbi

Last synced: 03 Feb 2026

https://github.com/danhnnguyen0606/bitcoin-navigator

Bitcoin Navigator: A data-driven dashboard designed to analyze Bitcoin trends, empowering investors to refine their strategies and identify optimal investment opportunities.

bitcoin btc crypto cryptocurrency data-analysis data-analytics data-science data-visualization investment looker looker-studio

Last synced: 15 Mar 2025

https://github.com/carmendev/covid-19-tracker

Data visualization React.js project deployed with Firebase. Daily statistics about current, recovered and closed cases coming from an API.

data-visualization firebase numeral reactjs

Last synced: 11 Apr 2026

https://github.com/franloza/contratosdemadrid

This project is an interactive web application for exploring and analyzing public contracts in the Community of Madrid. It allows users to search for companies and view their contract details, aiming to promote transparency and facilitate access to public information.

data-visualization duckdb evidence open-data

Last synced: 23 Jun 2026

https://github.com/rakeshdabbikar4/sales-performance-dashboard-powerbi

Interactive Sales Performance Dashboard built using Power BI to analyze revenue, orders, profit, trends, and regional performance.

business-analytics business-intelligence data-analytics data-visualization dax powerbi sales-dashboard

Last synced: 13 Jan 2026

https://github.com/gautam25raj/data-sync

A powerful platform designed to revolutionize the way teams collaborate and visualize data.

chat collaboration data-visualization express material-tailwind mongodb mongoose nextjs nodejs reactjs redux redux-toolkit tableau tableau-dashboard tailwindcss

Last synced: 11 Apr 2026

https://github.com/apelullo/cobalt_health_wellness_platform_ops

Cobalt is a mental health and wellness platform created for Penn Medicine employees that serves as a hub for support services such as therapy, wellness coaching, topic- and population-specific group sessions, and a variety of self-help resources.

academic-research data-cleaning-pipeline data-validation data-visualization decision-support feature-development healthcare-data hipaa key-performance-metrics mental-health-services operations-research product-analytics reporting-pipeline

Last synced: 23 Mar 2025

https://github.com/alfiyafatima09/heuristic_algorithms

This project compares pathfinding algorithms (A*, Greedy Best-First, and Hill Climbing) by visualizing their paths and comparing performance metrics (nodes explored, memory, execution time) on a grid with obstacles.

algorithms data-visualization

Last synced: 20 Jan 2026

https://github.com/sayamalt/taxi-trip-fare-prediction

Successfully created a machine learning model which can accurately predict the fare of a taxi trip based on several features such as trip duration, tip amount, etc.

cross-validation data-exploration-and-preprocessing data-visualization exploratory-data-analysis feature-engineering hyperparameter-optimization machine-learning model-deployment model-selection model-training-and-evaluation regression-modelling

Last synced: 09 Nov 2025

https://github.com/sayamalt/life-expectancy-prediction

Successfully established a machine learning model which can accurately predict the expected life duration of a human being based on several demographic features such as alcohol consumption per capita, average BMI of entire population, etc.

cross-validation data-cleaning-and-preprocessing data-visualization docker end-to-end-pipeline exploratory-data-analysis feature-engineering github-actions-workflow hyperparameter-tuning machine-learning model-deployment model-training-and-evaluation

Last synced: 04 May 2026

https://github.com/sayamalt/employee-attrition-prediction

Successfully established a machine learning model which can accurately predict whether an employee of a given company will leave it in the impending future or not, based on several employee details and employment metrics.

binary-classification continuous-deployment continuous-integration cross-validation data-exploration-and-preprocessing data-visualization exploratory-data-analysis feature-engineering hyperparameter-optimization machine-learning model-deployment model-training-and-evaluation

Last synced: 08 Oct 2025

https://github.com/jiyanshgarg/delhivery-logistics-data-analysis

This project analyzes Delhivery's logistics delivery dataset to understand delivery performance, route efficiency, and operational patterns using data analytics techniques. The analysis focuses on transforming raw segment-level logistics data into meaningful trip-level insights that can help improve delivery efficiency and route planning.

business-insights-and-recommendations data-analysis data-cleaning-and-preprocessing data-visualization exploratory-data-analysis feature-engineering feature-extraction feature-selection hypothesis-testing outlier-detection outlier-treatment

Last synced: 12 Jun 2026

https://github.com/leosimoes/datascienceacademy-powerbi-clinicadebi

Activities from the Data Science Academy courses "Análise de Dados com Microsoft Power BI" (Data Analysis with Microsoft Power BI) and "Clínica de BI" (BI Clinic).

dashboards data-analysis data-visualization microsoft-power-bi power-bi

Last synced: 05 Jan 2026

https://github.com/sayamalt/twitter-sentiment-analysis

Successfully established a machine learning model which can accurately classify the sentiment of any particular tweet into either positive, negative or neutral category.

data-visualization exploratory-data-analysis nlp sentiment-analysis supervised-learning text-processing

Last synced: 09 Nov 2025

https://github.com/sayamalt/house-price-prediction

Successfully created a regression model for predicting the price of any house, excluding very large estates and mansions, to a significant level of accuracy.

data-visualization exploratory-data-analysis feature-engineering feature-selection machine-learning regression-analysis regression-testing

Last synced: 09 Nov 2025

https://github.com/farhannirzhor/vrinda_store_excel_project

This project is about Excel analysis and visualization. In this project, I analyzed Vrinda Store's sales and made an annual sales report.

data-analysis data-cleaning data-preprocessing data-visualization microsoft-excel reporting

Last synced: 05 Jan 2026

https://github.com/eslamdyab21/data-visualization-using-matplotlib-and-seaborn

This is the final project in the Udacity nanodegree program. It's about data visualization.

data data-analysis data-visualization matplotlib pandas python seaborn udacity udacity-data-analyst-nanodegree

Last synced: 09 May 2026

https://github.com/tolumie/exploratory-data-analytics-projects

Exploratory Data Analytics – A collection of projects covering data exploration, feature engineering, hypothesis testing, and predictive modeling across diverse datasets, including insurance, real estate, laptops, cars, COVID-19, and the Olympics.

data-analysis data-visualization data-wrangling exploratory-data-analysis-eda feature-engineering hypothesis-testing machine-learning matplotlib numpy pandas predictive-modeling python seaborn statistical-analysis

Last synced: 11 Apr 2026

https://github.com/sayamalt/steel-energy-consumption-prediction-using-pyspark

Successfully established a machine learning model using PySpark which can precisely predict the energy consumption of the steel industry, up to an r2 score of approximately 99.5%.

apache-spark big-data-analytics big-data-processing cross-validation data-visualization exploratory-data-analysis hyperparameter-tuning machine-learning model-training-and-evaluation python regression spark sql

Last synced: 10 Mar 2026

https://github.com/thenorthkun/movies-dataset-analysis

Analysis and categorization of movies based on actors, genres, gross, etc. 🦸🏼🧜🏼‍♀️🎧

data-analysis data-visualization filtering

Last synced: 23 Mar 2025