An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/himanshubhosale25/ai-insightful-quiz-analytics

This project analyzes student quiz performance data, providing visualizations and AI-generated feedback. It uses FastAPI for the backend, React for the frontend, and OpenAI LLMs to deliver personalized insights and actionable recommendations for students.

data-analysis fastapi openai-api react student-performance

Last synced: 11 Mar 2025

https://github.com/sco1/xbmini-py

Python Toolkit for the GCDC HAM

data-analysis data-visualization python python3

Last synced: 07 May 2025

https://github.com/ljadhav25/data-engineering-poc

This repository contains a beginner-level Data Engineering Proof of Concept (POC) project designed for practice. The objective is to provide hands-on experience with data engineering concepts, including data extraction, transformation, loading (ETL), and basic data analysis. This project is ideal for those looking to build foundational skills in da

data-analysis etl matplotlib numpy pandas python

Last synced: 13 Apr 2026

https://github.com/DCS-training/IntroToStatistics

This repository contains all the materials for the Introduction to Statistics course. See the README file.

data-analysis r rmarkdown statistics

Last synced: 25 Apr 2025

https://github.com/extwiii/datascience-jhu

Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization

Last synced: 05 Jul 2025

https://github.com/jameswrigley/laph

A node-based data analysis program.

cpp data-analysis nodes qml

Last synced: 05 Jun 2026

https://github.com/namratha2301/python-dashboard-streamlit

Experimenting with Streamlit. Streamlit app provides an interactive visualization of the best-selling books, showcasing trends, top-selling books, top authors, genre distributions, and sales by decade.

css dashboard data-analysis pandas plotly python seaborn streamlit

Last synced: 05 May 2026

https://github.com/crazy-dot/instagram_user_analytics

Analysis of Popular Social Media Network - Instagram

data-analysis instagram-analytics project-repository trainity

Last synced: 07 Jan 2026

https://github.com/samruddhi3012/tata-data-visualization

Hi! This repo contains the dashboard I created using Tableau for TATA Data Visualization Training!

data-analysis data-visualization tableau tata

Last synced: 07 Jan 2026

https://github.com/mr-chang95/udacity_movie_project

Movie Data Analysis and Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.

data-analysis data-visualization jupyter-notebook movie python

Last synced: 13 Apr 2026

https://github.com/pinedah/sleep-data-analysis-exercise

Analysis of a medical sleep dataset, exploring duration, quality, and related factors. Includes data cleaning, EDA, and visualizations with Python (pandas, numpy, matplotlib, seaborn, scipy).

data-analysis data-science escom numpy pandas python school-project scipy

Last synced: 13 Apr 2026

https://github.com/evan-dg31/data-science

Exploratory Data Analysis (EDA), Predictive Modeling (Supervised and Unsupervised), Regression, Classification, Clustering

classification clustering data-analysis data-science data-visualization machine-learning matplotlib numpy pandas python regression-analysis seaborn

Last synced: 13 Apr 2026

https://github.com/ray-chew/pycsam

pyCSAM is a robust approach for approximating geodesic subgrid-scale orographic spectra with applications to weather forecasting and broader data analysis

data-analysis gmted icon-model merit-dem orographic spectral-analysis topography weather-forecast

Last synced: 28 Feb 2025

https://github.com/bala-1409/tableau-visualization-viz.-project

This repository contains visualization projects built with Tableau. The visualizations yield insights and strategies that help a business grow its profit margins and, in some cases, provide social value by estimating the damage and intensity of calamities.

dashboard data-analysis data-science data-visualization exploratory-data-analysis tableau tableau-dashboards tableau-public visualization

Last synced: 04 Feb 2026

https://github.com/badranalyst/tips-dataset-analysis-dashboard-with-streamlit-and-plotly

Interactive Streamlit dashboard analyzing the Seaborn 'tips' dataset, which records information on restaurant bills, including total bill amounts, tips, customer demographics (e.g., gender, smoking status), and dining details (e.g., day, time). Visualized with Plotly for insights into tipping patterns.

data-analysis data-analytics data-visualization dataset eda exploratory-data-analysis matplotlib matplotlib-pyplot numpy pandas plotly python seaborn streamlit

Last synced: 13 Apr 2026

https://github.com/lijesh010/covid-19_global_analytics_power_bi_project

This repository is a data visualization project that offers an in-depth analysis of the Covid-19 pandemic using Microsoft Power BI. This interactive dashboard provides valuable insights into key metrics related to Covid-19 cases, deaths, recoveries, and more, helping users understand the global impact of the pandemic.

dashboard data-analysis data-visualization powerbi report

Last synced: 08 Jan 2026

https://github.com/singhrdeep/croppilot

CropPilot is a lightweight, Python-based command-line tool designed to help small-scale farmers, gardeners, and students manage crop data, track profits, and explore sustainable practices. Built for usability and extensibility.

agriculture data-analysis farm-management open-source python

Last synced: 25 Apr 2025

https://github.com/shivamsharma32/customer-churn-analysis-power-bi-

This project is about analyzing and visualizing customer churn data using Power BI. Customer churn is the percentage of customers who stop doing business with a company over a given period of time. It is an important metric for businesses to understand why customers leave and how to retain them.

data-analysis dataanalytics datavisualization powerbi

Last synced: 15 Jan 2026
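
The churn definition in the entry above (customers who leave during a period, as a percentage of customers) can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the listed repository, which uses Power BI. The function name `churn_rate`, the sample numbers, and the choice of the start-of-period customer count as the denominator are all assumptions.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return churn as a percentage of the customers present at the start of the period.

    Assumption: the denominator is the customer count at the start of the period.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical numbers: 200 customers at the start of the month, 15 of whom left.
rate = churn_rate(customers_at_start=200, customers_lost=15)
print(f"{rate:.1f}%")  # prints 7.5%
```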

https://github.com/nmelgar/lego_my_data

Data visualization project to sell LEGO bulks.

csv data-analysis data-visualization data-viz google-sheets tableau

Last synced: 08 Jan 2026

https://github.com/kittonn/data-analysis-freecodecamp

freeCodeCamp data analysis projects.

data-analysis freecodecamp

Last synced: 05 Apr 2025

https://github.com/anthonytlei/graphsql

Lightweight SQL-to-GraphQL connector for querying GraphQL endpoints using SQL syntax.

connector data-analysis dbapi graphql graphsql python sql sqlalchemy superset

Last synced: 09 Apr 2026

https://github.com/mxagar/data_science_udacity

My personal notes, code and projects of the Udacity Data Science Nanodegree.

dashboard data-analysis data-engineering data-science machine-learning-pipelines

Last synced: 09 Apr 2025

https://github.com/prady2309/car-price-prediction

Multiple Linear Regression Project

data-analysis data-science machine-learning python

Last synced: 20 May 2026

https://github.com/pkjjoshi/behind-the-menu-uncovering-insights-from-restaurant-data

Discover hidden patterns in dining data — from popular cuisine pairings to geographic restaurant clusters

data-analysis data-visualization insights jupyter-notebook pandas python restaurant-data

Last synced: 05 Jul 2025

https://github.com/aravindnathan02/bi-projects

Data Analysis and Visualization projects involving only BI tools (Power BI, Tableau, MS Excel).

data-analysis data-visualisation ms-excel powerbi tableau

Last synced: 08 Jan 2026

https://github.com/joaquinmoron/airbnb-eda-python

Airbnb EDA: cleaning, exploration, and visualization in Python (pandas, matplotlib, seaborn).

airbnb data-analysis eda matplotlib pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/mchirico/go_slicestore

Pull Data from Slice Store

data-analysis go ibm

Last synced: 16 Mar 2025

https://github.com/marianamartiyns/inep-educationperfomance

Data collection, processing, exploratory analysis, and predictive modeling of school performance rates using datasets from INEP.

data-analysis data-cleaning data-science inep predictive-modeling pyhton web-scraping

Last synced: 16 Mar 2025

https://github.com/marianamartiyns/rfm-cluster-analysis

Customer behavior and sales analysis, including data cleaning, RFM calculation, churn analysis and customer clustering.

cluster-analysis data-analysis data-cleaning data-visualization pyhton

Last synced: 16 Mar 2025

https://github.com/marina-gal/elderly-care-ranking

Data analysis and scoring model for elderly care homes, including data cleaning, transformation, 0–100 scoring, and ranking across multiple quality dimensions.

data-analysis excel ranking

Last synced: 30 May 2026

https://github.com/luminati-io/Walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 09 Apr 2025

https://github.com/luminati-io/Shopee-dataset-samples

A sample dataset of over 1000 Shopee products, extracted using the Bright Data API, ideal for pricing optimization, gap analysis, and market strategy refinement.

api data-analysis data-mining datasets products shopee web-scraping

Last synced: 09 Apr 2025

https://github.com/chaganti-reddy/weather-prediction-australia

Creating a fully-automated system that can use today's weather data for a given location to predict whether it will rain at the location tomorrow.

data-analysis logistic-regression machine-learning prediction-model python3

Last synced: 13 Apr 2026

https://github.com/deliprofesor/virtual-reality-in-education-impact-analysis-and-insights

This project examines the impact of Virtual Reality (VR) on education, focusing on its effects on student engagement, learning outcomes, and creativity. It uses data analysis techniques like descriptive statistics, correlation analysis, and clustering to assess VR's effectiveness in enhancing learning.

clustering data data-analysis data-science data-visualization exploratory-data-analysis hypothesis-testing machine-learning python regression-analysis virtual-reality

Last synced: 14 Jun 2025

https://github.com/khushi-sabarad/8-week-sql-challenge

Case studies' solutions for the #8WeekSQLChallenge by Danny Ma

8weeksqlchallenge case-study data-analysis mysql sql

Last synced: 06 Sep 2025

https://github.com/saifalibaig/covid-19-infection-rate-analysis-using-python

Analysis of the Covid-19 infection rate and the World Happiness Report to identify whether there is any relationship between infection rate and happiness.

data-analysis data-visualization jupyter-notebook numpy pandas python3 sns

Last synced: 18 Apr 2026

https://github.com/shoebjoarder/superstore

A Dash app to analyze the Superstore dataset.

dashboard data-analysis data-visualization python-3

Last synced: 02 Apr 2025

https://github.com/shellynagar27/good-cabs-data-analysis-project

This project is part of CodeBasics Challenge #13, where the goal was to provide actionable insights to the Chief of Operations at Goodcabs, a cab service provider in tier-2 cities of India. The project focused on analyzing key metrics like trip volume, repeat passenger rate, and passenger satisfaction.

critical-thinking data-analysis data-visualization excel exploratory-data-analysis power-bi presentation problem-solving sql storytelling

Last synced: 25 Jan 2026

https://github.com/tj2904/lfb-callout-analysis

An investigation into London Fire Brigade's callout data.

data-analysis decsion-tree kmeans lfb-incidents london-fire-brigade pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/shellynagar27/business-insights-360-project

A comprehensive dashboard that provides a better understanding of the business's market standing, key focus areas for optimization, underperforming customers, and year-wise financial insights, aiding inventory planning and performance tracking. It can also be used to answer any number of "why" questions as situations arise.

dashboard data-analysis data-visualization dax-languague dax-studio excel performance-optimization power-bi reporting sql storage-manager

Last synced: 27 Jan 2026

https://github.com/shellynagar27/candy-market-share-analysis

Candy Market Share Analysis explores confectionery sales data using Power BI, Python, and Power Query. It uncovers key market trends, top-selling candies, manufacturer performance, and packaging preferences to support data-driven decision-making for industry researchers.

critical-thinking data-analysis data-visualization exploratory-data-analysis powerbi powerquery problem-solving sales-analysis

Last synced: 03 Feb 2026

https://github.com/alanjamlu34/bike-dataset

This is the final project for the Dicoding class "Menjadi Data Analist" (Becoming a Data Analyst).

data-analysis streamlit-dashboard

Last synced: 19 Oct 2025

https://github.com/ndiplacide7/r-project

Explore diverse data analysis techniques using R programming combined with advanced machine learning algorithms to uncover insights and create powerful predictive models.

data-analysis data-visualization machine-learning-algorithms r

Last synced: 25 Mar 2025

https://github.com/ravi-prakash1907/covid-19-china

A data-science research work to understand the growth rate of the novel Coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/wadeChriestenson/Main_Application

A Django application to host my personal resume.

data-analysis data-visualization django plotly python ui-design

Last synced: 11 Mar 2025

https://github.com/paul0vinicius/ad2

Repository for the Análise de Dados 2 (Data Analysis II) course.

data-analysis data-science

Last synced: 08 Jan 2026

https://github.com/shibbir24/a-data-driven-approach-to-food-security-and-supermarket-accessibility

A Data-Driven Approach to Food Security and Supermarket Accessibility

data-analysis matplotlib numpy pandas python3 seaborn

Last synced: 13 Apr 2026

https://github.com/pawlo77/smarty

End-to-End Data Science tool

data-analysis data-processing pandas pipeline

Last synced: 08 May 2026

https://github.com/krzysikd/uber_fare_prediction

Predicting Uber fares using advanced machine learning models and feature engineering techniques

data-analysis data-processing eda hyperparameter-tuning jupyter machine-learning regression-models

Last synced: 02 Apr 2025

https://github.com/dbriane208/python-for-data-science

Machine Learning and Data Science repository. Love crafting Machine Learning models.

data-analysis data-science data-visualization machine-learning numpy pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/rishitabansal9/adult-census-income-prediction

A project for data analysis and income prediction using a random forest classifier, with 91% accuracy.

data data-analysis data-science feature-engineering random-forest-classifier

Last synced: 25 Mar 2025

https://github.com/diligencefrozen/dcinside-data

Analyzing the Dcinside Frozen Gallery Dataset. #디시

data-analysis dataset

Last synced: 30 May 2026

https://github.com/meokullu/prefill

PreFill adds desired characters onto output values to increase their legibility.

alignment data data-analysis data-engineering data-science legibility

Last synced: 17 Jan 2026

https://github.com/1401dev/customer-lifetime-value-prediction

A data science project leveraging Python and Scikit-Learn to build predictive models that estimate customer lifetime value (CLV). Includes data cleaning, feature engineering, and model selection to identify key drivers of CLV, supporting strategic decision-making in customer retention and marketing.

clv clv-analysis customer-retention data-analysis dataprocessing feature-engineering machine-learning marketing-analytics predictive-modeling python regression-analysis scikit-learn

Last synced: 06 May 2026

https://github.com/tatilimongi/first_python_project

This repository contains a case study of spreadsheet automation in Python for analyzing car sales by manufacturer over the years.

data-analysis email-sending file-manipulation graphical-visualization spreadsheet-automation

Last synced: 26 Mar 2025

https://github.com/fer-aguirre/covid19-venezuela

Data analysis of covid-19 deaths in Venezuela

covid-19 data-analysis dataviz line-chart

Last synced: 09 Apr 2025

https://github.com/fbarffmann/vba-challenge

Built an Excel VBA script to automate stock market analysis across multiple years. Programmatically calculated and visualized key financial metrics, reducing manual reporting time and improving data accuracy.

automation data-analysis excel excel-vba financial-analysis reporting stock-market vba

Last synced: 04 Feb 2026

https://github.com/vatshayan/pokemon-analysis

Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning

artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn

Last synced: 30 May 2026

https://github.com/sadia-khan13/data-preprocessing

Welcome to the Data Preprocessing repository! This repository showcases comprehensive resources and implementations related to data preprocessing using Python and Jupyter Notebook.

artificial-intelligence data-analysis data-mining data-preprocessing data-science jupyter-notebook matplotlib numpy pandas python seaborn-python sklearn

Last synced: 11 Apr 2026

https://github.com/isaacmaffeis/imad-2023

Model Identification and Data Analysis (IMAD) | University course

data data-analysis data-science model model-identification

Last synced: 09 May 2026

https://github.com/jkaardal/csvnav

A memory-efficient Python class for navigating large CSV/text files.

csv data-analysis data-science machine-learning memory-management

Last synced: 14 Jan 2026

https://github.com/marielachirinosr/pandas-weather-project

Pandas Weather Data. Explore straightforward Python scripts for weather information analysis.

data-analysis pandas python

Last synced: 29 Apr 2026

https://github.com/abhisek-13/whatsapp-chat-analyzer

The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.

data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026

https://github.com/fbarffmann/car_price_prediction

Predicted used car prices with a Random Forest model (R² = 0.96) using Python. Analyzed 2,000+ listings and visualized trends with Tableau.

car-price-prediction data-analysis machine-learning pandas python random-forest regression sklearn tableau

Last synced: 13 Apr 2026

https://github.com/marina-gal/sql-business-questions

A collection of SQL queries designed to strengthen analytical problem-solving skills using the AdventureWorks2019 sample database. Tested and optimized in SQL Server Management Studio (SSMS).

adventureworks data-analysis data-analyst interview-preparation learning microsoft-sql-server practice sql sql-queries

Last synced: 30 May 2026

https://github.com/jakubteichman/bullbozer_price_prediction_ml_project

A bulldozer price estimator built from a Kaggle competition dataset

data-analysis data-science estimation machine-learning prediction

Last synced: 06 Sep 2025

https://github.com/wsu-carbon-lab/ezfit

Fitting in Python made dead simple

data-analysis experimental-physics fitting pandas-accessor

Last synced: 14 Jun 2025

https://github.com/hyperentangledqubit/shellplot

shellplot -- Generate plot(s) directly from terminal via matplotlib or ggplot2 (plotnine)!

data-analysis ggplot2 graphics matplotlib plotnine plotting pyplot terminal

Last synced: 10 May 2026

https://github.com/1adityakadam/carnegie_classifications_website

A comprehensive data analytics platform analyzing 50+ years of U.S. higher education trends through interactive visualizations and historical institution tracking.

css data-analysis html javascript python ui-design web-development

Last synced: 13 Apr 2026

https://github.com/shruthin4/news-articles-classification

Classifying news articles using machine learning and NLP techniques. Built an end-to-end text classification pipeline using TF-IDF vectorization and models like Logistic Regression and SVM. Includes exploratory data analysis, model evaluation, and deployment-ready artifacts.

data-analysis data-science logistic-regression machine-learning model news-classification nlp python scikit-learn svm tf-idf-vectorization

Last synced: 13 Apr 2026

https://github.com/grandechowhiskey/fcc-data_analysis-projects

A collection of projects completed as part of the FreeCodeCamp "Data Analysis with Python" certification. These projects cover statistical calculations, data visualization, and trend analysis using real-world datasets.

data-analysis data-visualization matplotlib pandas python3 scikit-learn seaborn

Last synced: 01 May 2026

https://github.com/balajimohan18/milk-production-time-series-forecasting-datascience-project

This project uses time series forecasting to predict future milk production. The data used in this project is monthly milk production data from January 1962 to December 1975. The ARIMA (autoregressive integrated moving average) model is used to forecast milk production. The model is evaluated using various metrics.

acf adf data-analysis data-cleaning data-science data-visualization eda exploratory-data-analysis machine-learning pacf seasonality time-series trends

Last synced: 30 May 2026

https://github.com/srinibas-masanta/electric-vehicle-analysis-dashboard

This repository features an interactive Tableau dashboard that visualizes electric vehicle (EV) adoption trends in the U.S. 🚗⚡ Explore EV growth, top manufacturers, regional distribution, and the impact of incentives—all in one dynamic view. 📊 Use filters to dive deeper into the data and uncover key insights! 🚀

dashboards data-analysis data-visualization tableau

Last synced: 15 Jan 2026

https://github.com/srinibas-masanta/olympics-data-analysis

The Olympics Analysis project explores Olympic data to uncover trends in athlete performance, medal distribution, and participation across countries and demographics. By leveraging detailed datasets, it provides insights into the evolution of the Games, highlighting key patterns and disparities over time.

data-analysis data-science data-visualization olympics olympics-visualization

Last synced: 02 Apr 2025

https://github.com/srinibas-masanta/zomato-customer-and-restaurant-analysis

This repository contains a comprehensive analysis of Zomato's platform, focusing on various aspects of customer behavior, restaurant performance, and market trends. The analysis leverages data-driven insights to answer key questions that can guide business strategies, enhance customer satisfaction, and optimize operational efficiency.

business-analytics data-analysis data-science data-visualization

Last synced: 02 Apr 2025

https://github.com/beyzabasarir/brazilian-e-commerce-analysis

Brazilian E-Commerce Dataset By Olist PostgreSQL Analysis

data-analysis data-visualization sql

Last synced: 08 Jan 2026

https://github.com/deypadma2020/sql_project

✏️ A collection of practical SQL case studies and solutions exploring real-world business scenarios: car showroom analysis, esports tournament, customer insights, finance analysis, pricing strategy, and marketing analytics.

business-intelligence case-study data-analysis database mysql queries sql

Last synced: 30 May 2026

https://github.com/deypadma2020/dataanalysis-mlalgo

Practice repository for data analysis, feature engineering, statistics, web scraping, and building ML model pipelines in Python.

data-analysis eda feature-engineering machine-learning-algorithms ml-pipeline statistics web-scraping

Last synced: 30 May 2026

https://github.com/abhijeet107/final-project

Final project summation INTERNSHIP PROJECTS (2 WEEKS)

data-analysis data-cleaning-and-preprocessing excel mysql-database python tableau-public

Last synced: 23 Feb 2026

https://github.com/codeslash21/wrangle-twitter-archive

Wrangling the WeRateDog Twitter archive. WeRateDog has 8M followers and rates dogs with funny comments and a unique rating system. The project also uses a dog-breed classifier to predict the breed of the dogs in the tweets.

data-analysis data-wrangling neural-networkt twitter-api twitter-archive

Last synced: 10 Apr 2025