An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/badranalyst/student-tests-data-analysis-application

Python-based analysis of student test scores in math, reading, and writing, examining correlations with parental education, lunch type, and test preparation. Includes data cleaning, visualization, and statistical insights into factors influencing academic performance.

data-analysis data-visualization dataset matplotlib numpy pandas python sklearn

Last synced: 05 May 2026

https://github.com/yandexdataschool/ml-sweights-experiments

Experiments for the "Machine Learning on data with sPlot background subtraction" paper

data-analysis high-energy-physics machine-learning statistics

Last synced: 15 May 2025

https://github.com/andersoncrs/analisis_exploratorio_de_datos-eda-_rendimiento_estudiantil

Este análisis exploratorio de datos (EDA) realizado sobre el conjunto de datos de rendimiento estudiantil tiene como objetivo identificar y comprender los factores que influyen en el desempeño académico de los estudiantes. A través de la limpieza, transformación y visualización de datos, se busca descubrir patrones y relaciones significatvas.

data-analysis data-exploration data-exploration-and-preprocessing data-visualization seaborn

Last synced: 30 Mar 2025

https://github.com/andersoncrs/arboles_de_decision_calidad_del_vino

Contiene un análisis detallado de la calidad del vino utilizando un modelo de clasificación basado en árboles de decisión. Incluye la exploración de datos, detección y manejo de valores atípicos, análisis Univariado y Bivariado, y la creación y evaluación de un modelo predictivo. El objetivo principal es predecir la calidad del vino.

data-analysis data-science data-visualization machine-learning matplotlib seaborn sklearn tree-decision

Last synced: 20 May 2026

https://github.com/malucor/livros

Programa em Python para fazer uma análise de dados sobre livros, a partir de um arquivo Excel.

analise-de-dados book books bookshelf data-analysis ipynb jupyter-notebook livro livros python

Last synced: 16 May 2026

https://github.com/gaaniruddha/mphil

This repository contains a copy of my final MPhil presentation and panel report.

data-analysis gpu-imager radio-astronomy

Last synced: 03 Mar 2026

https://github.com/rita94105/ethereum-fraud-detection

This project focuses on detecting fraudulent transactions in the Ethereum network using both traditional machine learning models and deep learning techniques. By analyzing transaction attributes and interaction patterns, we aim to develop an effective fraud detection model.

data-analysis deep-learning ethereum fraud-detection machine-learning

Last synced: 01 May 2026

https://github.com/jatin-s16/hr_mysql_powerbi

This repository contains raw HR data along with key business questions. I performed data cleaning using MySQL queries and wrote analytical queries to extract meaningful insights. The results were then visualised using Power BI to enhance business understanding.

data-analysis data-science data-visualization mysql powerbi

Last synced: 29 May 2026

https://github.com/jedrzej-wydra/data-analysis-pro

Professional Data Analyst Exam by DataCamp

data-analysis datacamp r

Last synced: 23 Mar 2025

https://github.com/mohit01chugh/edu_sql_analysis

SQL queries used to analyze student data.

data-analysis database education plpgsql postgresql sql

Last synced: 17 May 2026

https://github.com/khushi-sabarad/adinsights_dashboard

AdInsights Dashboard: An interactive web dashboard built with Python (Flask, Pandas, Plotly) to visualize and analyze digital advertising performance. Allows filtering by gender, ad type, and location for detailed insights

ad-performance advertising dashboard data-analysis data-visualization flask pandas plotly python web-application

Last synced: 01 May 2026

https://github.com/takshshah-16/spotify_eda

Spotify data analytics and advanced querying

data-analysis eda pgadmin4 postgresql

Last synced: 30 Oct 2025

https://github.com/vedantshi/coffee-sales-dashboard

This project analyzes coffee sales data using Excel, featuring data cleaning, trend analysis, and an interactive dashboard. Key insights highlight top-performing products, regional sales trends, and seasonal patterns. Recommendations focus on marketing strategies and inventory optimization. Future plans include Power BI integration for visuals.

business-insights data-analysis data-visualization excel-dashboard pivot-tables sales-trends

Last synced: 05 Jan 2026

https://github.com/benami171/ml_knn_decision-trees

A ml implementation comparing Decision Trees and k-Nearest Neighbors (k-NN) algorithms for Iris flower classification. Features comprehensive analysis of different approaches including brute-force and entropy-based decision trees, along with k-NN using multiple distance metrics.

classification cross-validation data-analysis decision-trees iris-dataset k-nearest-neighbours machine-learning nearest-neighbors python

Last synced: 30 Jun 2025

https://github.com/mahmoudwal27/powerbi-projects-for-data-analysis

This project leverages Power BI for data visualization, DAX for custom calculations, and integrates SQL and Excel for data preprocessing, analysis, and reporting, enabling dynamic and interactive insights.

data-analysis data-analysis-project data-analytics-project project

Last synced: 07 Mar 2026

https://github.com/first-coding/smart_analysis

Smart Analysis is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow

data-analysis llm openai prompt-engineering python

Last synced: 08 Mar 2025

https://github.com/andrii-zapukhlyi/otomoto_visualization

Scraping, data visualization, and building a price prediction model with data from the car classifieds website otomoto.pl

data-analysis machine-learning r scraping statistics visualization

Last synced: 26 Jul 2025

https://github.com/upes-open/open-cryptocurrency-analysis

A web app to visualise and predict the cryptocurrency’s impact by using Web scraping, data exploration, EDA and Data Visualization.

analysis cryptocurrency data-analysis data-science data-visualization jupyter-notebook streamlite visualization

Last synced: 15 Apr 2025

https://github.com/chitranjan806/greyatom_learning_repo

A Collection of Projects, Tasks and Challenges as part of Data Science Masters - Transition Program at GreyAtom.

data-analysis data-science greyatom python3

Last synced: 29 Jun 2026

https://github.com/chokzb/covid19_vaccination_analysis

An EDA project examining global COVID-19 vaccination progress. The notebook investigates vaccination trends by country, daily vaccination rates, timeline patterns, and dose distribution. The project includes visualisations created with Matplotlib, Seaborn, and Plotly.

covid-19 data-analysis data-visualization jupyter-notebook matplotlib numpy pandas plotly python seaborn vaccination

Last synced: 07 May 2026

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 02 Jan 2026

https://github.com/danhnnguyen0606/bitcoin-navigator

Bitcoin Navigator: A data-driven dashboard designed to analyze Bitcoin trends, empowering investors to refine their strategies and identify optimal investment opportunities.

bitcoin btc crypto cryptocurrency data-analysis data-analytics data-science data-visualization investment looker looker-studio

Last synced: 15 Mar 2025

https://github.com/mizzy/tweetduck

Twitter Archive to DuckDB Importer - Extract and import Twitter archive data (2025 format) into DuckDB for analysis

archive cli data-analysis duckdb golang twitter

Last synced: 28 Jun 2026

https://github.com/hanzopgp/lolanalysis

League Of Legends game data engineering, analysis, visualization and machine learning. Business intelligence project.

data-analysis data-cleaning data-engineering data-visualization dataiku deep-learning etl machine-learning scraping university

Last synced: 27 May 2026

https://github.com/syarwinaaa09/hypothesis-testing-with-mens-and-womens-soccer-matches

a data-driven exploration of international men's and women's football (soccer) match results using Python

data-analysis data-visualization football jupyter-notebook men-vs-women pandas python soccer sports-analytics visualization

Last synced: 05 May 2026

https://github.com/dacrol/filterdataset

Filters a dataset based on attributes

data-analysis dataset deep-learning machine-learning python python3

Last synced: 25 Jul 2025

https://github.com/masum184e/exploratory_data_analysis_projects

This space to showcase my journey in exploring various datasets, uncovering patterns, and extracting meaningful insights. Each project highlights different aspects of EDA, demonstrating techniques and tools that are essential for making sense of data.

data-analysis data-analysis-projects data-science data-science-projects eda eda-projects exploratory-data-analysis exploratory-data-analysis-projects

Last synced: 31 Mar 2025

https://github.com/sahilmb/social-media-data-analysis

A social media platform chat analysis system built using python for root level analysis of huge blocks of text

analytics data-analysis python3 streamlit

Last synced: 31 Mar 2025

https://github.com/matheusafonseca/c111

Este repositório é dedicado ao armazenamento e organização dos códigos desenvolvidos na disciplina C111 - Análise de Dados, oferecida pelo Instituto Nacional de Telecomunicações (INATEL).

data-analysis matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/totonga/ods-exd-api-box

Helper package to build ASAM ODS EXD API grpc plugins.

asam data-analysis grpc grpc-server ods plugin python

Last synced: 03 Feb 2026

https://github.com/jiyanshgarg/delhivery-logistics-data-analysis

This project analyzes Delhivery's logistics delivery dataset to understand delivery performance, route efficiency, and operational patterns using data analytics techniques. The analysis focuses on transforming raw segment-level logistics data into meaningful trip-level insights that can help improve delivery efficiency and route planning.

business-insights-and-recommendations data-analysis data-cleaning-and-preprocessing data-visualization exploratory-data-analysis feature-engineering feature-extraction feature-selection hypothesis-testing outlier-detection outlier-treatment

Last synced: 12 Jun 2026

https://github.com/farhannirzhor/vrinda_store_excel_project

This project is about excel analysis and visualization. In this project, I analyzed Vrinda Store's sales and made an annual sales report

data-analysis data-cleaning data-preprocessing data-visualization microsoft-excel reporting

Last synced: 05 Jan 2026

https://github.com/amlanmohanty1/genai-data-analysis-report-generator

Generating data analysis and EDA reports from CSV files using Generative AI - Langchain, Llama, Groq.

ai data-analysis data-science flask generative-ai groq langchain llama3 llm prompt-engineering python

Last synced: 28 Jan 2026

https://github.com/eslamdyab21/data-visualization-using-matplotlib-and-seaborn

This is the last project in the nanodegree udacity program. it's about data visualization.

data data-analysis data-visualization matplotlib pandas python seaborn udacity udacity-data-analyst-nanodegree

Last synced: 09 May 2026

https://github.com/vyjayanthipolapragada/marketing_statistical_analysis

Statistical analysis of customer data and their impact on the sales of products based on marketing campaigns

customer-data data-analysis dataframes marketing matplotlib numpy pandas python seaborn statistical-analysis

Last synced: 11 Apr 2026

https://github.com/tolumie/exploratory-data-analytics-projects

Exploratory Data Analytics – A collection of projects covering data exploration, feature engineering, hypothesis testing, and predictive modeling across diverse datasets, including insurance, real estate, laptops, cars, COVID-19, and the Olympics.

data-analysis data-visualization data-wrangling exploratory-data-analysis-eda feature-engineering hypothesis-testing machine-learning matplotlib numpy pandas predictive-modeling python seaborn statistical-analysis

Last synced: 11 Apr 2026

https://github.com/vanshuchaudhary/retail-sale

project uses MySQL to analyze retail sales data, focusing on customer behavior, sales trends, and product performance. The dataset includes transactions, customer demographics, and purchase details, helping businesses optimize strategies. Key Insights: 📊 Revenue Analysis – Total sales, top-spending customers 📅 Sales Trends

business-intelligence customer-behavior customer-behavior-analysis data-analysis mysql predictive-analytics retail-analytics sales-analysis sql-queries

Last synced: 23 Mar 2025

https://github.com/thenorthkun/movies-dataset-analysis

Analysis & categorizing of Movies based on Actors, Genres, Gross covered etc 🦸🏼🧜🏼‍♀️🎧

data-analysis data-visualization filtering

Last synced: 23 Mar 2025

https://github.com/chardyb/prob-and-stats-bmi6106

A repository for Spring 2025 BMI 6106: Statistics and Probability. This repository contains coursework, code examples, and projects exploring statistical methods and probabilistic models in biomedical informatics.

biomedical-informatics data-analysis data-science probability r statistical-modeling

Last synced: 02 Sep 2025

https://github.com/nimomach/skateboarding-in-olympics

Skateboarding made its debut in Olympics at the 2020 Summer Olympics. This is a dashboard focused on "Skateboarding in the Olympics" representing a comprehensive overview of the sport's performance, popularity, and key metrics during the Olympic Games.

data-analysis data-visualization olympics paris skateboarding tokyo

Last synced: 10 Mar 2026

https://github.com/jbalooshie/movies-etl

Exercise working with movie datasets from Kaggle and Wikipedia. Python is used to extract, clean, and combine the data, and then it is loaded into a postgreSQL database.

data-analysis data-science jupyter-notebook numpy pandas postgresql postgresql-database python sqlalchemy

Last synced: 11 Apr 2026

https://github.com/shellynagar27/mobile-sales-analysis

Analyzed 2024 mobile sales data to uncover product trends, customer behavior, and regional insights using Power BI dashboards and structured data modeling.

cleaning-data data-analysis data-visualization dax eda figma modelling powerbi powerquery storytelling wireframe

Last synced: 16 May 2025

https://github.com/geoninja/reddit_data_analysis

Data analysis application presented at the 2016 NTC (Non-profit Technology Conference) in San Jose, CA.

data-analysis python reddit-data-analysis text-analysis

Last synced: 03 May 2026

https://github.com/swapnil-jain/tailored-tomes

Web application which shows Top 50 books of all time & recommends similar books if a book name is provided.

book bookrecommendsystem books bootstrap3 cosine-similarity data-analysis html machine-learning python

Last synced: 20 Jan 2026

https://github.com/anderson-andre-p/exploratory-data-analysis.roller-coaster

This repository contains an exploratory data analysis (EDA) project focused on roller coasters. The project involved organizing, cleaning, and visualizing the data to gain insights into roller coasters' characteristics and performance.

data-analysis eda exploratory-data-analysis exploratory-data-visualizations notebook

Last synced: 15 Mar 2025

https://github.com/aya-jafar/python

Practice files & exercises during the journey of Python leaning 🐍

data-analysis dl exercises ml

Last synced: 16 May 2025

https://github.com/agrdatasci/climmob-analysis

Workflow for data analysis applied on ClimMob.net

citizen-science data-analysis workflow

Last synced: 24 Jun 2025

https://github.com/trim0500/fe-stats-classifier

An experiment to create a machine learning model via PyTorch to classify select Fire Emblem unit base stat distributions.

creational-patterns data-analysis data-science data-visualization design-patterns excel jupyter jupyter-notebook matplotlib-pyplot numpy pandas python python-modules python3 pytorch singleton

Last synced: 11 Apr 2026

https://github.com/bagusperdanay7/fcc-da-mean-variance-standard-deviation-calculator

One of Data Analysis with Python (freecodecamp) task, created a Mean Variance Standard Deviation Calculator.

data-analysis freecodecamp-project numpy python

Last synced: 06 May 2026

https://github.com/deeksha-dhawan/pizza-outlet-analysis-using-sql

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

coding-challenge data-analysis data-science microsoft my portfolio-project programming project projects sql sql-analysis sql-project sqlproject sqlserver

Last synced: 23 Mar 2025

https://github.com/fatihilhan42/eda-spacex-launches-falcon9-and-falcon-heavy

In this project, we analyze the space flight data of Spacex space research company Falcon 9 rocket.

data-analysis data-science data-visualization eda elonmusk spacex

Last synced: 23 Mar 2025

https://github.com/wittyicon29/kritika-iit-b-2023

Seletcion task for the summer projects of Kritika IIT-B

data data-analysis data-science

Last synced: 15 Mar 2025

https://github.com/BAMresearch/Utah-SAXS-Tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 16 Jan 2026

https://github.com/nikhil-donthusaram/heartdiseaseprediction

Heart Disease Prediction App is a machine learning web application that predicts the likelihood of heart disease based on user medical inputs. Built using a Decision Tree Classifier and deployed with Streamlit for an interactive, user-friendly interface.

data-analysis descision-tree joblib jupyter-notebook machine-learning matplotlib numpy pandas python3 seaborn sklearn streamlit vscode

Last synced: 11 Apr 2026

https://github.com/loginchik/mid_contracts

Анализ контрактов государственных закупок МИДа РФ

data-analysis dataset pandas python

Last synced: 17 Apr 2025

https://github.com/ejw-data/pandas-school

Analysis of school data with Pandas

data-analysis pandas python

Last synced: 08 May 2026

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/annnieglez/nlp-stock-market-and-news

This project focuses on detecting fake news from news headlines using advanced Natural Language Processing (NLP) techniques. It combines sentiment analysis with news headlines embeddings, generated from Hugging Face transformer models, to train a binary classification model that distinguishes between real and fake news.

classification-model data-analysis embeddings machine-learning machine-learning-models nlp nlp-deep-learning nlp-machine-learning python scraping-websites sentiment-analysis

Last synced: 25 Apr 2026

https://github.com/abdullahashfaqvirk/powerbi-dashboards

A collection of Microsoft Power BI dashboards and reports designed to address business challenges and support data driven decision-making.

dashboards data-analysis data-driven data-science microsoft powerbi reports visualization

Last synced: 10 Mar 2026

https://github.com/nurulashraf/customer-segmentation-hierarchical-clustering

A customer segmentation project using hierarchical clustering to group customers based on their spending behaviour and demographics. This helps businesses identify patterns and create targeted marketing strategies.

business-analytics clustering-algorithm customer-segmentation data-analysis hierarchical-clustering machine-learning python unsupervised-learning

Last synced: 18 Apr 2025

https://github.com/devexpress-examples/wpf-pivot-grid-define-custom-cell-template-to-performing-data-editing

This example shows how to edit a cell with the cell editing template in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 02 May 2026

https://github.com/devexpress-examples/wpf-pivot-grid-connect-to-an-olap-datasource

This example shows how to specify connection settings to the server and create fields that relate to specific measures and dimensions of the cube for the Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf xpf

Last synced: 06 May 2026

https://github.com/ginalamp/covid_dashboard_twitternews

Corona Dashboard & report based on Twitter media outlet news.

dashboard data-analysis data-visualization twitter

Last synced: 28 Jan 2026

https://github.com/sasanthns/sql_data_warehouse_project

A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.

data data-analysis data-science data-warehouse datacleaning etl etlpipeline sql sqlserver

Last synced: 24 Mar 2025

https://github.com/lit26/novel-corona-virus-2019

Data Analysis for Novel Corona Virus 2019

analysis coronavirus-case data-analysis sir-model

Last synced: 10 Jun 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/lfariello/atmospheric_reentry

Matlab code for the determination of the reentry trajectory, deceleration profiles, and heat flux of the ARD capsule during orbital reentry into Earth's atmosphere.

data-analysis heat-flux-prediction heat-transfer hypersonic hypersonic-capsule matlab-programming trajectory-prediction

Last synced: 23 Mar 2025

https://github.com/tolumie/rfm-marketing-analysis

This project focuses on RFM (Recency, Frequency, and Monetary) Analysis, a powerful customer segmentation technique used in marketing and business analytics. The analysis helps businesses identify their most valuable customers, potential loyalists, at-risk customers, and churned users.

business-analytics customer-behavior-analysis customer-loyalty customer-retention customer-segmentation-analysis data-analysis data-driven-decisions ecommerce marketing-analytics python

Last synced: 18 May 2026

https://github.com/manel15279/datamining-project

A university project that aims to explore various data mining techniques like Data Exploration, Association Rule Mining, Supervised and Unsupervised Learning, applied to real-world datasets, focusing on soil fertility analysis and COVID-19 cases evolution over time.

covid-19 data-analysis data-mining data-visualization datascience gradio machine-learning python soil-properties

Last synced: 10 Jun 2025

https://github.com/b-varun-reddy/fairwai-bias-detection

Submission for the FairwAI Hospitality Intern Challenge. This project analyzes bias signals in Yelp hospitality reviews using open-source data, Python, and fairness-focused keyword detection.

bias-detection data-analysis ethical-ai fairness hospitality machine-learning natural-language-processing python social-impact yelp-dataset

Last synced: 19 Apr 2025

https://github.com/gabrieladados/analise-ecommerce

Análise SQL para E-commerce: Estratégias de Crescimento para Impulsionar Vendas

bigquery data-analysis ecommerce sql

Last synced: 31 Mar 2025

https://github.com/cescedes/this-is-jeopardy

Writing several functions that investigate a dataset of Jeopardy! questions and answers.

codecademy data-analysis python

Last synced: 11 Apr 2026

https://github.com/shiva16/da

Data Analytics - Study materials

analytics data-analysis data-science data-structures

Last synced: 07 Feb 2026

https://github.com/vbhvsingh0/matplotlib__egs

The codes here are examples of Matplotlib

data-analysis matplotlib-pyplot numpy-library pandas-python python3

Last synced: 28 May 2026

https://github.com/anilyigitsel/tourist-attraction-data-analysis

This project analyzes tourism trends from 2017 to 2021, focusing on visitor numbers, ratings, and attraction popularity during these years.

data-analysis data-visualization excel sql tourism

Last synced: 26 Jan 2026

https://github.com/bocchio01/skyward_recruitment_assignment

Assignment to join the PoliMi SkyWard software team

data-analysis kalman-filter model-rocket

Last synced: 15 Mar 2025

https://github.com/junpenglao/spafv

SPAFV - Surface Profile Analysis for Free Viewing eye movement experiment in 2AFC task

data-analysis statistics temporal-logic

Last synced: 31 Mar 2025

https://github.com/nahiyanhkhan/sales-insight-dashboard_powerbi

Build a dashboard to display the sales insights of a company's sales data over the 4 years period. It includes displaying revenue, sales quantity in different regions over the years.

dashboard data-analysis data-analytics data-visualization powerbi salesdashboard

Last synced: 08 Jan 2026

https://github.com/thenazar9/user-behavior-email-campaign-analysis-sql

Analysis of user behavior and email campaign performance using BigQuery and Looker Studio, focusing on account creation trends, email engagement, and user segmentation.

analytics bigquery data-analysis data-visualization etl looker-studio sql structured-query-language

Last synced: 16 Oct 2025

https://github.com/0-mostafa-rezaee-0/sandwich_structures

Impact test of Sandwich Structures

composite-materials data-analysis r

Last synced: 09 Aug 2025

https://github.com/divyanshugit/indian-judiciary-analysis

Analysis of Indian district court data across states.

classification data-analysis

Last synced: 02 Jul 2025