An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/shiva16/da

Data Analytics - Study materials

analytics data-analysis data-science data-structures

Last synced: 07 Feb 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/cyberoctane29/epa-carbon-monoxide-aqi-analysis

This project continues my EPA Air Quality AQI Analysis, focusing on carbon monoxide levels in EPA data. Using Python, I applied statistics, probability analysis, outlier detection, sampling, and hypothesis testing to assess pollution and health impacts. Leveraging Pandas, NumPy, SciPy, and Matplotlib, it supports environmental policy decisions.

data-analysis eda hypothesis-testing probability-distribution sampling sampling-distribution statistical-analysis

Last synced: 24 Mar 2025

https://github.com/steviecurran/gbt-scripts

IDL scripts for the reduction of Green Bank Telescope data

data-analysis data-compression data-visualization radio-astronomy spectroscopy

Last synced: 31 Jan 2026

https://github.com/gabrieladados/analise-ecommerce

SQL Analysis for E-commerce: Growth Strategies to Boost Sales

bigquery data-analysis ecommerce sql

Last synced: 31 Mar 2025

https://github.com/rita94105/ethereum-fraud-detection

This project focuses on detecting fraudulent transactions in the Ethereum network using both traditional machine learning models and deep learning techniques. By analyzing transaction attributes and interaction patterns, we aim to develop an effective fraud detection model.

data-analysis deep-learning ethereum fraud-detection machine-learning

Last synced: 01 May 2026

https://github.com/alex-pierron/ekip-enedis-genai

Repository for the team "Ekip" during the H-GenAI Hackathon 2025 organized at SIA Partners, Paris, France

amazon-nova artificial-intelligence aws aws-lambda data-analysis database generative-ai mistral nlp

Last synced: 15 Apr 2026

https://github.com/ajmannust41288/data-analyst

Data analyst and Microsoft Professional expert: Power BI Desktop, Tableau, and dashboards, with use of ChatGPT-4 AI

business-analytics data-analysis data-analyst data-analytics eda

Last synced: 01 Feb 2026

https://github.com/emediongfrancis/unified-data-lake-implementation-gcp-kafka-airflow-snowflake

This project demonstrates the integration of data from multiple sources into a unified data lake. The project showcases the use of Apache Airflow for ETL tasks, Google Cloud Storage as a data lake, Apache Kafka for data movement automation, Snowflake for data warehousing, and Google BigQuery for analysis.

airflow data-analysis data-warehousing etl etl-pipeline gcp-storage kafka snowflake value variety

Last synced: 07 Feb 2026

https://github.com/asghar-rizvi/world-energy-consumption-analysis-1965-2023-

An in-depth analysis of global energy consumption trends from 1965 to 2023, using data from various countries and regions.

data-analysis data-analysis-python data-science python real-world-data real-world-data-analysis real-world-problem-solving real-world-project visulaization

Last synced: 15 Apr 2026

https://github.com/atharvapathak/rsvp_movies_case_study

SQL queries performed on IMDb database to provide recommendations to RSVP Movies based on insights.

data-analysis data-cleaning data-science imdb-dataset rsvp-movies sql

Last synced: 28 Jan 2026

https://github.com/rissh/titanicsurvivalpredictionusingml

Predicting Titanic passenger survival through machine learning. This project includes data preprocessing, exploratory data analysis, feature engineering, and model training using Python. 🚢

data data-analysis data-science data-visualization dataanalysis jupiter-notebook machine-learning machine-learning-algorithms machinelearning matplotlib numpy pandas prediction prediction-model python python3 seaborn tenserflow tflearn titanic

Last synced: 01 Feb 2026

https://github.com/rohitdusane/healthcare-analytics

This Power BI dashboard is designed to provide valuable insights into patient waiting times across outpatient and inpatient healthcare services. It offers a comprehensive analysis of key factors influencing wait lists, including Age Profile, Specialty, Time Bands, and Patient Case Types.

data-analysis data-visualization dax dax-query healthcare-analysis powerbi-report

Last synced: 01 Feb 2026

https://github.com/hemangsharma/job-tracker

A comprehensive Streamlit application for tracking and analyzing job applications.

data-analysis python streamlit-dashboard streamlit-webapp

Last synced: 15 Mar 2025

https://github.com/nagar2nd/jenson-usa-mysql-analysis

We are analyzing Jenson USA's dataset to gain valuable insights into customer behavior, staff performance, inventory management, and store operations. By crafting advanced SQL queries, the analysis explores key metrics such as product sales, customer spending, and order patterns, ultimately guiding strategic decision-making and operations.

data-analysis problem-solving sql

Last synced: 01 Feb 2026

https://github.com/vanshuchaudhary/flightpriceanalysis-

The uploaded file is a Jupyter Notebook titled "Flight Analysis". It likely involves analyzing flight-related data, potentially exploring trends, patterns, or insights using data science techniques. The analysis might include data visualization, statistical analysis, or predictive modeling.

business-analytics data data-analysis data-visualization datainsights datascience matplotlib-pyplot python seaborn seaborn-plots seaborn-python sns statistical-analysis

Last synced: 08 May 2026

https://github.com/tolumie/rfm-marketing-analysis

This project focuses on RFM (Recency, Frequency, and Monetary) Analysis, a powerful customer segmentation technique used in marketing and business analytics. The analysis helps businesses identify their most valuable customers, potential loyalists, at-risk customers, and churned users.

business-analytics customer-behavior-analysis customer-loyalty customer-retention customer-segmentation-analysis data-analysis data-driven-decisions ecommerce marketing-analytics python

Last synced: 18 May 2026
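
The RFM entry above names three per-customer metrics: recency, frequency, and monetary value. A minimal sketch of how they can be computed from an order log follows. It is not taken from the listed repository; the function names, the sample orders, and the segment thresholds (`recent_days`, `min_orders`) are hypothetical choices made for illustration.

```python
from datetime import date


def rfm_table(orders, today):
    """Aggregate (customer_id, order_date, amount) rows into RFM metrics."""
    table = {}
    for customer, day, amount in orders:
        rec = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], day)   # most recent purchase date
        rec["frequency"] += 1                 # number of orders
        rec["monetary"] += amount             # total amount spent
    return {
        customer: {
            "recency_days": (today - rec["last"]).days,
            "frequency": rec["frequency"],
            "monetary": rec["monetary"],
        }
        for customer, rec in table.items()
    }


def segment(row, recent_days=30, min_orders=2):
    """Assign a coarse segment; both thresholds are illustrative assumptions."""
    recent = row["recency_days"] <= recent_days
    frequent = row["frequency"] >= min_orders
    if recent and frequent:
        return "loyal"
    if recent:
        return "potential"
    if frequent:
        return "at_risk"
    return "churned"


# Hypothetical sample data.
orders = [
    ("a", date(2024, 1, 5), 40.0),
    ("a", date(2024, 3, 1), 60.0),
    ("b", date(2023, 11, 20), 25.0),
]
rfm = rfm_table(orders, today=date(2024, 3, 10))
segments = {customer: segment(row) for customer, row in rfm.items()}
```

In practice the thresholds are often replaced by quantile-based scores (for example, 1 to 5 per metric); fixed cut-offs are used here only to keep the sketch short.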

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 classes in 9 parts. An advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/lfariello/atmospheric_reentry

Matlab code for the determination of the reentry trajectory, deceleration profiles, and heat flux of the ARD capsule during orbital reentry into Earth's atmosphere.

data-analysis heat-flux-prediction heat-transfer hypersonic hypersonic-capsule matlab-programming trajectory-prediction

Last synced: 23 Mar 2025

https://github.com/shubham200137/customer-churn-analysis

In this case study, we analyze customer churn for a telecom company serving Southern California. The company faces increased competition and wants to retain customers by understanding the reasons for churn. Our objectives include improving service quality, identifying churn factors, pinpointing attractive services, and retaining high LTV customers.

data-analysis data-visualization numpy-python pandas-python sqlite tableau

Last synced: 15 Apr 2026

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/sroman0/data-analytics

Data Analytics Exercises is a collection of comprehensive university-level exercises aimed at enhancing skills in data analytics. The repository includes practical notebooks covering data manipulation, exploratory data analysis (EDA), statistical analysis, data visualization, and machine learning fundamentals.

data-analysis data-analytics data-science data-visualization education exercises exploratory-data-analysis hands-on-practice jupyter-notebook machine-learning python statistics

Last synced: 15 Apr 2026

https://github.com/aonurakman/data-analysis-and-ml-algorithms

An exploration of data analysis techniques and standard ML algorithms on the QSAR oral toxicity dataset (2021, Yıldız Technical University).

classification clustering data-analysis data-mining isolation-forest python regression

Last synced: 20 Jun 2026

https://github.com/mdaltamashalam/uber-fare-prediction-models

Predicts the fare amount of Uber rides based on various factors such as pickup/drop-off coordinates, passenger count, and trip distance.

catboost data-analysis data-cleaning data-visualization lgbm-regressor machine-learning matplotlib numpy pandas python random-forest regression-models skit-learn xgboost-algorithm

Last synced: 26 Feb 2026

https://github.com/siddhant2105s/airline-performance-analysis-dashboard

Enhancing Airline Performance Analysis for the Department of Transport

data-analysis data-visualization tableau

Last synced: 08 Feb 2026

https://github.com/josericodata/statisticsapp

Interactive statistics analysis app using Python and Streamlit. Perform key statistical tests, visualise distributions, and explore data with ease.

alpha-value chi-square-test confidence-intervals data-analysis dublin dublin-ireland europe hyphotesis-tests ireland normal-distribution null-hypothesis p-value portfolio python statistics streamlit t-test tech ubuntu z-test

Last synced: 26 Feb 2026

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 14 Mar 2025

https://github.com/an1mch1k-theone/project_1_hh_analyze

Project: analysis of résumés from HeadHunter

data-analysis data-analysis-project python

Last synced: 15 Apr 2026

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 26 Feb 2026

https://github.com/ginalamp/covid_dashboard_twitternews

Corona Dashboard & report based on Twitter media outlet news.

dashboard data-analysis data-visualization twitter

Last synced: 28 Jan 2026

https://github.com/27ahmad/amazon-sales-analysis

This repository contains an exploratory data analysis (EDA) and visualization project of Amazon sales data. The goal is to uncover insights and present key metrics through a Tableau dashboard.

data-analysis eda pandas python seaborn tableau

Last synced: 15 Apr 2026

https://github.com/ninadpatil09/heart_disease_detection_analysis

The Heart Disease Detection Analysis aims to create a predictive model for identifying individuals at risk of heart disease. Using a dataset with attributes like age, sex, and health metrics, the project focuses on distinguishing patients with and without heart disease.

data-analysis data-cleaning data-science data-visualization machine-learning

Last synced: 15 Apr 2026

https://github.com/rajeev2806/netflix-data-analysis

In this project I implemented an ETL process, cleaning and analyzing the Netflix dataset with PostgreSQL and Python.

data-analysis data-cleaning postgresql python

Last synced: 15 Apr 2026

https://github.com/ludreinsalvador/global-covid-19-data-analysis

Contains Power BI dashboards that visualize and analyze global COVID-19 cases, deaths, and vaccination trends using data from the World Health Organization (WHO). The project aims to provide insights into the pandemic’s impact and vaccination progress worldwide through dynamic reports and advanced analytics.

analytics covid-19 covid19-data data data-analysis data-collection data-transformation data-visualization

Last synced: 26 Feb 2026

https://github.com/mathusanm6/critics-vs-players-analysis

This data analysis examines the relationship between critic scores, sales (owners), player engagement, and pricing to determine the ROI of critic reviews.

data-analysis data-science data-visualization game-reviews games-sales jupyter-notebook python-3 steam-games

Last synced: 16 Apr 2026

https://github.com/shruti23-ui/blinkit-powerbi-dashboard

A comprehensive Power BI dashboard analyzing Blinkit's sales performance, outlet metrics, and multi-tier market analytics with interactive visualizations and business intelligence insights.

data-analysis data-visualization microsoft-excel microsoft-power-bi powerbi sales-analysis sql

Last synced: 09 Feb 2026

https://github.com/akashvarma26/data-analysis-on-olympics-csv-dataset

Data analysis of an Olympics dataset in CSV format, using Python's re module and Pandas in a Jupyter notebook.

data-analysis jupyter-notebook pandas regex

Last synced: 02 May 2026

https://github.com/tushar2704/imdb-movie-analysis

This project extracts meaningful insights, trends, and patterns from the data, shedding light on various aspects of the movie industry. By leveraging this analysis, filmmakers, studios, and enthusiasts can gain valuable information to inform decision-making, understand audience preferences, and contribute to the creation of successful movies.

artificial-intelligence data-analysis data-science imdb project tushar2704

Last synced: 10 Feb 2026

https://github.com/jayavarshini-jayakumaran/nba-exploratory-data-analysis

A data analytics project that explores NBA game and player data using Python and Power BI. Features data preprocessing, EDA, feature engineering, and an interactive dashboard for visualizing team and player performance trends.

data-analysis data-visualization exploratory-data-analysis powerbi python3

Last synced: 20 Jun 2026

https://github.com/sreekar0101/bank-financial-loan-performance-trend-analysis

This project analyzes the performance trends of financial loans using SQL for data extraction and Tableau for visualization. The goal was to perform exploratory data analysis (EDA) to understand key metrics such as loan applications, funded amounts, interest rates, and debt-to-income ratios.

data-analysis data-visualization sql tableau

Last synced: 27 Feb 2026

https://github.com/shahriarha/sql

Structured query language

data-analysis mysql mysql-database sql

Last synced: 02 Sep 2025

https://github.com/prateekbisht23/inventory_management

This project is an Inventory Management System built using Python (Pandas, NumPy, SciPy) and Jupyter Notebook. It allows efficient tracking of stock, performing data analysis, and generating useful statistical insights (mean, standard error, confidence intervals) to support better decision-making.

data-analysis jupyter-notebook management python3

Last synced: 11 Feb 2026
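
The inventory entry above mentions three statistics: mean, standard error, and confidence intervals. A minimal standard-library sketch of that calculation follows. It is not the listed project's code: the function name and the sample stock counts are hypothetical, and the interval uses the normal approximation (z = 1.96 for roughly 95% coverage), which is an assumption. For small samples a Student's t critical value would give a wider interval.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (mean, standard error, (low, high)) for a sample.

    Uses the normal approximation; z = 1.96 corresponds to ~95% coverage.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem, (mean - z * sem, mean + z * sem)


# Hypothetical daily stock counts for one item.
stock = [12, 15, 11, 14, 13]
mean, sem, (low, high) = mean_confidence_interval(stock)
```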

https://github.com/adilshamim8/eda-on-health-and-sleep-data

Exploratory Data Analysis (EDA) on health and sleep data, uncovering patterns and insights using Python and visualization tools.

data-analysis data-visualization eda health healthcare sleep sleep-analysis

Last synced: 15 Mar 2025

https://github.com/nickenshidqia/startup-venture-funding-dashboard-data-analysis

The Startup Venture Funding Dashboard is a comprehensive visual representation of the dynamic landscape of startup funding, providing valuable insights into the top startups, funding round types, markets, startup statuses, and investor details.

dashboard data-analysis tableau tableau-dashboards

Last synced: 11 Feb 2026

https://github.com/multitagging/benchmarks

Provides benchmarks to test the MultiTagging framework

benchmarks data-analysis ethereum smart-contracts vulnerabilities

Last synced: 11 Feb 2026

https://github.com/vikktor93/proyecto-final-python-datascience

Dataset analysis of worldwide sales of video games on different platforms in 2020

data-analysis data-science jupyter-notebook kaggle matplotlib pandas python seaborn

Last synced: 16 Apr 2026

https://github.com/dhruwsunita/car-sales-dashboard

Car sales dashboard using Tableau visualization tool.

car-sales data-analysis data-visualization excel kpis tableau

Last synced: 27 Feb 2026

https://github.com/ohimoiza1205/mastercard-cybersecurity-simulation

Served as an analyst on Mastercard’s Security Awareness Team to identify and report security threats

cybersecurity data-analysis data-presentation security-awareness-training technical-security-awareness

Last synced: 11 Feb 2026

https://github.com/joemull/pyjade

A data curation script for the Jane Addams Digital Edition

data-analysis digital-humanities

Last synced: 11 Feb 2026

https://github.com/devexpress-examples/wpf-pivot-grid-connect-to-an-olap-datasource

This example shows how to specify connection settings to the server and create fields that relate to specific measures and dimensions of the cube for the Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf xpf

Last synced: 06 May 2026

https://github.com/abelarduu/power_bi_analyst

Power BI project for financial data reporting, with intuitive navigation and interactive features. It offers a complete user experience, combining sophisticated presentation with effective functionality for data analysis.

dashboard data-analysis data-analytics modelagem-de-dados powerbi tratamento-de-dados

Last synced: 08 Sep 2025

https://github.com/kailenroa/dashboad-excel-huisprijzen

This project focuses on developing a dashboard powered by Funda to visualize house pricing in the Netherlands. The dashboard simplifies the home-buying process by allowing users to compare prices, energy labels, number of rooms, and square meters across different provinces, all in one interactive platform.

dashboard data-analysis excel house-prices

Last synced: 05 Jan 2026

https://github.com/abdullahashfaqvirk/powerbi-dashboards

A collection of Microsoft Power BI dashboards and reports designed to address business challenges and support data-driven decision-making.

dashboards data-analysis data-driven data-science microsoft powerbi reports visualization

Last synced: 10 Mar 2026

https://github.com/bala-1409/sql-projects

This repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through SQL queries.

data-analysis data-mining data-science data-transformation database eda etl-framework exploratory-data-analysis microsoft-sql-server query-language sql sql-server sql-server-database sql-server-management-studio

Last synced: 27 Feb 2026

https://github.com/rohitblaze10/-excel-_seller_store_analysis

A collection of data analysis projects showcasing data cleaning, exploration, visualization, and machine learning. Using "Excel" and more to uncover insights and drive data-driven decision-making. Feel free to explore, contribute, or collaborate!

data-analysis data-visualization excel excel-export

Last synced: 12 Feb 2026

https://github.com/andimashkulli/vpms

Vehicle Parking Management System for Gjon Buzuku Gymnasium

backend-api data-analysis databases frontend-react mongodb nodejs software

Last synced: 12 Feb 2026

https://github.com/nabilshadman/power-bi-essential-training

Exercise files for Power BI Essential Training (2024): datasets and dashboards for hands-on learning

dashboard data-analysis data-science data-visualization power-bi power-bi-dashboard

Last synced: 12 Feb 2026

https://github.com/ankit21111/carpredict

This project predicts car prices using machine learning models, including Simple and Multiple Linear Regression. It covers data acquisition, feature selection, and optimization techniques like Ridge Regression. The best model, Multiple Linear Regression, achieved an R² score of 0.84. Check out the full analysis in the repository!

data-analysis data-visualization matplotlib numpy pandas pyhton scipy seaborn sklearn

Last synced: 16 Apr 2026
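
The car-price entry above refers to linear regression, Ridge Regression, and the R² score. A minimal single-feature sketch of those ideas follows, in plain Python. It is not the repository's model: the function names and the toy data are hypothetical, and the intercept is left unpenalized, which is a common convention but an assumption here. With `alpha=0` the fit reduces to ordinary least squares.

```python
def fit_ridge_1d(xs, ys, alpha=0.0):
    """Fit y = slope * x + intercept with an L2 penalty on the slope only."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)        # the penalty shrinks the slope toward 0
    intercept = y_bar - slope * x_bar  # intercept is not penalized
    return slope, intercept


def r2_score(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical toy data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

ols = fit_ridge_1d(xs, ys)               # ordinary least squares
ridge = fit_ridge_1d(xs, ys, alpha=5.0)  # penalized fit

r2_ols = r2_score(xs, ys, *ols)
r2_ridge = r2_score(xs, ys, *ridge)
```

On training data the penalized fit can only match or lower R² relative to least squares; the usual motivation for the penalty is better behavior on unseen data or with correlated features.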

https://github.com/taralas209/moscow-programmer-salaries-analysis-dvmn

A Python script analyzing the average salaries of programmers in Moscow by popular programming languages using data from HeadHunter and SuperJob.

api data-analysis headhunter job-market-analysis python superjob

Last synced: 15 Mar 2025

https://github.com/rahulsm20/storedata

A data analysis project aimed at analyzing the sales data of the super store and providing useful insight into customer preferences.

data-analysis matplotlib numpy pandas python streamlit

Last synced: 16 Apr 2026

https://github.com/walid0912/rfm_analysis

RFM Analysis is employed to comprehend and categorize customers according to their purchasing patterns. RFM, an acronym for recency, frequency, and monetary value, comprises three essential metrics that offer insights into customer involvement, allegiance, and significance to a business.

data-analysis data-visualization python rfm-analysis

Last synced: 02 Sep 2025

https://github.com/felpzreiz/stockdata_pipeline

This project develops a data pipeline that consumes financial information from a US stock market API (StockData.org) for analysis and processing, using Python and libraries such as pandas, matplotlib, and pyarrow.

api data-analysis data-science jupyter-notebook pandas python

Last synced: 19 Apr 2026

https://github.com/loginchik/mid_contracts

Analysis of public procurement contracts of the Russian Ministry of Foreign Affairs

data-analysis dataset pandas python

Last synced: 17 Apr 2025

https://github.com/alejandrolara11/data-preprocessing

Data preprocessing through the use of the libraries NumPy and pandas.

data-analysis data-cleaning data-preprocessing numpy pandas python

Last synced: 09 May 2026

https://github.com/omkar2503/credit-risk-dashboard

A SQL-based Credit Risk Scoring System visualized using Metabase

credit-risk dashboard data-analysis data-analytics metabase postgresql sql

Last synced: 01 Jul 2025

https://github.com/mananabbasi/dashboard-power-bi

This repository showcases Power BI projects focused on data visualization and business intelligence. Each project transforms raw data into interactive dashboards and reports, providing actionable insights for decision-making. The repository includes Power BI files, datasets, and documentation for each project.

data-analysis data-science data-visualization powerbi

Last synced: 13 Feb 2026

https://github.com/kernelshreyak/kaggle-notebooks

Collection of my Kaggle notebooks for data analysis and machine learning on a variety of datasets

data-analysis data-science data-visualization kaggle kaggle-competition machine-learning

Last synced: 27 Apr 2026

https://github.com/l1ght14/customer-churn-prediction

Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.

churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom

Last synced: 09 May 2026

https://github.com/BAMresearch/Utah-SAXS-Tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 16 Jan 2026

https://github.com/chanmeng666/mnist-handwritten-digit-recognition-project

A comprehensive implementation and analysis of handwritten digit recognition using multiple neural network architectures on the MNIST dataset. Features basic MLP, optimized feature-selected model, and deep CNN approaches with detailed performance comparisons and visualizations.

cnn computer-vision data-analysis data-visualization deep-learning feature-analysis handwritten-digit-recognition keras machine-learning mlp mnist model-optimization neural-networks python scikit-learn tensorflow

Last synced: 02 Apr 2026

https://github.com/jedrzej-wydra/improving-accuracy

Improving accuracy of age estimates for insect evidence—calibration of physiological age at emergence (k) using insect size but without “k versus size” model

data-analysis r

Last synced: 02 Sep 2025

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 09 May 2026

https://github.com/suhail25/pizza-sales-analysis

Performed a detailed analysis of sales data presented in Excel by the pizza sales manager; implemented strategic pricing adjustments resulting in a 25% revenue surge and enhanced profit margins. Explored and cleaned the dataset using SQL, then performed data analysis by filtering 12% of the data using SQL commands in MySQL.

data-analysis excel powerpoint-presentations sql

Last synced: 15 Feb 2026

https://github.com/andersoncrs/analisis_exploratorio_de_datos-eda-_rendimiento_estudiantil

This exploratory data analysis (EDA) of a student performance dataset aims to identify and understand the factors that influence students' academic performance. Through data cleaning, transformation, and visualization, it seeks to uncover meaningful patterns and relationships.

data-analysis data-exploration data-exploration-and-preprocessing data-visualization seaborn

Last synced: 30 Mar 2025