An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/gunifiri/duckdb-ghw

🦆 Accelerate analytics with DuckDB's integration for GitHub workflows, enabling efficient data handling and processing directly within your repositories.

analytics analytics-engine big-data columnar-storage data-analysis data-science database duckdb in-memory-database open-source parquet python query-planner r sql

Last synced: 29 Apr 2026

https://github.com/albertobarrago/sentinel

A contribution to the research of Corrado Malanga and Filippo Biondi

data-analysis sar

Last synced: 24 Oct 2025

https://github.com/pedrosfaria2/fugascomhelicoptero

My first use of Jupyter Notebook in a project

analise-de-dados data-analysis jupyter-notebook matplotlib pandas python

Last synced: 07 May 2026

https://github.com/biginformatics/git-basics

Hands-on Git and GitHub lessons for analysts and statisticians

data-analysis git github public-health training

Last synced: 10 Jun 2026

https://github.com/jatin-s16/hr_mysql_powerbi

This repository contains raw HR data along with key business questions. I performed data cleaning using MySQL queries and wrote analytical queries to extract meaningful insights. The results were then visualised using Power BI to enhance business understanding.

data-analysis data-science data-visualization mysql powerbi

Last synced: 29 May 2026

https://github.com/jofaval/sonar

Binary Classification of Sonar Signals of Rocks and Metal cylinders in 1987

data-analysis data-science data-visualization machine-learning python scikit-learn sonar uci

Last synced: 09 Apr 2026

https://github.com/satyacoder29/crm-analytics-power-bi

CRM Analytics Dashboard – An interactive dashboard using Tableau, SQL, and Salesforce CRM Analytics (CRMA) to analyze sales performance, customer segmentation, and churn prediction. Features automated ETL pipelines, predictive analytics, and real-time insights for data-driven decision-making. 🚀📊

advanced-excel data-analysis data-cleaning data-collection data-transformation data-visualization matplotlib numpy pandas powerbi python seaborn sql tableau

Last synced: 14 Apr 2026

https://github.com/riborings/python_projects

Python projects and other programming experiences

data-analysis machine-learning project python regression-analysis

Last synced: 08 May 2026

https://github.com/nishumehta/coffee-beans-sales-analysis

An in-depth analysis of coffee bean sales using an interactive Excel dashboard, which highlights trends and customer insights

dashboard data-analysis data-visualization excel

Last synced: 28 Jan 2026

https://github.com/janiavdv/data-spirits

Analysis of alcohol and sports betting data, including a correlation investigation.

correlation data-analysis data-science machine-learning

Last synced: 11 Nov 2025

https://github.com/lauratrigo/dias_geomagneticamente_calmos

📡 MATLAB script that analyzes ionospheric parameters (hF, f0F2, hmF2) via FFT, generating one-sided/two-sided spectra to identify temporal patterns in resolution, which is crucial for studies of ionospheric variations.

data-analysis geophysics matlab scientific

Last synced: 29 Aug 2025

https://github.com/jaseel342/pizza_sales_report

These Pizza Sales dashboards provide valuable insights, including sales trends, pizza category breakdown, size distribution, and top-selling and least-selling pizzas, enabling data-driven decisions to boost sales and business performance.

data-analysis dax-query power-query powerbi sql sql-server-management-studio visualization

Last synced: 05 Jan 2026

https://github.com/rita94105/ethereum-fraud-detection

This project focuses on detecting fraudulent transactions in the Ethereum network using both traditional machine learning models and deep learning techniques. By analyzing transaction attributes and interaction patterns, we aim to develop an effective fraud detection model.

data-analysis deep-learning ethereum fraud-detection machine-learning

Last synced: 01 May 2026

https://github.com/a26nine/kortext-usage-dashboard

An interactive data visualisation dashboard built using Tableau software to understand the value of digital resources issued on the Kortext platform at Middlesex University, London.

data-analysis data-science data-visualization knime tableau

Last synced: 01 Feb 2026

https://github.com/psychelzh/cogstruct-old

Data Analysis on Cognitive Structure

cognition data-analysis intelligence psychology

Last synced: 25 Oct 2025

https://github.com/bnvulpe/regression-and-time-series

This work centers on assessing and comparing predictive models for regression and time series prediction using specific datasets, with the goal of selecting the most effective methodology for unseen test data.

colab data-analysis data-analysis-python data-science data-visualization forecasting jupyter-notebook machine-learning model-evaluation predictive-modeling python regression sarima sarimax time-series-analysis time-series-analysis-and-forecasting

Last synced: 08 May 2026

https://github.com/ljadhav25/linear_regression_data_science

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

data-analysis data-science linear-regression machine-learning

Last synced: 26 Oct 2025

https://github.com/madhursinghbhadoriya/data_analysis_fifa-players

Processed important information and characteristic traits in a Jupyter Notebook using NumPy, Matplotlib, Pandas, and other libraries.

analysis data-analysis data-science graphs jupyter-notebook pandas python

Last synced: 07 May 2026

https://github.com/srinibas-masanta/infosys-springboard-internship

An interactive Power BI dashboard developed during my Infosys Springboard Internship to visualize Indian election trends. It integrates historical and live API data to analyze vote shares, turnout patterns, and demographic insights across constituencies, helping news agencies report results in real time.

dashboard data-analysis data-cleaning data-collection data-visualization dax-functions powerbi

Last synced: 25 Feb 2026

https://github.com/aidan-zamfir/the-iliad

Data analysis & relationship network for the characters of Homer's Iliad

data data-analysis dataframes networks networkx python selenium spacy webscraping

Last synced: 08 May 2026

https://github.com/fahamidur/cuisine-analysis

This project analyzes recipes from AllRecipes.com to reveal global cooking patterns, nutritional trends, and cultural food differences, offering data-driven insights for food enthusiasts and researchers.

beautifulsoup data-analysis datavisualization pandas selenium tableau-public webscraping

Last synced: 08 May 2026

https://github.com/campagnucci/exercitando_pandas

Hands-on pandas exercises with open education data from São Paulo

data-analysis data-science education-data exercises pandas-tutorial

Last synced: 28 Jan 2026

https://github.com/andersoncrs/arboles_de_decision_calidad_del_vino

Contains a detailed analysis of wine quality using a classification model based on decision trees. Includes data exploration, outlier detection and handling, univariate and bivariate analysis, and the creation and evaluation of a predictive model. The main goal is to predict wine quality.

data-analysis data-science data-visualization machine-learning matplotlib seaborn sklearn tree-decision

Last synced: 20 May 2026

https://github.com/yandexdataschool/ml-sweights-experiments

Experiments for the "Machine Learning on data with sPlot background subtraction" paper

data-analysis high-energy-physics machine-learning statistics

Last synced: 15 May 2025

https://github.com/badranalyst/student-tests-data-analysis-application

Python-based analysis of student test scores in math, reading, and writing, examining correlations with parental education, lunch type, and test preparation. Includes data cleaning, visualization, and statistical insights into factors influencing academic performance.

data-analysis data-visualization dataset matplotlib numpy pandas python sklearn

Last synced: 05 May 2026

https://github.com/danmadeira/algoritmos-estatistica-python

Demonstration of statistics algorithms in Python

algorithms data-analysis data-science python statistics

Last synced: 08 May 2026

https://github.com/farzeen-2001/blinkit-sales-analysis-using-powerbi

This project provides an overview of BlinkIt sales performance.

data-analysis data-visualization datacleaning excel powerbi

Last synced: 24 Jan 2026

https://github.com/sarathchandranpm/walmart-sales-analysis

Analysis of Walmart Myanmar's Q1 2019 sales data covering customer behavior, product performance, general operations, and sales patterns.

data-analysis mysql sql

Last synced: 29 Aug 2025

https://github.com/shreyaamenon/data-analysis-aiml-mini-projects

Mini projects to help me grow my skills in data analysis, artificial intelligence, and machine learning.

ai data-analysis jupyter-notebook machine-learning python

Last synced: 11 Apr 2026

https://github.com/gaurabkundu1/road-accident-data-analysis

This is an Excel project on Road Accident Data Analysis in the form of an interactive Dashboard.

dashboard data-analysis data-vizualisation excel road-accidents

Last synced: 24 Jan 2026

https://github.com/diegopino/publibdata_codexhackathon

Public Library Data processing/analysis codex hackathon attempt

data-analysis data-visualization libraries public

Last synced: 24 Jan 2026

https://github.com/amanyadav-07/customer-churn-prediction

Machine Learning project to predict customer churn using Logistic Regression, Random Forest, and XGBoost. Includes data preprocessing, feature engineering, SMOTE balancing, model training, evaluation, and business insights.

accuracy-metrics data-analysis data-visualization logistic-regression machine-learning matplotlib numpy pandas python3 random-forest-classifier seaborn sklearn xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/27ahmad/ibm-data-science-capstone

The Capstone is the final course in the IBM Data Science Professional Certificate program. It's a project that combines all the skills and knowledge you've gained throughout the specialization.

data-analysis data-science folium-maps machine-learning plotly-dash python sql

Last synced: 26 May 2026

https://github.com/rahulchouhan1/sql-data-warehouse-project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.

data-analysis data-cleaning data-engineering data-science data-warehouse datascience etl etl-pipeline sql sql-query sql-server

Last synced: 24 Jan 2026

https://github.com/0290192029/apartment-price-predictor

Python project for predicting apartment rental prices using linear regression. Practical work on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn

Last synced: 08 May 2026

https://github.com/leftcoastnerdgirl/excel_crowdfunding_analysis

This project demonstrates the use of MS Excel for data cleansing & formatting to prepare for data analysis and visualization.

bar-charts conditional-formatting data-analysis data-analytics data-analytics-excel data-preparation data-preprocessing data-visualization excel line-graph

Last synced: 06 Feb 2026

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of the Cyclistic bike-share rental service as the capstone project for the Google Data Analytics Specialization.

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 16 Mar 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/an4pdm/relatorio-de-vendas

This project was built with the tools offered by Power BI in order to improve my knowledge of ETL. The data used came from the "Kaggle" website.

data-analysis data-visualization database etl powerbi

Last synced: 20 Jun 2026

https://github.com/annnieglez/fraud-detection-eda

Fraud Detection - Exploratory Data Analysis (EDA). Analyzing financial transactions to detect fraud patterns using Python and Tableau. Libraries: Pandas, Seaborn and Matplotlib. Key Focus: Data cleaning, fraud trends, high-risk transactions, time-based patterns

data-analysis data-science data-visualization eda fraud-detection fraud-prevention matplotlib seaborn

Last synced: 28 Jan 2026

https://github.com/yash1882/music-store-data-analysis

A project focused on analyzing music store data using SQL ♬

begineer-friendly data-analysis music music-store-data music-store-data-analysis sql-project

Last synced: 28 Jan 2026

https://github.com/sabaasif2501/netflix-data-analysis

Exploratory data analysis of Netflix content using Python and pandas, covering content types, genres, countries, and release years.

data-analysis netflix pandas portfolio-project python

Last synced: 08 May 2026

https://github.com/anurag-ghosh-12/library_management_system_sql

This project showcases the development of a comprehensive Library Management System utilizing Structured Query Language (SQL). It demonstrates a practical application of relational database principles to efficiently manage library resources, member information, and borrowing/returning transactions.

data-analysis data-visualisation dbms-project sql

Last synced: 29 Jan 2026

https://github.com/andreicirciumaru/best-of-breed

CSV fundamentals screener: schema validation + market-cap weights

csv data-analysis finance pandas python screener

Last synced: 15 Apr 2026

https://github.com/angchekar28/sales-report-power-bi

A Power BI sales report analyzing country-wise and product-wise sales trends. Includes dashboards, decomposition trees, and key influencers analysis for business insights.

dashboard data-analysis data-cleaning data-visualization powerbi sales-report

Last synced: 16 Mar 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/felinjob/ibm-applied-data-science-capstone

This project, part of the IBM Data Science Professional Certificate specialization, predicts the landing success of SpaceX's Falcon 9. Using data from the SpaceX API and web scraping, the project includes data analysis and machine learning to generate insights about the launches.

data-analysis data-science data-visualization ibm jupyter-notebook machine-learning numpy pandas python scikit-learn seaborn sql

Last synced: 11 Apr 2026

https://github.com/wareflowx/excel-toolkit

A powerful command-line toolkit for Excel and CSV data manipulation, analysis, and transformation.

data-analysis data-wrangling excel pandas python uv

Last synced: 29 Jan 2026

https://github.com/mattdelaune/powerbi_healthcare_dashboard

Interactive Hospital Insights Dashboard built with Power BI, showcasing comprehensive analysis of patient demographics, treatment outcomes, and hospital performance.

data-analysis healthcare power-bi visualization

Last synced: 29 Jan 2026

https://github.com/emcramer/clockplot

Plotting utility for a "clockplot" that puts groups into a time-ordered heterogeneity visualization

biology data-analysis data-visualization heterogeneity pseudotemporal-ordering

Last synced: 10 Mar 2026

https://github.com/smahala02/magnetism-lab

This repository contains Python scripts and data for analyzing inductance in toroidal coils to calculate the magnetic permeability of ferrite materials. The project helps classify materials as soft or hard magnets based on experimental data.

data-analysis inductance jupyter-notebook magnetism python toroids

Last synced: 29 Jan 2026

https://github.com/edumoraes1/comissao-reduzida

Creation of an audience segmentation via SQL for enjoei's new reduced-commission feature

bq data-analysis salesforce sql

Last synced: 06 Feb 2026

https://github.com/abhay-sinha-0/carpricepredictionproject

A machine learning project that predicts the selling price of a car based on its features such as year, mileage, fuel type, transmission, and more. This model can assist individuals and dealerships in estimating fair market prices for used cars.

artificial-intelligence data-analysis data-science data-visualization exploratory-data-analysis machine-learning-algorithms matplotlib-pyplot mysql-database numpy-library pandas-library python skit-learn sklearn-library

Last synced: 15 May 2025

https://github.com/surajwate/datalab

DataLab is a versatile toolkit designed to simplify data exploration, analysis, and visualization for data scientists.

data-analysis data-science python visualization

Last synced: 30 Jan 2026

https://github.com/mfakhriazhar/us-companies-revenue-dashboard

This project is a data visualization dashboard built using Power BI that highlights lists of the largest companies in the United States by revenue. The goal is to provide an interactive overview of company performance across industries, focusing on revenue, employee metrics, and industry trends.

dashboard data-analysis data-visualization largest-companies-us powerbi revenue united-states

Last synced: 30 Jan 2026

https://github.com/mfakhriazhar/healthcare-dashboard-project

This project is a comprehensive data analysis and visualization of healthcare data using Power BI. It focuses on understanding patient distribution, billing trends, and hospital performance through a clean and interactive dashboard.

dashboard dashboardreporting data-analysis datacleaning excel powerbi powerquery

Last synced: 30 Jan 2026

https://github.com/touchesir/twitter_physicalactivity

Companion Data / Analysis for "Monitoring Physical Activity Levels using Social Media Data"

data-analysis twitter

Last synced: 30 Jan 2026

https://github.com/samanhur/data_visualization_pcc

First experiences in data visualization with python

data-analysis data-science data-visualization python3

Last synced: 23 Mar 2025

https://github.com/satvikpraveen/numpymasterpro

A hands-on, production-ready toolkit to master NumPy — from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.

broadcasting data-analysis data-science data-source data-visualization jupyter-notebook kmeans-clustering linear-algebra machine-learning matrix-algebra numerical-computation numpy numpy-broadcasting numpy-examples numpy-tutorial open-source python scientific-computing standardization vectorization

Last synced: 08 May 2026

https://github.com/nehar-2404/airbnb-nyc-eda-ml

This project analyzes Airbnb listings in New York City to uncover key insights about pricing, host activity, and neighborhood trends. It covers data cleaning, EDA, and basic machine learning to predict listing prices.

airbnb data-analysis eda machine-learning matplotlib pandas pyhton seaborn visualization

Last synced: 15 Apr 2026

https://github.com/bala-1409/power-bi-visualization-project

This repository contains visualization projects built with Power BI. The visualizations yield insights and strategies that help grow the business toward higher profit margins, and the same insights can help reduce damage from accidents and calamities.

dashboard data-analysis data-science data-visualization exploratory-data-analysis microsoft-excel microsoft-power-bi microsoft-powerpoint power-bi powerbi powerbi-reports powerbi-visuals visualization

Last synced: 04 Jan 2026

https://github.com/aavishkarmahajan/sql

SQL code assignments and practice questions from SQL courses, SQL data analysis

data-analysis sql sql-server

Last synced: 07 Feb 2026

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to provide English learners with data that allows them to make a data-driven guess when they encounter words whose stress placement they are unsure of.

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 20 Jun 2026

https://github.com/gurpreet17/uc-davis-sql-for-data-science-specialization

Completed the SQL Basics for Data Science Specialization from the University of California, Davis, gaining proficiency in Data Analysis, SQL, Apache Spark, and Delta Lake.

apache-spark bigdata data-analysis data-science delta-lake sqlite

Last synced: 15 Apr 2026

https://github.com/pedrosfaria2/analisandopostshn

Projeto para analisar as postagens da comunidade HackerNews

analise-de-dados data-analysis datetime jupyter-notebook matplotlib python python3

Last synced: 08 May 2026

https://github.com/ved-coder-king/wheat_ai_project

This project, Smart Wheat Farming AI System, was developed as part of the coursework for the Artificial Intelligence program at Esprit School of Engineering.

agriculture data-analysis data-visualization deep-learning image-classification machine-learning object-detection python wheat

Last synced: 15 Apr 2025

https://github.com/luminati-io/indeed-dataset-samples

A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.

api data-analysis datasets indeed jobs web-scraping

Last synced: 07 Feb 2026

https://github.com/tralahm/parliament-2017-dataset

Concise, clean data sets of the 2017 Kenyan General Election results for the Members of the Senate and National Assembly composition

csv-parsing data-analysis data-visualization datasets election-data ipynb-jupyter-notebook kaggle-dataset kenya-constituencies kenya-counties matplotlib python3 tralahtek

Last synced: 31 Jan 2026

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 15 Apr 2026

https://github.com/amishidesai04/flipkart-mobile-sales-analysis

Flipkart Mobile Sales Analysis is a Tableau project that visualizes mobile sales data from Flipkart. It highlights trends in brand performance, pricing, ratings, and customer preferences. The interactive dashboard helps users explore key insights for data-driven decisions in e-commerce and retail.

dashboard data-analysis data-visualization storyboard tableau

Last synced: 31 Jan 2026

https://github.com/traore-07/fedex-sales-analysis

Analysis of FedEx sales transactions

data-analysis data-visualization sales-analysis tabeau

Last synced: 31 Jan 2026

https://github.com/tenifayo/analysis-of-fordgobike-trip-data

Data Visualization using Ford GoBike Trip Data

data-analysis matplotlib pandas

Last synced: 11 Jul 2025

https://github.com/cca/panopto-session-data

Analyzing Panopto session data for retention purposes

data-analysis ipython-notebook video

Last synced: 07 Feb 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/ginanti-riski/streamlit_datapenyewaansepeda

Bike Sharing Analysis is a project that aims to understand bike rental patterns based on various factors such as weather, season, and day. The project uses data analysis techniques to gain deeper insight into bike rental trends.

data-analysis data-analysis-python data-science data-visualization python streamlit

Last synced: 15 Apr 2026

https://github.com/malthejorgensen/repx

Python regular expression file transformer

command-line-tool data-analysis text-processing

Last synced: 31 Jan 2026

https://github.com/gastonstat/stat133

STAT 133: Concepts in Computing with Data

data-analysis data-science data-visualization r-programming syllabus

Last synced: 25 Feb 2026

https://github.com/aphp/jupyter-eds-notebooks

jupyter-eds-notebooks provides Docker images with preconfigured Jupyter environments for clinical and health data analysis, tailored for AP‑HP Datalabs and the HELIX platform.

data-analysis data-science data-visualization healthcare lab

Last synced: 13 Jan 2026

https://github.com/tolumie/loan-approval-prediction

Loan Approval Prediction using Machine Learning | EDA + Decision Tree, Random Forest & Logistic Regression | Automating loan eligibility for Dream Housing Finance by analyzing customer data and predicting loan approvals.

classification credit-risk-analysis data-analysis decision-tree-classifier finance-analytics loan-approval logistic-regression-algorithm machine-learning predictive-modeling-techniques random-forest

Last synced: 30 Jun 2025

https://github.com/deepanshkhurana/udacityproject-prediciting-boston-housing-prices

This is a Udacity Project for the Machine Learning Nanodegree. Here, we are trying to predict Boston Housing Prices using sklearn.

data-analysis data-science machine-learning python scikit-learn udacity

Last synced: 08 May 2026