An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/reusjimenez/powerbi-data-analysis

Dashboards interactivos desarrollados en Power BI, orientados al análisis de datos y visualización efectiva. 📊

business-intelligence dashboards data-analysis dax power-query powerbi

Last synced: 28 Jan 2026

https://github.com/jku-vds-lab/loops

Loops is a JupyterLab extension to support iterative and exploratory data analysis in computational notebooks.

data-analysis data-science data-visualization jupyter jupyter-notebook notebook provenance

Last synced: 29 Jan 2026

https://github.com/titanscouting/tra-analysis

Titan Robotics 2022 Strategy Team Analysis Repository

data-analysis frc frc-scouting hacktoberfest python

Last synced: 29 Jan 2026

https://github.com/seekinginfiniteloop/fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-library pandas-python pydata python

Last synced: 15 Apr 2026

https://github.com/stastnypremysl/lsql-csv

lsql-csv is a tool for small CSV file data querying from a shell with short queries. It makes it possible to work with small CSV files like with a read-only relational databases. The tool implements a new language LSQL similar to SQL, specifically designed for working with CSV files in a shell. LSQL aims to be a more lapidary language than SQL.

csv data-analysis data-processing haskell language linux-shell lsql lsql-csv new-language query-language relational-database sql unix-command unix-philosophy unix-shell

Last synced: 25 Feb 2026

https://github.com/nakshjainsonigara/vba-canteenmanagementsystem

The Canteen Management System is a comprehensive software solution designed to modernize and optimize canteen operations. It aims to simplify the complexities of managing a canteen by automating key processes such as order management, payment processing, and report generation.

canteen canteen-mangement-system charts data-analysis email excel microsoft payment-gateway vba vba-excel vba-macros word

Last synced: 30 Jan 2026

https://github.com/tynoee/record_company-database

A record company database with multiple query commands using SQL

data-analysis sql

Last synced: 31 Jan 2026

https://github.com/madeiradata/microsoft-data-analysts-club

Open-source Repository of Useful Scripts and Solutions for Microsoft Data Analysts

data-analysis data-visualization microsoft-data-analysis powerbi powerbi-report

Last synced: 19 Mar 2026

https://github.com/jakobzmrzlikar/fake-news-analysis

An analysis of the FakeNewsNet dataset using NLP techniques.

data-analysis fake-news ipynb-jupyter-notebook nlp-machine-learning

Last synced: 05 Mar 2026

https://github.com/phillbertnevinemmanuel/dataprofessionalquestionnaireanalysis_pbix

This project delves into the demographics and job satisfaction of data professionals, presenting insights through a user-friendly dashboard built with Power BI. The dataset has been graciously provided by Alex The Analyst

dashboard data-analysis powerbi visualization

Last synced: 19 Mar 2026

https://github.com/allanotieno254/employee-performance-tracker-excel-

An Excel-based tool to track and evaluate employee performance, compliance, and skills assessments with summary statistics and visual charts

compliance-tracker data-analysis employee-performance-analysis excel human-resources

Last synced: 19 Mar 2026

https://github.com/allanotieno254/road-accident-data-analysis-dashboard-using-excel

This repository contains the Road Accident Data Analysis Dashboard, a comprehensive Excel-based tool designed to provide in-depth analysis and visualization of road accident data.

dashboards-excel data-analysis excel kpi visualization

Last synced: 19 Mar 2026

https://github.com/luminati-io/shopee-dataset-samples

A sample dataset of over 1000 Shopee products, extracted using the Bright Data API, ideal for pricing optimization, gap analysis, and market strategy refinement..

api data-analysis data-mining datasets products shopee web-scraping

Last synced: 12 Feb 2026

https://github.com/thevinh-ha-1710/diabetes-predictive-model

This project aims to train a predictive model to diagnose diabetes on women patients.

data-analysis data-science data-visualization model-training-and-evaluation python

Last synced: 13 Feb 2026

https://github.com/mikasenghaas/covid19-analysis

analysis of correlation between covid-19 infection numbers and weather data from the beginning of the pandemic until april 2021

data-analysis statistical-analysis

Last synced: 14 Feb 2026

https://github.com/gab-182/market-analysis-report-for-national-clothing-chain

Using custom M and DAX codes in Power BI, I conducte a thorough market analysis for a national clothing chain. The insights gathered from customer data and US Census Bureau statistics led to the formulation of a targeted marketing strategy, contributing to enhanced sales and customer satisfaction.

data-analysis power-bi

Last synced: 19 Mar 2026

https://github.com/edisedis777/duckdb-analyzer

A powerful tool for analyzing large CSV datasets using DuckDB.

csv data-analysis database duckdb

Last synced: 16 Apr 2026

https://github.com/wrighang/shipping-data-analysis

Independent Project: Transit time trends analysis following a major shipping process change.

data-analysis matplotlib numpy pandas python

Last synced: 18 Apr 2026

https://github.com/sunnybibyan/marketing_campaign_analysis_power_bi_dashboard

Campaign Performance Analysis This project analyzes the performance of Spring, Summer, and Fall marketing campaigns, revealing key insights and actionable recommendations.

data-analysis data-visualization dax marketing-campaign powerbi

Last synced: 19 Mar 2026

https://github.com/kalebers/economic_analysis_data_science

Data Analysis Python project using economy data base to predict percentage of good and bad payers

data-analysis data-science machine-learning pandas python scipy sklearn-library

Last synced: 18 Apr 2026

https://github.com/mayankyadav23/air-bnb-data-analysis

Data analysis and insights from NYC Airbnb listings, focusing on key metrics such as host performance, neighborhood trends, pricing, and customer reviews. Comprehensive documentation of ETL processes and analytical methodologies is provided. Perfect for understanding Airbnb dynamics and decision-making in the NYC market.

advanced-excel business-intelligence data-analysis data-analytics data-visualization power-bi ppt

Last synced: 19 Mar 2026

https://github.com/santiagortiiz/snowflake-data-warehousing

Snowflake University. Snowflake Data Warehousing. Foundamentals

big-data data-analysis data-warehouse olap snowflake

Last synced: 19 Mar 2026

https://github.com/steno-aarhus/legliv

Substitution of red meat with legumes and risk of primary liver cancer in UK Biobank participants: A prospective cohort study

cancer-research data-analysis epidemiology nutritional-epidemiology nutritional-science open-science reproducibility reproducible-research rstats ukbiobank

Last synced: 03 Mar 2026

https://github.com/wewoc/garmin_local_archive

Secure, local-first archive for Garmin Connect health data (HRV, sleep, activities). Private & offline. Structured for local analysis (Excel, HTML-Dashboard, Ollama, Open WebUI, AnythingLLM). Your data stays on your machine.

backup dashboard data-analysis fitness-tracker garmin garmin-connect ollama open-webui privacy privacy-enhancing-technologies privacy-first privacy-focused python self-hosted

Last synced: 16 Apr 2026

https://github.com/alfikiafan/air-quality-analysis

This repository contains a comprehensive data analysis project on Air Quality Dataset, covering the complete data analysis process from data gathering, cleaning, exploratory data analysis (EDA), to building a fully interactive dashboard using Streamlit.

air-quality data-analysis dicoding

Last synced: 17 Apr 2026

https://github.com/timmymatten/spikeball-stat-tracker

Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.

data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit

Last synced: 18 Apr 2026

https://github.com/noodleslove/house-of-representative-analysis-i

This project uses public data about the stock trades made by members of the US House of Representatives.

data-analysis data-science eda kaggle-dataset matplotlib-pyplot pandas python stocks-trading

Last synced: 18 Apr 2026

https://github.com/adagio/ivoox_episodes

iVoox Episodes: Scraping & Analysis

beautifulsoup4 data-analysis ivoox pandas python python3 scraping

Last synced: 20 Apr 2026

https://github.com/aravind-selvam/bikeshare-company-analysis

Google Data Analytics Professional Certificate program's Capstone project, of a bike sharing company

analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server

Last synced: 22 Apr 2026

https://github.com/ganeshkumartk/ncov-2019

[EDA] Statistical modelling of Novel Coronavirus breakout nCoV-2019

corona data-analysis ncov ncov-2019 statistics wuhan wuhan-coronavirus wuhan-virus

Last synced: 05 Jun 2026

https://github.com/chandansoren/diabetics_prediction

Predicting that whether the patient has diabetes or not on the basis of the features we will provide to our machine learning model.

data-analysis machine-learning python svm

Last synced: 06 Jun 2026

https://github.com/savinrazvan/degrees

A program that computes the "degrees of separation" between two actors by identifying the sequence of movies connecting them, inspired by the Six Degrees of Kevin Bacon game. Uses IMDb-based datasets for actors, movies, and their relationships.

actor-connections ai data-analysis degrees-of-separation educational-project graph-theory imdb movie-database python six-degrees-of-kevin-bacon

Last synced: 24 Apr 2026

https://github.com/leosimoes/uerj-tcc-analisador-dados

Trabalho de conclusão de curso (TCC) em Engenharia de Computação. Aplicativo Web para preparação e análise de dados, criação de gráficos e modelos de regressão linear e logistica.

computer-engineer data-analysis data-science data-visualization linear-logistic linear-regression python streamlit

Last synced: 24 Apr 2026

https://github.com/flyingfathead/neurograph-framework

A versatile tool for visualizing entropy loss in TensorFlow-based neural network training, providing insightful scatter plots with annotations.

data-analysis data-analysis-python data-visualization entropy graph graphs neural-network neural-networks neural-networks-visualization nn python python3 tensorflow tensorflow2 training visualization visualization-tools

Last synced: 24 Apr 2026

https://github.com/tejas-130704/whatsapp-analyser

ChatMate is a web app that analyzes WhatsApp chats, providing insightful visualizations like word clouds, heatmaps, and activity timelines. It calculates total messages, words, media, links, and more, helping you understand chat patterns for groups or individuals with ease. Simply upload your chat file and get detailed reports instantly!

data-analysis data-visualization python streamlit web-application whatsapp-analysis

Last synced: 26 Apr 2026

https://github.com/akarshankapoor7/tensorflow_tutorial

This is an easy and fast tutorial for tensorflow. In data science, TensorFlow is an open-source machine learning framework by Google. It's used for building and training machine learning and deep learning models.

data-analysis data-science deep-learning machine-learning tensorflow

Last synced: 27 Apr 2026

https://github.com/adrija-debnath/ideas-isi-data-science-internship

Topic of the Project - Predictive Maintenance Analysis, Data Science Internship at IDEAS - Institute of Data Engineering, Analytics and Science Foundation Technology Innovation Hub at Indian Statistical Institute, Kolkata.

data-analysis data-science predictive-analytics predictive-maintenance streamlit

Last synced: 27 Apr 2026

https://github.com/mrsamsonn/monolithic-polylithic-crystal-segmentation

A grid segmentation algorithm for clustering crystal structures using diffraction patterns. Useful in material science and nanotechnology, this code enables detailed analysis of crystals for research and industrial applications.

clustering crystal-structure crystallography data-analysis diffraction-patterns grid-segmentation image-processing k-means machine-learning matertial-science nanotechnology python research-project research-tools scientific-computing

Last synced: 28 Apr 2026

https://github.com/mgobeaalcoba/pandas_y_numpy

Trabajos realizados con estas librerías de Python para manejo de datos.

data-analysis data-science dataset numpy pandas python

Last synced: 28 Apr 2026

https://github.com/pheithar/socialdata_madridcentral

Social data and visualization course at DTU - 2022. Effectiveness of Madrid Central

data-analysis data-visualization jupyer-notebook madrid python

Last synced: 28 Apr 2026

https://github.com/adnanrahin/apache-spark-complete-reference

This repository reflects on all the necessary steps to take before jump in into Big Data.

big-data data-analysis data-science kaggle-dataset machine-learning rdd scala spark

Last synced: 29 Apr 2026

https://github.com/jhrcook/wagenmaker-data-analysis

Analysis of Registered Replication Report: Strack, Martin, & Stepper (1988) by Wagenmaker et al.

data-analysis r r-project statistics

Last synced: 08 Jun 2026

https://github.com/arielle0222/data_analysis

📊 Data analysis projects for autonomous driving and smart mobility engineering using Python and SQL.

autonomous-driving composite data-analysis electric-vehicles environmental-data python visualizatoin

Last synced: 30 Apr 2026

https://github.com/alcestide/scianalytics

Playground for Data Analysis and Visualization for Research and Scientifical Purposes with Pandas and Plotly.

csv data-analysis data-science data-visualization pandas plotly python science-research statistics

Last synced: 30 Apr 2026

https://github.com/nmsby/pca-machine-learning-lab

Principal Component Analysis (PCA) implementation and analysis lab for Machine Learning. Features manual PCA implementation, scikit-learn applications, data compression, and feature extraction with detailed visualizations.

data-analysis dimensionality-reduction jupyter-notebook machine-learning numpy pca python scikit-learn visualization

Last synced: 01 May 2026

https://github.com/lisa-ho/breadit

Respository for scraping and analysing data from the Reddit/Sourdough community to explore lockdown baking trends.

data-analysis data-viz nltk python reddit-api sentiment-analysis web-scraping

Last synced: 01 May 2026

https://github.com/nafisalawalidris/data-analysis-with-python

This repo features Jupyter Notebook labs for learning data analysis with Python. Explore data acquisition, wrangling, visualization, modeling, and evaluation. Enhance your skills in Python data analysis.

data-acquisition data-analysis data-science data-wrangling exploratory-data-analysis feature-engineering machine-learning model-development model-evaluation-and-refinement pandas

Last synced: 02 May 2026

https://github.com/emso-exe/reclamacoes_de_consumidores_com_empresa_de_telecomunicacoes

Projeto de análise de reclamações de consumidores com empresa de telecomunicações no 1º semestre de 2021 com base nos dados do site consumidor.gov.br.

analise-de-dados ciencia-de-dados data-analysis data-science datascience python python-3 python3

Last synced: 02 May 2026

https://github.com/seankwarren/water-quality-analysis

An examination of water quality in the Atlanta watershed with a focus on identifying neglected areas and potential strategies for improving water quality monitoring

analytics data-analysis jupyter-notebook python

Last synced: 03 May 2026

https://github.com/fybex/chatgpt-conversations-analysis

Analysis of 89,000 ChatGPT conversations to understand interaction patterns and response behaviors.

chatgpt conversation-analysis data-analysis data-visualization language-analysis prompt-patterns sentiment-analysis

Last synced: 02 May 2026

https://github.com/ferrangarciarovira/premier-league-betting-analysis

Comprehensive Python analysis of Premier League betting market inefficiencies (2005–2024). Evaluates bookmaker biases, betting strategies, and market efficiency using statistical methods and Monte Carlo simulations.

betting-strategies bias-detection data-analysis market-efficiency monte-carlo-simulation premier-league python sports-analytics

Last synced: 03 May 2026

https://github.com/cs-joy/pandasv2.0.3

learn data analysis with pandas

data-analysis pandas pandas-learning

Last synced: 03 May 2026

https://github.com/anandanraju/youtube-data-api-model

The YouTube Analytics API enables you to generate custom reports containing YouTube Analytics data. The API supports reports for channels and for content owners. Report fields are characterized as either dimensions or metrics

analytics data-analysis data-science metrics model python telemetry youtube youtube-api

Last synced: 03 May 2026

https://github.com/theairbend3r/mice-memory-response

Effect of memory on current response in mice using methods from computational neuroscience and machine learning.

computational-neuroscience data-analysis data-science machine-learning neuroscience python

Last synced: 09 Jun 2026

https://github.com/zeynepcol/data-analysis-visualization

Data visualization and interactive analytics - Olympics Dataset

data-analysis data-science data-visualization matplotlib pandas plotly python scipy seaborn streamlit

Last synced: 03 May 2026

https://github.com/0xjeremy/me-18-final

Data collection and Analysis tools for IMUs

data-analysis imu raspberry-pi

Last synced: 03 May 2026

https://github.com/cliffordnwanna/climate_data_analysis_and_visualization

The Climate Data Visualization and Analysis project showcases a comprehensive exploration of African climate data, with an emphasis on identifying patterns and trends in average temperatures across different countries and time periods. It serves as a practical demonstration of my data analysis, data visualization, and problem-solving skills.

data-analysis data-science mathplotlib pandas plotly visualization

Last synced: 04 May 2026

https://github.com/ruchit0807/heart_disease_prediction

An interactive ML-powered web app that predicts the risk of heart disease based on clinical inputs like age, chest pain, cholesterol, ECG, and more. Built using Python, Streamlit, and scikit-learn, it offers early risk assessment in a simple and accessible way—just enter your health metrics and get instant feedback.

data-analysis data-science knn-regression pandas streamlit

Last synced: 04 May 2026

https://github.com/scarblase/sales_insights

A data-driven analysis of 15,000 sales records using Python, Pandas, and visualizations to uncover trends, optimize strategies, and enhance business performance. 🚀📊

data-analysis data-visualization dataset matplotlib-pyplot pandas python3 sales-analysis seaborn

Last synced: 05 May 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/githubuseraccountamazing/the-amari-project

a project in which I attempted to push some of the limits of stable-diffusion while taking some data along the way

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 05 May 2026

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results/summary, including the trends of flagged provinces, from the raw excel data file,

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/amirhosseinhonardoust/customer-sentiment-intelligence-platform

An enterprise-grade NLP + Streamlit + SQL platform for analyzing customer feedback. Performs automated sentiment detection, stores labeled reviews in SQLite, and delivers real-time dashboards with probability insights to support business, marketing, and product optimization decisions.

community-project cost-of-living dashboard data-analysis data-visualization economic-analysis inflation-tracking local-data open-data pandas price-tracker public-insight python sqlite streamlit

Last synced: 06 May 2026

https://github.com/backdoorali/insider-threat-detection-project

Personal data analysis project combining insider threat detection, cybersecurity, and exploratory data analytics. Built for portfolio showcase and practical skills demonstration.

cybersecurity data-analysis data-analysis-excel data-analysis-project data-analyst data-analytics data-visualization eda excel insider-threat jupyter-lab jupyter-notebook matplotlib numbers pandas portfolio-project python python3 threat-detection threat-intelligence

Last synced: 07 May 2026

https://github.com/md-emon-hasan/data-science

Data science tutorials, including data preprocessing, analysis, visualization, project deployment, machine learning and deep learning algorithms.

artificial-intelligence data-analysis data-engineering data-science deep-learning machine-learning-algorithms python

Last synced: 07 May 2026

https://github.com/geo-y20/loan-approval-automation-using-mongodb-and-pymongo

This project demonstrates the implementation of a loan approval system that utilizes MongoDB for distributed data storage and management, and PyMongo for database operations. The project aims to automate the assessment of loan eligibility using customer details from online applications.

crud-application data data-analysis data-science data-visualization deployment jupyter-notebook loan-default-prediction loan-prediction-analysis machine-learning machine-learning-algorithms matplotlib mongodb pymongo streamlit web

Last synced: 08 May 2026

https://github.com/shridhar1504/loan-clustering-datascience-project

This project uses Machine Learning to Cluster loan together based on their similarities. The project uses a dataset of loan application which includes information about the Loan amount and Balance. The project then use the clustering algorithm to group the loan together based on the similarities.

clustering-algorithm data-analysis data-science data-visualization datanalysis eda kmeans-clustering machine-learning python sql sql-server unsupervised-learning

Last synced: 08 May 2026

https://github.com/miroslav-reiter/kurz_jazyk_sql_analytici_datovi_vedci

Materiály ku kurzu Jazyk SQL 1 pre Analytikov a Dátových Vedcov

analysis analytics data data-analysis data-science database mysql reiter sql

Last synced: 08 May 2026

https://github.com/md-emon-hasan/data_analytics_project

Data analytics tasks and solutions, featuring hands-on exercises for data cleaning, visualization, and analysis using Python libraries.

cars-dataset census-data covid19-data data-analysis london-house-price police-data weather-data

Last synced: 08 May 2026

https://github.com/mariam-badr-mb/gtc-ml-project2-diabetes-prediction

This project is part of the GTC Machine Learning Program. It demonstrates the end-to-end ML workflow by building a predictive model for diabetes detection

classification-algorithm data-analysis data-visualization diabetes-prediction gridsearchcv hyperparameter-tuning machine-learning python

Last synced: 09 May 2026

https://github.com/gabrielmpinho/cs50-sql

Solutions and notes from CS50’s Introduction to Databases with SQL. Covers CRUD operations, data modeling, normalization, joins, views, indexes, and connecting SQL with Python and Java. Begins with SQLite for portability and introduces PostgreSQL and MySQL for scalability.

data-analysis data-structures data-visualization database databases javascript python sql

Last synced: 10 May 2026

https://github.com/chayandatta/got_script_manipulation

Game of Thrones Script - String & file manipulation

data-analysis data-science pandas python3

Last synced: 11 May 2026

https://github.com/easycris-software/easycris

Professional statistical analysis and RNA-seq for researchers — no coding required

anova bioinformatics data-analysis desktop-app genomics pharmacology research-tools rna-seq statistics tauri

Last synced: 11 May 2026

https://github.com/zpreisler/modules

Python libraries and modules for processing simulation outputs

data-analysis python scripts tensorflow

Last synced: 13 May 2026

https://github.com/soufianboukir/ecom-analytics-platform

End-to-end data science project on an Amazon sales dataset, including data preprocessing, analysis, modeling, and a Streamlit dashboard for insights and decision-making.

data-analysis data-science data-visualization data-visualization-dashboard forecasting-models timeseries

Last synced: 14 Jun 2026

https://github.com/ensinho/data-analysis

My repository for data analysis studys in Python.

csv data-analysis graphs python python-documentation

Last synced: 15 Jun 2026