An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/pinedah/escom_development-of-applications-for-data-analysis

This repository is a personal collection of programs, exercises, and notes from the Development of Applications for Data Analysis course at Instituto Politécnico Nacional (IPN). As part of the Bachelor's in Data Science, the course focuses on developing practical skills in Python for data analysis.

data-analysis data-science data-visualization jupyter-notebook python python-data-analysis

Last synced: 20 Jan 2026

https://github.com/gappeah/british-airways-analysis

This project focuses on analyzing and visualising travel data from British Airways using Tableau. The goal is to extract insights and present them in an interactive and visually appealing manner.

data data-analysis data-visualization tableau

Last synced: 11 Jun 2025

https://github.com/tbep-tech/red-tide-twitter

Supplementary materials to accompany Skripnikov et al. red tide Twitter analysis

ccmp-li1 ccmp-wq3 data-analysis open-science tampa-bay tberf water-quality

Last synced: 19 Feb 2026

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 17 May 2026

https://github.com/aliciagilmatute/analisis-multinivel-bayesiano

This study explores multilevel analysis from a Bayesian approach to evaluate the variability of mathematics performance across 10 schools.

bayesian-statistics cmdstanr data-analysis hierarchical-models multilevel-models rstats rstudio stan

Last synced: 30 Oct 2025

https://github.com/kushagrakumar04/traffic-accident-analysis

This project analyzes traffic accident data to identify patterns based on road conditions, weather, and time of day. Visual representations of accident hotspots and contributing factors are created to offer a comprehensive understanding of the dynamics involved. The insights from this analysis are intended to support targeted strategies to improve safety.

data-analysis matplotlib pandas visualization

Last synced: 15 May 2026

https://github.com/codingprivacy/feedback-portal-system

AI-based Feedback Portal System that collects periodic feedback from users via a human-friendly chatbot, analyses the responses with NLP and sentiment analysis, and visualizes the results on the portal website.

artificial-intelligence bokeh chatbot data-analysis flask mysql-database nlp portal python sentiment-analysis visualization website

Last synced: 19 Sep 2025

https://github.com/ajwad-shaikh/sristi-sanshodh-collect

SRISTI Sanshodh Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments. Contribute and make the world a better place! ✨📋✨ https://docs.opendatakit.org/collect-…

collect data-analysis data-collection javarosa odk opendatakit

Last synced: 04 Apr 2025

https://github.com/jcaperella29/financial-data-scraper

Financial Data Scraper is a Python-based web scraping tool using Selenium to extract financial data from Stock Analysis. It scrapes Income Statement, Balance Sheet, Cash Flow, and Ratios for multiple companies and saves them as CSV files.

automation data-analysis finance financial-statements investment python selenium stock-market web-scraping

Last synced: 28 Jul 2025

https://github.com/kumaranand05/suicide-rate-analysis

Analysis of WHO mortality data, with visualization in Power BI

analytics data-analysis data-visualization mortality-rates powerbi python suicide-dataset suicide-rate

Last synced: 04 May 2026

https://github.com/samruddhi3012/shopping-habits-customer-behavior-analysis

Hello there! This repo contains a Python project based on e-commerce customer behavior analysis.

customer-segmentation customerbehavior data-analysis ecommerce python

Last synced: 29 Mar 2025

https://github.com/andystmc/nextflownyc

Developed a machine learning model (Bidirectional LSTM) to forecast NYC traffic volumes using 10 years of automated traffic count data. Achieved strong predictive accuracy, demonstrating the power of deep learning for urban traffic analysis.

data-analysis data-cleaning data-science data-visualization exploratory-data-analysis feature-engineering hyperparameter-tuning jupyter-notebook lstm-neural-networks machine-learning numpy pandas predictive-modeling python3 scikit-learn tensorflow-keras traffic-flow-forecasting

Last synced: 07 Apr 2026

https://github.com/p2-718na/alice-simulation

Code for my Lab-2 course.

cern-root data-analysis

Last synced: 13 Mar 2025

https://github.com/akash1070/data-science-virtual-internship-by-accenture

Data merging and data cleaning in Python, as well as data visualisation with a dashboard in Tableau.

data-analysis data-cleaning data-science python3 tableau visualization

Last synced: 15 May 2026

https://github.com/unrndm/dataanalysis

Artifacts and solutions for homework in the "Data Analysis" course of the HSE master's program, 2023-2024

2023-2024 data-analysis hse

Last synced: 27 Mar 2025

https://github.com/nafisalawalidris/dr.-semmelweis-and-the-discovery-of-handwashing

Uncover the revolutionary impact of handwashing on mortality rates in healthcare. Explore the story of Dr. Semmelweis and his groundbreaking findings.

data data-analysis handwashing healthcare-analysis medical-breakthrough mortality-rates

Last synced: 13 Jul 2025

https://github.com/thecoderpinar/customer-segmentation-clv-analysis

Optimize marketing strategies and enhance decision-making. Explore customer data, segment behavior, calculate CLV, analyze demographics, and visualize insights. 🚀

clv-analysis customer-segmentation data-analysis data-science data-visualization jupyter-notebook machine-learning marketing-strategy python

Last synced: 03 Apr 2025

https://github.com/swarchal/morar

Processing phenotypic screening data

biology data data-analysis drug-discovery hts phenotypic

Last synced: 19 Jun 2025

https://github.com/nysportsfan/Gun-Violence-in-the-US

This repository contains all the relevant files for my first capstone project as part of the Springboard Data Science Career Track.

data-analysis data-science data-visualization machine-learning python3 statistics

Last synced: 10 May 2025

https://github.com/sufiyanahmed4566/sql-musicmaven

This Music Store Database Project showcases SQL skills through comprehensive database design, query optimization, and data analysis. Includes ER diagram, database file, query questions (Easy, Medium, Hard), answered queries, and CSV table data. Ideal for recruiters seeking skilled SQL developers for music store management and data analysis.

data-analysis database insights mysql-database oracle-database relational-databases sql

Last synced: 18 May 2026

https://github.com/pavelgrigoryevds/olist-deep-dive

🌊 Deep Sales Analysis of Olist E-Commerce: EDA | Time Series | Viz | RFM | NLP | Geospatial | Segmentation & Actionable Business Recommendations.

business-recommendations clusterization data-analysis data-analytics data-science deep-analysis e-commerce eda feature-engineering geospatial jupyter-notebook nlp pandas plotly preprocessing python rfm statistics time-series visualization

Last synced: 07 May 2026

https://github.com/numbersprotocol/dyda

Dynamic data pipeline framework

ai artificial-neural-networks data-analysis data-science

Last synced: 07 Nov 2025

https://github.com/archived-blueprints/amazonathena-blueprints

Simplified blueprints for building data pipelines with Amazon Athena.

amazon-athena athena cli data-analysis data-engineering data-science elt etl

Last synced: 29 Jul 2025

https://github.com/mijisu0103/ukhsa-dashboard-project

Simple dashboard that downloads and displays the data about infectious diseases (Influenza, Rhinovirus and COVID-19) from the UK Health Security Agency (UKHSA) dashboard.

data-analysis data-visualisation ipywidgets python voila-dashboard

Last synced: 17 Jun 2025

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 21 Sep 2025

https://github.com/atharvbyadav/expensemate

A simple, lightweight personal finance tracker built with Streamlit and SQLite. Log expenses, visualize spending habits, manage budgets, and download reports – all through an interactive web interface.

budgeting data-analysis data-visualization expense-tracker finance-app open-source pandas personal-finance plotly python sqlite streamlit streamlit-webapp

Last synced: 28 Apr 2026

https://github.com/hemanthkumarsunkari27/pmay_analysis_project

Built for the 1st AI for Good Hackathon by Snowflake, this project uses data analytics and AI to explore housing and sanitation trends in India under PMAY. Using Snowflake and Streamlit, it provides interactive insights into regional disparities, helping guide sustainable infrastructure development.

data-analysis data-visualization pmay-analysis sanitation-coverage snowflake-integration streamlit-dashboard sustainable-development

Last synced: 26 Mar 2025

https://github.com/tynoee/covid19_data_analysis

This is an analysis of a COVID-19 dataset using multiple SQL queries. The dataset includes information on COVID-19 cases such as confirmed cases, deaths, and recoveries, segmented by geographical location and time period.

data-analysis excel sql sqlserver-2019 tableau tableau-public

Last synced: 16 Feb 2026

https://github.com/vkbo/osirisanalysis

Matlab toolbox for analysing simulation results from Osiris 3

data-analysis matlab matlab-gui physics-simulation

Last synced: 10 May 2025

https://github.com/makosai/covid19datachart

A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.

chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets

Last synced: 23 Feb 2026

https://github.com/gad-dimnt-cptec/scanplot

A simple plotting system for SCANTEC

data-analysis jupyter-notebook pandas python scantec

Last synced: 17 Jan 2026

https://github.com/patilni3/seaborn-in-depth

Python's Seaborn Library for Data Analysis, Machine Learning, Data Science and many more...

data-analysis data-reporting data-representation data-science data-visualization plots-in-python powerbi seaborn sns

Last synced: 03 Apr 2025

https://github.com/patilni3/numpy-in-depth

Python's NumPy Library for Data Analysis, Machine Learning, Data Science and many more...

data-analysis data-engineering data-science machine-learning numpy pandas

Last synced: 10 May 2026

https://github.com/docuvesta/la-prairie-luxury-skincare-makeup-analysis

Web scraping La Prairie skincare regional websites for brand and product insights 🛍️

cosmetics data-analysis data-analytics data-visualization jupyter-notebook luxury python science skincare

Last synced: 19 Apr 2026

https://github.com/nishchal-kansara/loan_eligibility_prediction

This project aims to create a robust machine learning model that accurately predicts an applicant's eligibility for a loan based on various features such as income, credit history, and marital status.

data-analysis data-cleaning data-science data-visualization datascience dataset loan-eligibility

Last synced: 23 Jun 2026

https://github.com/gappeah/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 31 Jul 2025

https://github.com/quantumudit/sales-statistical-analysis

This project focuses on a statistical analysis (using SQL queries) of key metrics that impact the overall sales of a fictitious store.

data-analysis postgresql sales-analysis sql statistics

Last synced: 16 May 2026

https://github.com/thecoderpinar/telecommunication-customer-churn-analysis-and-prediction

📊 This project focuses on customer churn analysis and prediction in the telecommunications sector. Using data analysis, modeling, and predictive techniques, it aims to understand and mitigate customer loss by developing strategies.

churn churn-prediction classification customer data-analysis data-science deep-learning machine-learning neural-network telecom

Last synced: 07 Aug 2025

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 31 Jul 2025

https://github.com/michellepellon/jobx

A modern, powerful job scraper for LinkedIn, Indeed and beyond.

compensation data data-analysis indeed indeed-scraping jobs jobsearch linkedin linkedin-scraper

Last synced: 17 Jan 2026

https://github.com/banyc/dfsql

SQL REPL/lib for Data Frames

cli csv data-analysis jsonl ndjson repl sql

Last synced: 31 Jul 2025

https://github.com/tim-hub/python-course

A new Python Course, a new trial to offer MOOC style learning resources and content for python learners

data-analysis learning python

Last synced: 17 Mar 2025

https://github.com/myself-aas/quantium_data_analytics_forage

This project analyzes retail customer chip purchasing behavior using Python, focusing on customer segmentation and key spending drivers to provide data-driven insights for strategic category management recommendations.

data-analysis data-engineering data-science data-visualization feature-engineering forage internship-project matplotlib-pyplot numpy-library pandas-dataframe pearson-correlation python quantium-virtual-experience scipy-stats seaborn

Last synced: 31 Jul 2025

https://github.com/vitia-fritelle/ipynb_converter

Jupyter notebook to Python file converter

data-analysis data-science jupyter-notebook python

Last synced: 28 Apr 2026

https://github.com/mgobeaalcoba/analisis_con_r

Analysis work done with the R language

data-analysis data-science dataset r r-package r-programming r-studio

Last synced: 21 May 2026

https://github.com/ibensusan/wine-properties-assessment

Wine Properties Assessment using Microsoft Excel

data-analysis data-visualization excel

Last synced: 20 Mar 2026

https://github.com/csoren66/diabetics_prediction

Predicting whether a patient has diabetes based on the features provided to a machine learning model.

data-analysis machine-learning python svm

Last synced: 03 Mar 2025

https://github.com/ncasuk/decades-pp

Post processing library for the data from the FAAM aircraft

atmospheric-sciences data-analysis data-processing meteorology science

Last synced: 07 Mar 2026

https://github.com/mynenik/xyplot-win32

XYPLOT Plotting and Data Analysis Program for 32-bit Windows

cpp data-analysis data-manipulation data-visualization forth mfc windows-app

Last synced: 18 Mar 2025

https://github.com/gmbeddard/ee152-realtime_embedded_systems-finalproject

An STM32-based implementation of the Pan-Tompkins algorithm for real-time QRS detection. Includes robust debugging tools, heart rate monitoring, and live ECG signal support via a Python graphing script.

cpp-programming data-analysis ecg embedded-c freertos stm32

Last synced: 21 Apr 2026

https://github.com/foxriver76/iobroker.intelliflow

Stream data analysis adapter for ioBroker.

data-analysis iobroker machine-learning streaming-data

Last synced: 04 Apr 2025

https://github.com/frankelavsky/political-polarization-challenge

I had 8 hours to build a solution to the research claim that "politics have become more divided in the past 50 years." You can navigate views of congressional voting patterns using arrows. I used d3, require, MVC pattern, and vanilla js. Pre-processed the data in node.js. Data is from DW-NOMINATE: ftp://k7moa.com/junkord/HANDSL01114A20_STAND_ALONE_30.DAT

client-side css d3 d3js data-analysis data-visualization frontend frontend-app html interactive interactive-visualizations javascript modular nodejs political-science politics requirejs research single-page-app visualization

Last synced: 06 Apr 2026

https://github.com/victorherdz10/rainsense-iot

IoT system for early rain detection with Arduino. Monitors weather conditions using DHT22/BMP280 sensors and multivariable prediction algorithms for real-time alerts. Processes data and sends it via HTTP/JSON.

arduino bmp280 data-analysis dht22 embedded-systems iot platformio rain-detection real-time sensor-network weather-prediction weather-station

Last synced: 17 Apr 2026

https://github.com/yard1/linearordering

An R package. Provides various methods of linear ordering of data. Supports weights and positive/negative impacts.

data-analysis data-analysis-in-r data-analysis-r data-science r

Last synced: 21 May 2026

https://github.com/sharathsphd/coffee_causality

Data-driven analysis of coffee shop sales using correlation, regression, and causal inference. A Jupyter Book project exploring foot traffic, weather patterns, and business analytics.

business-analytics causal-inference correlation data-analysis foot-traffic forecasting github-pages jupyter-notebook machine-learning open-source python regression retail-analytics statistics storytelling time-series visualization weather-analysis

Last synced: 18 May 2026

https://github.com/phillbertnevinemmanuel/coviddeathvaceda

An exploratory data analysis based on a dataset of COVID statistics from 2020-2022

data-analysis database sql

Last synced: 09 Apr 2025

https://github.com/patilni3/matplotlib-in-depth

Python's Matplotlib Library for Data Analysis, Machine Learning, Data Science and many more...

data-analysis data-representation data-science data-visualization matplotlib matplotlib-pyplot plots-in-python powerbi seaborn

Last synced: 03 Apr 2025

https://github.com/vivienneforreal/covid4eu-sorbonne

Economy: “Analysis of Labor Market decisions of men and women during the COVID-19 pandemic in the 4EU+ countries”.

covid-19 data-analysis data-science data-visualization pandas

Last synced: 20 Mar 2025

https://github.com/fbraza/paris_airbnb

Analysis of Paris AirBnB data using R and Shiny

analysis data data-analysis paris-airbnb r shiny

Last synced: 21 Mar 2025

https://github.com/hafeez-urrehman/mental-health-analyzer

Mental-Health-Analyzer is an AI-Based project for predicting mental health disorders such as stress, anxiety, depression, and loneliness. By applying machine learning techniques, this project analyzes user inputs and behavioral data to provide accurate predictions, aiming to support mental well-being and early intervention.

data-analysis data-science early-diagnonosis machine-learning mental-health mental-wellbeing predictive-modeling python

Last synced: 17 May 2026

https://github.com/diacod-i/bournetokill

Analysis on inhibition assay data for Monoamine Oxidase protein family

data-analysis data-science data-visualization python3

Last synced: 21 May 2026

https://github.com/cosmoduende/r-uber-trips-analyisis

Explore your activity on Uber with R: How to analyze and visualize your personal data history. Find out how you consume the Uber App using a copy of your data.

analisis-de-data data-analysis data-analytics data-science data-visualisation data-visualization data-viz eda flexdashboard ggmap ggplot2 mobility-as-a-service qmplot r-language r-programming ridesharing uber uber-data visualizacion-de-datos

Last synced: 14 Jul 2025

https://github.com/shubhamgoyal575/diwali-sankranti-promotion-sales

This Power BI dashboard analyzes sales performance during Diwali and Sankranti festivals. It provides insights into revenue trends, top-selling products, regional sales distribution, and customer purchasing behavior to help optimize festive season sales strategies. 🚀

buisness-intelligence dashboard data-analysis data-visualization diwali-sankranti-sales-analysis excel fast-moving-consumers-goods fmcg microsoft-power-bi mysql power-query powerbi revenue-insights sales-dashboard sales-insights sql

Last synced: 02 Mar 2026

https://github.com/athul64/exploratory-data-analysis

To preprocess and analyze the given employee dataset, present the findings graphically, and derive meaningful insights to help better understand the company’s workforce.

colab-notebook data-analysis data-visualization matplotlib numpy pandas python seaborn statistical-analysis

Last synced: 25 Feb 2026

https://github.com/grypesc/graduateadmissions

Visualization, analysis and predictive modeling of a Kaggle graduate admissions dataset.

data-analysis data-mining data-science data-visualization dataset

Last synced: 08 Jul 2025

https://github.com/markoshb/machine-learning-subject

Implementation of multiclass classification problems in R

classification-model data-analysis r

Last synced: 14 Mar 2025

https://github.com/dogoncouch/dhcptranslate

Parses ISC DHCP server config, performs DNS resolution as needed, and outputs lease data in CSV format.

configuration csv-format data-analysis isc-dhcp isc-dhcp-server migration-tool

Last synced: 20 Mar 2025

https://github.com/onome-joseph/ml-fraud-dectection

This project is designed to identify fraudulent transactions with high accuracy.

classfication-model data-analysis data-science machine-learning problem-solving

Last synced: 06 Apr 2025

https://github.com/v6ntage/sql-sales_data-analytics-project

This repository contains SQL scripts demonstrating analytical techniques.

analytics business-analytics data data-analysis database query sql sql-server

Last synced: 12 Apr 2026

https://github.com/jahnavigupta06/zepto-delivery-customer-analytics

Real-time SQL + Power BI Analytics Project replicating Zepto's customer & delivery insights.

business-intelligence churn-analysis customer-segmentation data-analysis data-visualization powerbi sql-server

Last synced: 02 Aug 2025

https://github.com/ituvtu/Data-Science-AB-Testing

This project focuses on conducting A/B testing to evaluate the effectiveness of two marketing campaigns. Using statistical analysis and hypothesis testing, we determine which campaign is more effective in improving conversion rates.

a-b-testing data-analysis data-analysis-python data-mining ipynb jupyter jupyter-notebook python

Last synced: 26 Sep 2025
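
As an illustration of the kind of hypothesis test the entry above describes, here is a minimal sketch of a two-proportion z-test for comparing conversion rates. It is not taken from the repository; the function name, parameter names, and sample counts are hypothetical, and the repository may use a different test entirely.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both campaigns convert at the same rate.

    conv_a / conv_b are conversion counts, n_a / n_b are visitor counts
    (hypothetical names used only for this illustration).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of equal rates.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: campaign A converts 100 of 1000, campaign B 150 of 1000.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.3f}, p = {p:.5f}")
```

A small p-value suggests the difference in conversion rates is unlikely under the null hypothesis; the normal approximation assumes reasonably large samples in both groups.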

https://github.com/rdrahul123/sales-dashboard

The Sales Analysis Dashboard was developed to provide insights into sales, profits, and product performance across different categories, timeframes, and geographic locations. By leveraging Power BI, the project aimed to transform raw data into actionable visualizations, facilitating better decision-making for stakeholders.

data-analysis data-science data-visualization dax powerbi

Last synced: 06 Jan 2026

https://github.com/rayyan9477/multiple-disease-prediction-system

This repository contains a Multiple Disease Prediction System leveraging machine learning techniques for accurate predictions. It utilizes Python, Pandas, Scikit-learn, and Flask for data preprocessing, model building, and web deployment. Explore the project and connect on LinkedIn for collaborations.

data-analysis data-science machine-learning python streamlit

Last synced: 10 Apr 2026

https://github.com/shriram-vibhute/data-analysis

This repository offers a comprehensive collection of data analysis techniques using NumPy, Pandas, Matplotlib, and Seaborn.

data-aggregation data-analysis data-visualization data-wrangling matplotlib numpy pandas seaborn

Last synced: 02 Aug 2025

https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform

This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.

bigquery data data-analysis data-modeling live-streaming sql

Last synced: 23 Jun 2025
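
The "top 5 users per region" goal in the entry above can be sketched in plain Python as a stand-in for the SQL transformations the repository describes. This is an illustration only: the row layout `(region, user, minutes)`, the function name, and the sample data are assumptions, not the repository's schema.

```python
from collections import defaultdict
import heapq


def top_users_per_region(rows, n=5):
    """Return {region: [(user, total_minutes), ...]} with the n largest totals per region.

    rows is an iterable of (region, user, minutes) tuples (hypothetical layout).
    """
    # Step 1: aggregate watch time per (region, user) pair.
    totals = defaultdict(float)
    for region, user, minutes in rows:
        totals[(region, user)] += minutes

    # Step 2: group the per-user totals by region.
    by_region = defaultdict(list)
    for (region, user), total in totals.items():
        by_region[region].append((user, total))

    # Step 3: keep the n users with the highest totals in each region.
    return {
        region: heapq.nlargest(n, users, key=lambda item: item[1])
        for region, users in by_region.items()
    }


if __name__ == "__main__":
    # Made-up sample rows.
    sample = [
        ("EU", "alice", 10),
        ("EU", "bob", 30),
        ("EU", "alice", 25),
        ("US", "carol", 5),
    ]
    print(top_users_per_region(sample, n=1))
```

In SQL the same result is commonly obtained by aggregating per user and then ranking within each region with a window function; the Python version mirrors those two steps.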