An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/priyanshubiswas-tech/data-analysis-with-python

This repository showcases Python projects completed for a Data Analysis with Python certification, demonstrating skills in data manipulation, visualization, and statistical analysis using libraries like NumPy, Pandas, Matplotlib, Seaborn, and SciPy.

data-analysis demographic-data-analyzer mean-variance-standard-deviation-calculator medical-data-visualizer page-view-time-series-visualizer python scipy-stats sea-level-predictor seaborn

Last synced: 07 May 2025

https://github.com/priyanshubiswas-tech/airflow_dbt_superset_project

End-to-end ITSM data engineering pipeline using PostgreSQL, DBT, Airflow, and Superset. Covers ingestion, cleaning, transformation, orchestration, and visualization, validated across Docker Toolbox and Docker Desktop environments.

apache-airflow apache-superset dags data-analysis dbt docker etl etl-automation etl-pipeline postgresql

Last synced: 07 May 2025

https://github.com/draym/swmanager

Web app that helps with your daily raids in SpacesWars through game statistics and data management

dashboard-application data-analysis data-visualization game-data game-utility

Last synced: 19 Jun 2025

https://github.com/drcbeatz/aynm-data

Python scripts for data cleaning and processing for AYNM (Pandas/NumPy/Selenium/AWS Textract)

automation aws-textract csv data-analysis data-cleaning ipynb numpy ocr pandas python reverb selenium shopify webscraping xml

Last synced: 07 Mar 2026

https://github.com/luizassimoes/bachelor-thesis

Bachelor's thesis written to complete a degree in Electrical Engineering.

5g-networks data-analysis data-visualization python

Last synced: 30 Apr 2026

https://github.com/kushagrakumar04/traffic-accident-analysis

This project analyzes traffic accident data to identify patterns based on road conditions, weather, and time of day. Visual representations of accident hotspots and contributing factors are created to offer a comprehensive understanding of the dynamics involved. The insights from this analysis aim to support targeted strategies to improve safety.

data-analysis matplotlib pandas visualization

Last synced: 15 May 2026

https://github.com/navdeep-g/data-quality-checker

A comprehensive Python tool for data analysis and data quality

data-analysis data-science pandas python

Last synced: 16 May 2026

https://github.com/abhaysingh71/laptop-price-predictor

Laptop Price Predictor is a Dockerized machine learning project that predicts laptop prices based on specs using ensemble models like Random Forest, XGBoost, and Gradient Boosting. It includes a Streamlit UI and full Docker support.

data-analysis data-science deployment docker docker-image ensemble-learning laptop-price-prediction machine-learning-algorithms streamlit xgboost

Last synced: 05 May 2026

https://github.com/atxtechbro/glassdoorwebscraping

"Scraping Glassdoor: A GraphQL Journey" is an advanced data harvesting tool leveraging GraphQL and an API-first strategy to extract and analyze Glassdoor data for business intelligence and predictive analytics.

api-first-approach business-intelligence data-analysis data-harvesting data-mining data-science glassdoor-scraper graphql html machine-learning performance-optimization predictive-analytics python requests-library-python scaleability scraper system-design web-scraping

Last synced: 16 May 2026

https://github.com/thanaphongk37/data-science-and-data-analyst-project

Portfolio of data analysis, data science, and data engineering projects built using Azure services, SQL, and Python.

apache-superset azure-storage dashboards data-analysis data-science databricks dataengineering datafactory datapipeline powerbi python sisense sql sql-server visualization

Last synced: 11 May 2026

https://github.com/valeriopagliarino/esp2-2021-unito-public

Physics laboratory 2 course (electromagnetism, optics and modern physics)

data-analysis electronics optics physics

Last synced: 22 Jun 2025

https://github.com/oguzgn/budget-checker-for-campaign-budget-allocation

This project focuses on modeling campaign performance data for Looker, helping determine which campaigns to scale up or cut back. It aggregates metrics over the last 7 and 30 days, providing actionable insights for budget optimization and performance improvement.

budget-allocation budget-controller budget-management calculated-fields campaign-analytics data-analysis data-modeling looker-studio sql

Last synced: 17 Feb 2026

https://github.com/nafisalawalidris/dr.-semmelweis-and-the-discovery-of-handwashing

Uncover the revolutionary impact of handwashing on mortality rates in healthcare. Explore the story of Dr. Semmelweis and his groundbreaking findings.

data data-analysis handwashing healthcare-analysis medical-breakthrough mortality-rates

Last synced: 13 Jul 2025

https://github.com/cosmoduende/r-uber-trips-analyisis

Explore your activity on Uber with R: How to analyze and visualize your personal data history. Find out how you consume the Uber App using a copy of your data.

analisis-de-data data-analysis data-analytics data-science data-visualisation data-visualization data-viz eda flexdashboard ggmap ggplot2 mobility-as-a-service qmplot r-language r-programming ridesharing uber uber-data visualizacion-de-datos

Last synced: 14 Jul 2025

https://github.com/deepanshkhurana/cloudsimplifier

Simple helper functions to fetch and read data in various formats stored in AWS S3 buckets. Most functions are essentially wrappers around cloudyR.

amazon aws cloudyr data-analysis data-fetching data-science package r rpackage s3

Last synced: 20 May 2026

https://github.com/ahmednurabdii/data-analytics-portfolio-superstore

My first portfolio project showcasing data cleaning, analysis, and visualization of Superstore sales data.

data-analysis data-visualization jupyter-notebook matplotlib numpy pandas portfolio-project python sales-analysis scipy seaborn superstore-dataset

Last synced: 07 Apr 2026

https://github.com/naruaika/eruo-data-studio

A powerful yet friendly ETL tool powered by Polars backend

data-analysis data-science desktop-app gnome-desktop gtk4 proof-of-concept python spreadsheet

Last synced: 18 Jul 2025

https://github.com/maheshthedev/twitter-analysis

Analysis on Various Topics with Twitter Data

data-analysis twitter-analysis

Last synced: 18 Jul 2025

https://github.com/vara-co/pandas-challenge

PyCitySchools - Analysis of the relationship between budget and academic performance in schools

budget-analysis data-analysis jupiter-notebook pandas-dataframe python school-performances

Last synced: 17 May 2026

https://github.com/aliciagilmatute/analisis-multinivel-bayesiano

This study explores multilevel analysis from a Bayesian approach to assess the variability of mathematics performance across 10 schools

bayesian-statistics cmdstanr data-analysis hierarchical-models multilevel-models rstats rstudio stan

Last synced: 30 Oct 2025

https://github.com/kumaranand05/suicide-rate-analysis

Analysis of WHO mortality data, with visualization in Power BI

analytics data-analysis data-visualization mortality-rates powerbi python suicide-dataset suicide-rate

Last synced: 04 May 2026

https://github.com/thecoderpinar/telecommunication-customer-churn-analysis-and-prediction

📊 This project focuses on customer churn analysis and prediction in the telecommunications sector. Using data analysis, modeling, and predictive techniques, it aims to understand and mitigate customer loss by developing strategies.

churn churn-prediction classification customer data-analysis data-science deep-learning machine-learning neural-network telecom

Last synced: 07 Aug 2025

https://github.com/msthamizh/airbnb_analysis

A Streamlit application for exploring and analyzing Airbnb listing data. Users can interactively visualize geospatial distributions of listings, analyze pricing trends, and explore availability patterns across different locations. It integrates MongoDB Atlas for data storage and Power BI for advanced insights.

data-analysis data-cleaning data-visualization json mongodb pandas-dataframe plotly powerbi python streamlit

Last synced: 11 Apr 2026

https://github.com/shridhar1504/foreign-exchange-rate-time-series-datascience-project

This project uses time series analysis to forecast the exchange rate between the euro and the US dollar. It applies a variety of statistical techniques, such as ARIMA, to model the data and forecast the exchange rate.

data-analysis data-preprocessing data-science data-transformation data-visualization eda exploratory-data-analysis foreign-exchange-rates machine-learning model-fitting predictive-modeling python3 time-series time-series-analysis

Last synced: 17 May 2026

https://github.com/al-ghaly/power-bi-dashboard

A dashboard for analyzing the job market for data specializations.

dashboard data-analysis powerbi

Last synced: 02 Feb 2026

https://github.com/khuyentran1401/sample_datapane_script

This repo shows how to use Datapane to create a simple script that ranks authors or publications by publishing frequency

data-analysis data-science datapane python

Last synced: 21 May 2026

https://github.com/JovaniPink/excel-powerbi

The folder of my work with Excel, VBA, and PowerBI for Data Analysis & Visualization.

data-analysis data-visualization dax excel excel-vba power-pivot power-query powerbi vba-macros

Last synced: 20 Jul 2025

https://github.com/denko5/sales-analysis

A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performance, and marketing effectiveness across multiple platforms.

africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql

Last synced: 24 Jan 2026

https://github.com/ghackenberg/kurs-datenanalyse

This repository contains material for my data analysis course. In this course we first introduce the concept of databases and SQL, before diving into OLAP and other data analysis tools.

data-analysis data-structures data-warehouse entity-relationship-diagram etl graph list olap relational-algebra relational-database sql tree

Last synced: 17 Feb 2026

https://github.com/priyanshubiswas-tech/ev-data-analysis-dashboard

An interactive dashboard analyzing EV trends, including total vehicles, BEV vs. PHEV breakdown, model popularity, state-wise distribution, and CAFV eligibility. Visualizes key insights for data-driven decisions in the EV industry. 📊

dashboard data data-analysis data-science data-visualization tableau tableau-public

Last synced: 17 Feb 2026

https://github.com/morphclue/godot-trend

R code and data for game engines on itch.io

data-analysis game-engines trends

Last synced: 05 Apr 2025

https://github.com/olekscode/covidanalysis

A setup for COVID-19 data analysis in Pharo

coronavirus covid-19 data-analysis pharo

Last synced: 05 Apr 2025

https://github.com/yash-kavaiya/ai-analytics

This is a Streamlit app that uses Pandas and AI to perform data analytics on uploaded CSV files.

data-analysis generative-ai pandas streamlit

Last synced: 20 Jul 2025

https://github.com/defrecord/value-alignment-toolkit

A comprehensive toolkit for implementing, analyzing, and validating AI value alignment based on Anthropic's 'Values in the Wild' research.

ai anthropic data-analysis ethics privacy python simulation value-alignment

Last synced: 20 Jul 2025

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 17 May 2026

https://github.com/nysportsfan/Gun-Violence-in-the-US

This repository contains all the relevant files for my first capstone project as part of the Springboard Data Science Career Track.

data-analysis data-science data-visualization machine-learning python3 statistics

Last synced: 10 May 2025

https://github.com/souravsuvarna/whatsapp-chat-analyzer-api

The WhatsApp Chat Analyzer API is a public API designed for frontend enthusiasts who are interested in building a WhatsApp Chat Data Visualizer project. Built on FastAPI, it offers a seamless and efficient way to process chat data and returns the processed results in JSON format.

api data-analysis data-science fastapi publicapi python

Last synced: 23 Feb 2025

https://github.com/vkbo/osirisanalysis

MATLAB toolbox for analysing simulation results from Osiris 3

data-analysis matlab matlab-gui physics-simulation

Last synced: 10 May 2025

https://github.com/ibensusan/wine-properties-assessment

Wine Properties Assessment using Microsoft Excel

data-analysis data-visualization excel

Last synced: 20 Mar 2026

https://github.com/ncasuk/decades-pp

Post-processing library for the data from the FAAM aircraft

atmospheric-sciences data-analysis data-processing meteorology science

Last synced: 07 Mar 2026

https://github.com/sharathsphd/coffee_causality

Data-driven analysis of coffee shop sales using correlation, regression, and causal inference. A Jupyter Book project exploring foot traffic, weather patterns, and business analytics.

business-analytics causal-inference correlation data-analysis foot-traffic forecasting github-pages jupyter-notebook machine-learning open-source python regression retail-analytics statistics storytelling time-series visualization weather-analysis

Last synced: 18 May 2026

https://github.com/fbraza/paris_airbnb

Analysis of Paris Airbnb data using R and Shiny

analysis data data-analysis paris-airbnb r shiny

Last synced: 21 Mar 2025

https://github.com/mariam-badr-mb/gtc-land-type-classification

This project develops a machine learning model to classify land cover types in Egypt using Sentinel-2 satellite imagery. The system detects categories such as agriculture, water bodies, urban areas, deserts, roads, and tree cover.

data-analysis data-visualization deep-neural-networks eda machine-learning model-architecture streamlit

Last synced: 12 Jun 2026

https://github.com/devandrenicolas/analise-de-vendas

This project is a comprehensive data analysis tool designed to analyze sales performance data. It includes modules for generating fake sales data, cleaning and preprocessing the data, and performing exploratory data analysis (EDA) with advanced visualizations.

data-analysis data-visualization faker-generator matplotlib pandas python

Last synced: 07 May 2026

https://github.com/adityakumarsingh01/customer-purchase-behaviour-analysis

A data analysis project exploring online consumer behavior and FOMO effects using EDA on survey data.

consumer-behavior data-analysis eda fomo online-shopping python survey-data

Last synced: 25 Apr 2026

https://github.com/ehopperdietzel/billionaires-analysis

Analysis of the number of billionaires per country. Inspired by the article "Russian Billionaires"

bootstrap data-analysis poisson-distribution prediction

Last synced: 18 May 2026

https://github.com/idhs-song/resume-matcher-agent-cn

🤖 Enhance your job applications with this AI-driven resume matcher that analyzes job descriptions to optimize your resume for better chances of success.

api-integration automation backend-development data-analysis data-visualization github-actions job-search machine-learning natural-language-processing open-source-tools python recommendation-system resume-matching user-interface web-app

Last synced: 18 May 2026

https://github.com/agustinmusanti/sqlchallenge-4

Challenge to create a SQL database for a streaming platform. Includes DDL, DML, and advanced queries.

data-analysis database mysql sql streaming

Last synced: 18 May 2026

https://github.com/danhenriquex/final-project-ia

Artificial Intelligence Project - Sentiment analysis of news that affects share prices.

data-analysis machine-learning supervised-learning

Last synced: 25 Jun 2025

https://github.com/aiswarya196/supply-chain-analytics-ai

End-to-End Supply Chain Analytics project using AI tools (n8n, Quadratic) to automate data ingestion, calculate KPIs, and generate business insights.

data-analysis n8n-automation postgres quadratics supabase supply-chain-analytics

Last synced: 18 May 2026

https://github.com/shubhamgoyal575/ecommerce-product-categorization

This project classifies e-commerce products into predefined categories using machine learning. It includes preprocessing steps like stopword removal, punctuation cleaning, and feature extraction. Models, including an LSTM, are implemented and evaluated for accuracy.

accuracy-score artificial-neural-networks confusion-matrix data-analysis data-cleaning data-preprocessing data-science data-visualization deep-learning exploratory-data-analysis hyperparameter-tuning logistic-regression long-short-term-memory machine-learning machine-learning-algorithms naive-bayes-algorithm natural-language-processing precision-score random-forest-classifier

Last synced: 30 Aug 2025

https://github.com/shubhamgoyal575/spam_detective

This project uses machine learning to classify messages as spam or ham based on text analysis. It includes data preprocessing, feature extraction (TF-IDF), and classification models like Logistic Regression and Naive Bayes for accurate spam detection. Built with Python and Scikit-Learn. 🚀

count-vectorizer data-analysis data-analytics data-cleaning data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis logistic-regression machine-learning machine-learning-algorithms naive-bayes natural-language-processing spam-detection tfidf-vectorizer

Last synced: 02 Jul 2025

https://github.com/dcs-training/interactive-analysis-reports-with-r-markdown.github.io

This workshop will help you create your own reproducible, customisable, and interactive analysis reports through R Markdown. By building on the basics of R, we will show you how to instantly prepare your results into a ready-made document (no more copying and pasting your results, and less human error). Go to the readme file

data-analysis data-visualisation data-wrangling r rmarkdown statistics

Last synced: 25 Jun 2025

https://github.com/haloapping/pisangijo

A collection of Python-based libraries and frameworks for data analysis, data science, machine learning, deep learning, and much more 🐍.

belajar data-analysis data-science deep-learning forecasting libraries machine-learning perkakas pustaka python3 recommender-system referensi tools

Last synced: 13 Jun 2026

https://github.com/vara-co/sql-challenge

EmployeeSQL "Data modeling, data engineering, and data analysis."

data-analysis data-engineering data-modeling databases employee-database erd erdiagram postgres postgresql schema sql tables

Last synced: 18 May 2026

https://github.com/richardwarepam16/rental_analysis_using_python_and_sql

Maximizing Rental Profits: Data-Driven Strategies for a Movie Rental Store

data-analysis data-analytics python3 rental-management sakila-db sqlite3

Last synced: 18 May 2026

https://github.com/djo/data-analysis

Data Analysis course notebooks in R

data-analysis r

Last synced: 29 Mar 2025

https://github.com/kiranmayi5/r-projects

This repository showcases R projects designed to tackle real-world problems through data-driven solutions.

data-analysis exploratory-data-analysis predictive-modeling r statistical-analysis

Last synced: 25 Jun 2025

https://github.com/vara-co/space-missions

Space Missions Over Time (1957-2022): Successes vs Failures, and Rocket Usage

data-analysis data-analysis-python history matplotlib pandas pandas-python space space-race spaceships team-project

Last synced: 18 May 2026

https://github.com/gurpreetkaurjethra/ai-data-visualization-agent

This Streamlit application creates an interactive Data Visualization Assistant that can understand Natural Language Queries and generate appropriate Visualizations using LLMs.

aiagents aichatbot aidevelopment artificial-intelligence data-analysis data-visualization generative-ai llms

Last synced: 25 Jun 2025

https://github.com/ajwad-shaikh/sristi-sanshodh-collect

SRISTI Sanshodh Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments. Contribute and make the world a better place! ✨📋✨ https://docs.opendatakit.org/collect-…

collect data-analysis data-collection javarosa odk opendatakit

Last synced: 04 Apr 2025

https://github.com/p2-718na/alice-simulation

Code for my Lab-2 course.

cern-root data-analysis

Last synced: 13 Mar 2025

https://github.com/sufiyanahmed4566/sql-musicmaven

This Music Store Database Project showcases SQL skills through comprehensive database design, query optimization, and data analysis. Includes ER diagram, database file, query questions (Easy, Medium, Hard), answered queries, and CSV table data. Ideal for recruiters seeking skilled SQL developers for music store management and data analysis.

data-analysis database insights mysql-database oracle-database relational-databases sql

Last synced: 18 May 2026

https://github.com/makosai/covid19datachart

A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.

chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets

Last synced: 23 Feb 2026

https://github.com/onome-joseph/ml-fraud-dectection

This project is designed to identify fraudulent transactions with high accuracy.

classfication-model data-analysis data-science machine-learning problem-solving

Last synced: 06 Apr 2025

https://github.com/steciuk/ium-recommendation-system

Evaluation and comparison of three different recommendation models for a simulated web shopping service.

data-analysis model-evaluation recomendation-system

Last synced: 29 Oct 2025

https://github.com/ebowwa/chatgpt-export-processor

🤖 Extract, analyze & search your ChatGPT conversations locally | Privacy-first tool for OpenAI ChatGPT data export processing | Python CLI with embeddings support

ai-tools chatgpt chatgpt-export chatgpt-tools cli conversation-analysis data-analysis data-extraction embeddings local-first nlp openai openai-api privacy python

Last synced: 19 May 2026

https://github.com/salman-khan-mohammed/predicting-the-intent-of-online-shoppers

This project aims to predict online shoppers' purchase intentions using browsing history and user data from e-commerce sites. By analyzing clickstream and session information, the goal is to create a machine learning model that accurately forecasts customers' likelihood of making a purchase.

cluster-analysis data-analysis data-pre eda outliers prediction

Last synced: 31 Oct 2025

https://github.com/adolbyb/data-science-python

An Introduction to Data Science and Data Visualization with the FAU Data Science and Machine Learning Club

data-analysis data-science data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/fatihilhan42/web_scraping_football_statistics_per_game_data-main

In this notebook I describe the process of scraping data from the web portal understat.com, which has a lot of statistical information about all games in the top 5 European football leagues.

data-analysis data-manipulation data-science data-scraping data-visualization jupyter-notebook python

Last synced: 19 May 2026

https://github.com/subhojit45/python3-iphones-x-flipkart-sales-analysis

Six simple questions, and the insights derived from them, on a dataset of iPhone sales on Flipkart.

data-analysis jupyter-notebook python3 visual-studio-code visualization

Last synced: 19 May 2026

https://github.com/geobatpo07/office-hours-bootcamp

Practical case studies and labs from the Akademi 2025 Data Science & AI Bootcamp office hours.

artificial-intelligence data-analysis data-science data-visualization database deep-learning learning learning-by-doing machine-learning statistics

Last synced: 07 Mar 2026

https://github.com/airdac/mva-bank_india

Multivariate Analysis of an Indian bank's dataset about loan paybacks in R. Team project from UPC's Master's Degree in Data Science

data-analysis data-science multivariate-analysis r upc

Last synced: 26 May 2026

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 04 Jan 2026

https://github.com/tiwarishubham635/uber-data-analysis-using-r

Analyzes the Uber Cab data using plots, heatmaps and dataframes

data-analysis data-visualization r

Last synced: 14 Apr 2025

https://github.com/aldomann/tropical-cyclones

Scripts to replicate the analyses and figures from "Scaling of tropical-cyclone dissipation" by Corral et al.

bachelor-thesis data-analysis hurricanes mathematical-statistics ocean studies

Last synced: 19 May 2026