An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/shsiddhant/memory.fm

A Python library, CLI tool, and web-based dashboard for exploring music listening history from Last.fm and Spotify.

analytics data-analysis data-visualization memories music

Last synced: 04 Apr 2026

https://github.com/dcs-training/intromachinelearning

This course provides an introduction to machine learning for those with beginner-level Python skills. Go to the readme file

data-analysis data-wrangling machine-learning python statistics

Last synced: 06 Mar 2026

https://github.com/vara-co/python-api-challenge

Weather and Perfect Vacationing Spots Worldwide, by using APIs

api apis data-analysis data-visualization hvplot jupyter-notebook matplotlib pandas vacation weather

Last synced: 05 May 2026

https://github.com/idaraabasiudoh/knn-customer-classification

Assigns telecommunications customers to groups to determine the service type each customer requires.

data-analysis jupyter-notebook machine-learning pyhton3 scikit-learn

Last synced: 07 May 2026

https://github.com/dangerousfish/uk-climate-trends-dashboard-metoffice

A data pipeline and Streamlit dashboard that aggregates, cleans and visualises historical UK Met Office station data - interactive charts, heatmaps and maps for temperature, rainfall and sunshine.

climate climate-analysis climate-change climate-data climate-science data-analysis data-visualization metoffice metofficeweather streamlit temperature weather

Last synced: 02 May 2026

https://github.com/backdoorali/insider-threat-detection-project

Personal data analysis project combining insider threat detection, cybersecurity, and exploratory data analytics. Built for portfolio showcase and practical skills demonstration.

cybersecurity data-analysis data-analysis-excel data-analysis-project data-analyst data-analytics data-visualization eda excel insider-threat jupyter-lab jupyter-notebook matplotlib numbers pandas portfolio-project python python3 threat-detection threat-intelligence

Last synced: 07 May 2026

https://github.com/gorodroz/crypto-tracker

Realtime Bitcoin price tracker using Binance WebSocket and REST API. Logs prices to CSV and supports Pandas for data analysis.

binance bitcoin crypto csv-logger data-analysis pandas python rest-api websocket

Last synced: 07 May 2026

https://github.com/bala-1409/foreign-exchange-rate-time-series-data-science-project

This project uses time series analysis to forecast the exchange rate between the euro and the US dollar. It applies a variety of statistical techniques, such as ARIMA, to model the data and forecast the exchange rate.

data-analysis data-science data-visualization datapreprocessing eda exploratory-data-analysis forecasting machine-learning-algorithms model modelfitting predictive-modeling python3 scikit-learn statsmodels time-series time-series-analysis

Last synced: 07 May 2026

https://github.com/md-emon-hasan/data-science

Data science tutorials, including data preprocessing, analysis, visualization, project deployment, machine learning and deep learning algorithms.

artificial-intelligence data-analysis data-engineering data-science deep-learning machine-learning-algorithms python

Last synced: 07 May 2026

https://github.com/nakshjainsonigara/vba-canteenmanagementsystem

The Canteen Management System is a comprehensive software solution designed to modernize and optimize canteen operations. It aims to simplify the complexities of managing a canteen by automating key processes such as order management, payment processing, and report generation.

canteen canteen-mangement-system charts data-analysis email excel microsoft payment-gateway vba vba-excel vba-macros word

Last synced: 30 Jan 2026

https://github.com/y-india/retail-sales-analysis-project

Analysis and preprocessing of retail store sales data. Includes data loading, merging, and initial inspection. 📌 Recommended: See README.md for detailed project progress and dataset information.

ai dashboard data-analysis data-science data-visualization jupiter-notebook machine-learning matplotlib python real-world-problem-solving real-world-project retail-analytics sales-analysis seaborn sklearn-library streamlit

Last synced: 07 May 2026

https://github.com/kaushik0911/jubilant-guide

A Streamlit application for advanced route planning and accessibility analysis using OpenRouteService (ORS). Explore optimal routes while avoiding roadblocks and discover points of interest (POIs) within travel time ranges.

data-analysis data-visualization geospatial-analysis python streamlit

Last synced: 16 Jun 2026

https://github.com/1ayanabil1/iris-visualization

This repository focuses on visualizing the Iris dataset using various data visualization techniques. It includes histograms, scatter plots, box plots, pie charts, bubble charts, and KDE plots to provide insights into the dataset’s structure. The project utilizes Matplotlib, Seaborn, Plotly, and Scikit-learn to generate insightful visualizations.

analytics clustering data-analysis data-science data-visualization datavisualization-project datavisualizations eda exploratory-data-analysis machine-learning machinelearning-python python

Last synced: 07 May 2026

https://github.com/mg380/ibm-applied-data-science-capstone

This capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization, and it summarises, in the form of a project, all the material covered during the specialization.

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 05 Mar 2026

https://github.com/alfikiafan/air-quality-analysis

This repository contains a comprehensive data analysis project on Air Quality Dataset, covering the complete data analysis process from data gathering, cleaning, exploratory data analysis (EDA), to building a fully interactive dashboard using Streamlit.

air-quality data-analysis dicoding

Last synced: 17 Apr 2026

https://github.com/obirikan/ad-performance-analysis

This project compares ad effectiveness using A/B tests. It analyzes ad performance using user interaction data, advertisement metadata, and device data. The goal is to evaluate click-through rates (CTR) across various ad versions, platforms, and devices.

data-analysis pandas

Last synced: 27 Apr 2026

https://github.com/kimtth/agent-data-analyst-stream-chainlit

⚡️Chainlit-based Data Analyst Chat Agent (Responses API, Server Sent Events) 📈

agent azure-openai chainlit code-interpreter data-analysis server-sent-events stream-response

Last synced: 09 Jun 2026

https://github.com/bassamn/titanic-data-analysis

Exploratory data analysis (EDA) of the Titanic dataset using Python. Analyzed survival patterns by age, gender, and class with visualizations (seaborn/matplotlib). Non-ML focus—highlighting insights with statistics and plots.

data-analysis eda pandas python seaborn titanic visualization

Last synced: 08 May 2026

https://github.com/adnanrahin/apache-spark-complete-reference

This repository covers all the necessary steps to take before jumping into Big Data.

big-data data-analysis data-science kaggle-dataset machine-learning rdd scala spark

Last synced: 29 Apr 2026

https://github.com/scarblase/sales_insights

A data-driven analysis of 15,000 sales records using Python, Pandas, and visualizations to uncover trends, optimize strategies, and enhance business performance. 🚀📊

data-analysis data-visualization dataset matplotlib-pyplot pandas python3 sales-analysis seaborn

Last synced: 05 May 2026

https://github.com/manwithacap/by-the-metric-match

🎲🃏 A game data tracker for your board/card/video games!

data-analysis data-visualization games jupyter-notebook python utility

Last synced: 29 Apr 2026

https://github.com/faris771/investigate_a_dataset

This repository contains a Jupyter Notebook that investigates a dataset using data analysis techniques.

data-analysis

Last synced: 29 Apr 2026

https://github.com/akash1070/project---applied-statistics-

A deep dive into this data to find valuable insights.

data-analysis data-science python statistics

Last synced: 30 Apr 2026

https://github.com/nikhilash45/power-bi-vsualisation-of-joins

In this Power BI report, users can visualize joins themselves, which makes joins easier to understand.

business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization

Last synced: 19 Mar 2026

https://github.com/cdilga/knn-c

C implementation of a K-Nearest Neighbour algorithm

data-analysis knn

Last synced: 04 Apr 2026

https://github.com/geo-y20/loan-approval-automation-using-mongodb-and-pymongo

This project demonstrates the implementation of a loan approval system that utilizes MongoDB for distributed data storage and management, and PyMongo for database operations. The project aims to automate the assessment of loan eligibility using customer details from online applications.

crud-application data data-analysis data-science data-visualization deployment jupyter-notebook loan-default-prediction loan-prediction-analysis machine-learning machine-learning-algorithms matplotlib mongodb pymongo streamlit web

Last synced: 08 May 2026

https://github.com/timmymatten/spikeball-stat-tracker

Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.

data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit

Last synced: 18 Apr 2026

https://github.com/ralstonraphael/ralston_ethan_datastory

Mapping Change: Advancing Housing Equity and Stability in Connecticut's Communities

census-data data-analysis data-visualization datawrapper google-sheets housing html json

Last synced: 09 Feb 2026

https://github.com/jhrcook/wagenmaker-data-analysis

Analysis of Registered Replication Report: Strack, Martin, & Stepper (1988) by Wagenmakers et al.

data-analysis r r-project statistics

Last synced: 08 Jun 2026

https://github.com/adrija-debnath/ideas-isi-data-science-internship

Topic of the Project - Predictive Maintenance Analysis, Data Science Internship at IDEAS - Institute of Data Engineering, Analytics and Science Foundation Technology Innovation Hub at Indian Statistical Institute, Kolkata.

data-analysis data-science predictive-analytics predictive-maintenance streamlit

Last synced: 27 Apr 2026

https://github.com/sunnybibyan/marketing_campaign_analysis_power_bi_dashboard

Campaign Performance Analysis: this project analyzes the performance of Spring, Summer, and Fall marketing campaigns, revealing key insights and actionable recommendations.

data-analysis data-visualization dax marketing-campaign powerbi

Last synced: 19 Mar 2026

https://github.com/shridhar1504/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization datanalysis eda kmeans-clustering machine-learning python sql sql-server unsupervised-learning

Last synced: 08 May 2026

https://github.com/nurfakhri/e-commerce-data-analyst

E-commerce data analysis supported by data wrangling, EDA, and web dashboard

dashboard data-analysis e-commerce flask-application python

Last synced: 10 Feb 2026

https://github.com/allanotieno254/road-accident-data-analysis-dashboard-using-excel

This repository contains the Road Accident Data Analysis Dashboard, a comprehensive Excel-based tool designed to provide in-depth analysis and visualization of road accident data.

dashboards-excel data-analysis excel kpi visualization

Last synced: 19 Mar 2026

https://github.com/ehtisham-sadiq/building-an-ml-based-heart-disease-diagnosis-system-with-flask

An end-to-end project that uses machine learning to create a user-friendly Heart Disease Diagnosis System, powered by Flask.

data-analysis exploratory-data-analysis feature-engineering flask machine-learning model-building model-evaluation pipelines python3 rest-api

Last synced: 04 May 2026

https://github.com/varshithdupati/yelp-business-analysis

Big Data analysis on Yelp reviews/businesses for Arizona. Using Hadoop, Spark, PySpark.

arizona-state-university big-data big-data-analytics data-analysis hadoop pyspark spark yelp

Last synced: 04 May 2026

https://github.com/allanotieno254/employee-performance-tracker-excel-

An Excel-based tool to track and evaluate employee performance, compliance, and skills assessments with summary statistics and visual charts

compliance-tracker data-analysis employee-performance-analysis excel human-resources

Last synced: 19 Mar 2026

https://github.com/iguptashubham/ott-churn-eda-ml

Understanding why customers discontinue their subscriptions is crucial to optimizing the user experience, reducing churn, and maximizing customer lifetime value. This project uses a machine learning model to predict customer churn.

data-analysis data-analysis-project data-science data-science-portfolio data-science-projects data-visualization machine-learning python

Last synced: 08 May 2026

https://github.com/jethronap/jstat-gui

Web-based GUI application for data analysis

data-analysis data-visualization java jstat mongodb

Last synced: 08 May 2026

https://github.com/hayatiyrtgl/data_analysis_project

Financial data analysis: preprocess, visualize, calculate technical indicators.

data-analysis data-analysis-python data-science dataframe numpy pandas python python3 stock-price-prediction talib trade-analysis

Last synced: 04 Apr 2026

https://github.com/mikasenghaas/covid19-analysis

Analysis of the correlation between COVID-19 infection numbers and weather data from the beginning of the pandemic until April 2021.

data-analysis statistical-analysis

Last synced: 14 Feb 2026

https://github.com/maugus0/sats-flight-data-fetcher

A simple Python tool to fetch and analyze flight data for 15+ major airlines using the AirLabs API.

airline-data cli-tool data-analysis flight-data python3

Last synced: 17 Mar 2026

https://github.com/wewoc/garmin_local_archive

Secure, local-first archive for Garmin Connect health data (HRV, sleep, activities). Private & offline. Structured for local analysis (Excel, HTML-Dashboard, Ollama, Open WebUI, AnythingLLM). Your data stays on your machine.

backup dashboard data-analysis fitness-tracker garmin garmin-connect ollama open-webui privacy privacy-enhancing-technologies privacy-first privacy-focused python self-hosted

Last synced: 16 Apr 2026

https://github.com/miroslav-reiter/kurz_jazyk_sql_analytici_datovi_vedci

Materials for the course SQL Language 1 for Analysts and Data Scientists

analysis analytics data data-analysis data-science database mysql reiter sql

Last synced: 08 May 2026

https://github.com/cagandemirmr/google-play-yorum-analizi

I gathered insights by analyzing variables such as review sentiment, the number of likes on reviews, the company's responses, and user nicknames for My Supermarket Simulator 3D, the most-liked game in Turkey in 2024.

bert data-analysis data-science nlp

Last synced: 10 Jun 2026

https://github.com/allanotieno254/us-largest-companies-by-revenue-web-scraping

A Python project for web scraping and analyzing the largest companies in the United States by revenue from Wikipedia

automation beautifulsoup csv data-analysis data-cleaning data-execution data-extraction pandas python web-scraping

Last synced: 08 May 2026

https://github.com/framebuffers/mindhunter

Wrappers for Pandas DataFrames to add quicker access for common statistical values, utilities and functionality.

data-analysis data-science numpy pandas python utilities-python

Last synced: 08 May 2026

https://github.com/aekanshd/crazytics-suicidesindia

Basic interpretation of the Suicides in India data-set using R.

data-analysis data-science graph india r suicides

Last synced: 10 Jun 2026

https://github.com/md-emon-hasan/data_analytics_project

Data analytics tasks and solutions, featuring hands-on exercises for data cleaning, visualization, and analysis using Python libraries.

cars-dataset census-data covid19-data data-analysis london-house-price police-data weather-data

Last synced: 08 May 2026

https://github.com/fer-aguirre/taller-cookiecutter

Workshop on how to use Cookiecutter for data analysis.

cookiecutter data-analysis project-template workshop

Last synced: 19 Mar 2026

https://github.com/ryanfranklin237/data-cleansing

A group of Python scripts that clean large data sets by removing duplicate data, putting data into correct formats, and removing redundant cells.

data-analysis data-cleaning data-science extract-transform-load pandas-dataframe python

Last synced: 23 Jun 2026

https://github.com/akarshankapoor7/tensorflow_tutorial

An easy and fast tutorial for TensorFlow, an open-source machine learning framework by Google. It is used for building and training machine learning and deep learning models.

data-analysis data-science deep-learning machine-learning tensorflow

Last synced: 27 Apr 2026

https://github.com/sarthak-0-sach/drivermasterdata_database_table

This code enables data integration from multiple sources and ensures a single source for all driver-related attributes. Designed for scalability and pipeline compatibility, this project supports clean data transformations, validations, and storage-ready outputs. Ideal for quick analytics; built with Python and Airflow and automated with a cron job.

apache-airflow-etl-pipeline data-analysis data-visualization database-management python

Last synced: 27 Apr 2026

https://github.com/daniel1kp/openrtb-dashboard

This is a demo project that illustrates using Rill to analyze programmatic bid logs using the canonical OpenRTB framework.

data-analysis openrtb real-time-bidding rill

Last synced: 19 Mar 2026

https://github.com/seankwarren/water-quality-analysis

An examination of water quality in the Atlanta watershed with a focus on identifying neglected areas and potential strategies for improving water quality monitoring

analytics data-analysis jupyter-notebook python

Last synced: 03 May 2026

https://github.com/angelgardt/wlm-sdarp-old

World of Linear Models: Statistics & Data Analysis in R for Psychologists

data-analysis data-visualization gh-pages manim-animations quarto r rstudio statistics

Last synced: 04 May 2026

https://github.com/datalopes1/ds_salaries2024_eda

This project carries out EDA (Exploratory Data Analysis) on the Data Science Salaries 2024 dataset, which can be found on Kaggle under the Database: Open Database license and was uploaded by Sazidul Islam.

data-analysis data-visualization eda exploratory-data-analysis jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/gowthamsundaresan/eigenscan

Block explorer for EigenLayer

crypto data-analysis eigenlayer nextjs web3

Last synced: 04 May 2026

https://github.com/patriloto/intro_r_para_reinventartec_2021

Material for the workshop First Steps in R for Data Analysis

data-analysis rstats

Last synced: 12 Feb 2026

https://github.com/dina-hosny/telco-customer-churn-analysis-using-power-bi

An interactive Microsoft Power BI dashboard presenting an analysis of the "Telco customer churn" data and the reasons customers churned.

business-intelligence data-analysis data-modeling data-visualization power-bi powerbi

Last synced: 19 Mar 2026

https://github.com/denisecase/nlp-03-text-exploration

Exploratory analysis of text corpora using tokenization, frequency, co-occurrence, and bigrams to reveal structure in text.

bigrams co-occurence corpus-analysis data-analysis nlp python text-analysis text-exploration tokenization

Last synced: 02 Jun 2026

https://github.com/phillbertnevinemmanuel/dataprofessionalquestionnaireanalysis_pbix

This project delves into the demographics and job satisfaction of data professionals, presenting insights through a user-friendly dashboard built with Power BI. The dataset has been graciously provided by Alex The Analyst

dashboard data-analysis powerbi visualization

Last synced: 19 Mar 2026

https://github.com/flexmonster/svelte-flexmonster

Svelte wrapper for Flexmonster Pivot Table & Charts

data-analysis data-visualization frontend pivot-tables svelte sveltekit

Last synced: 27 Feb 2026

https://github.com/rogernet/desafio-profissional-produto-data-driven

Helping train Product Analysts, PMs, and Business Managers capable of making strategic, data-driven decisions.

data-analysis data-science data-visualization product

Last synced: 23 Jun 2026

https://github.com/victor-antoniassi/junior_data_analyst_test_01

Solution developed for a technical assessment that analyzed video game sales data to support gaming partnership decisions.

asses assessment-project data-analysis data-analysis-project data-analyst duckdb etl prefect python

Last synced: 01 Jun 2026

https://github.com/sleeplessglory/big-data

Projects regarding big data analysis, presented within Jupyter Notebook

big-data data-analysis data-visualization jupyter python

Last synced: 16 Apr 2026

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 09 Jun 2026

https://github.com/mindgamesnl/yanderestats

https://mindgamesnl.github.io/YandereStats/

data-analysis reporting-pipeline yandere yandere-sim

Last synced: 18 Jun 2026

https://github.com/akshat0427/python_youtube_history

A set of data science operations performed on YouTube history data

data-analysis data-science extracting-features

Last synced: 10 Jun 2026

https://github.com/msthamizh/phonepe-pulse-data-visualization-and-exploration

Developing a Streamlit application that allows users to explore and analyze transaction data from the PhonePe Pulse dataset. The project aims to provide insights into digital payment trends across India.

data-analysis data-visualization dataframe mysql pandas plotly python streamlit

Last synced: 02 May 2026

https://github.com/avijit-jana/redbus-data-scraper-dashboard

A Streamlit-based application leveraging Selenium to automate data scraping from Redbus, enabling efficient collection, analysis, and visualization of bus travel data for improved operational efficiency and strategic planning in the transportation industry.

automation dashboard data-analysis data-visualisation data-visualization datadrivendecisions filtering python3 redbus selenium selenium-python streamlit streamlit-application travel web-scraping webscrapping

Last synced: 09 May 2026

https://github.com/sedatdikbas/aefes-time-series-forecasting

This project uses deep learning methods (LSTM, BiLSTM, CNN+LSTM) to predict future closing prices from market data for Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES). It details the data preprocessing, model training, and evaluation steps.

bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow

Last synced: 09 May 2026

https://github.com/ruchit0807/heart_disease_prediction

An interactive ML-powered web app that predicts the risk of heart disease based on clinical inputs like age, chest pain, cholesterol, ECG, and more. Built using Python, Streamlit, and scikit-learn, it offers early risk assessment in a simple and accessible way—just enter your health metrics and get instant feedback.

data-analysis data-science knn-regression pandas streamlit

Last synced: 04 May 2026

https://github.com/rubinlake/rl-academy-data-analytics

Educational data analysis project demonstrating BMW sales data analysis with AI-powered code assistance using Cursor IDE and Jupyter notebooks

cursor-ide data-analysis educational-project jupyter langchain matplotlib numpy pandas python scipy seaborn

Last synced: 09 May 2026

https://github.com/vasishth/lecturesintrobayes

Please go to the website for these online lectures:

bayesian-inference brms data-analysis stan

Last synced: 06 Feb 2026

https://github.com/titanscouting/tra-analysis

Titan Robotics 2022 Strategy Team Analysis Repository

data-analysis frc frc-scouting hacktoberfest python

Last synced: 29 Jan 2026

https://github.com/archie-cm/credit_risk_model_vix_id-x_partners

The project's objective is to reduce the company's losses from bad loans by up to 30% by creating a machine learning system that helps automate loan assessments.

credit-risk data-analysis data-visualization machine-learning scorecard

Last synced: 01 May 2026

https://github.com/dina-hosny/explore-us-bike-share-data-project

Explore US Bike Share Data project - FWD Data Analysis Professional Track. In this project, I used Python to explore data related to bike share systems for three major cities in the United States and answer questions about it by computing descriptive statistics.

data-analysis data-science numpy pandas python

Last synced: 09 May 2026