An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/com-480-data-visualization/project-2023-the-vizards

Lausanne Transportation: a data visualization of the Lausanne Transportation network. Developed by the Vizards team as part of the EPFL Data Visualization course project (COM-480).

buses data-analysis data-science data-visualization epfl lausanne map metro public-transport public-transportation switzerland webgl

Last synced: 01 May 2026

https://github.com/henrylin03/china-gdp

Analysis and visualisation of China GDP data using Python.

data data-analysis data-visualisation dataset kaggle pandas

Last synced: 01 May 2026

https://github.com/ndohvich/ibm-data-science-professional-certificate

Kickstart your career in data science & ML. Build data science skills, learn Python & SQL, analyze & visualize data, build machine learning models. No degree or prior experience required.

coursera dash data-analysis data-science html5 ibm ibm-professional-certificate javascript machine-learnng python sql

Last synced: 16 Nov 2025

https://github.com/dangerousfish/uk-climate-trends-dashboard-metoffice

A data pipeline and Streamlit dashboard that aggregates, cleans and visualises historical UK Met Office station data - interactive charts, heatmaps and maps for temperature, rainfall and sunshine.

climate climate-analysis climate-change climate-data climate-science data-analysis data-visualization metoffice metofficeweather streamlit temperature weather

Last synced: 02 May 2026

https://github.com/kurosawaxyz/covid4eu-sorbonne

Economy: “Analysis of Labor Market decisions of men and women during the COVID-19 pandemic in the 4EU+ countries”.

covid-19 data-analysis data-science data-visualization pandas

Last synced: 04 Jul 2025

https://github.com/jossimmar/ensa-scripts_py

Repository for handling consumption data of the Major Customers (Clientes Mayores) of ENSA, part of Grupo Distriluz.

data-analysis electrical-engineering python sqlite

Last synced: 10 May 2026

https://github.com/zeynepcol/data-analysis-visualization

Data visualization and interactive analytics - Olympics Dataset

data-analysis data-science data-visualization matplotlib pandas plotly python scipy seaborn streamlit

Last synced: 03 May 2026

https://github.com/0xjeremy/me-18-final

Data collection and Analysis tools for IMUs

data-analysis imu raspberry-pi

Last synced: 03 May 2026

https://github.com/selcuk05/forbes_top_100_celebrities_data_analysis

Forbes Top 100 Celebrities since 2005 Data Analysis and Visualization

data-analysis data-science

Last synced: 11 Oct 2025

https://github.com/quantitext/quantitext

Official repository for QuantiText applications in the .NET ecosystem.

api aspnet-core csharp data-analysis dotnet-core mvc-architecture

Last synced: 30 Mar 2025

https://github.com/mariam-badr-mb/gtc-ml-project1-hotel-bookings

The goal of this project is to build a robust data preprocessing pipeline for a hotel booking cancellation prediction model. The focus is not on training the final machine learning model but on ensuring that the dataset is clean, consistent, and ML-ready.

cleaning-data data-analysis exploratory-data-analysis

Last synced: 05 Sep 2025

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 09 Jun 2026

https://github.com/angelgardt/wlm-sdarp-old

World of Linear Models: Statistics & Data Analysis in R for Psychologists

data-analysis data-visualization gh-pages manim-animations quarto r rstudio statistics

Last synced: 04 May 2026

https://github.com/ehtisham-sadiq/building-an-ml-based-heart-disease-diagnosis-system-with-flask

An end-to-end project that uses machine learning to create a user-friendly Heart Disease Diagnosis System, powered by Flask.

data-analysis exploratory-data-analysis feature-engineering flask machine-learning model-building model-evaluation pipelines python3 rest-api

Last synced: 04 May 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/listiangr/ecommerce_sales_data_analysis

This project analyzes e-commerce sales data to help businesses understand sales trends, product performance, and customer segments. Its main goal is to provide insights that can improve marketing strategy and product management.

dashboard data-analysis data-cleaning data-collection data-penjualan data-visualization exploratory-data-analysis microsoft-excel

Last synced: 19 Jan 2026

https://github.com/avijit-jana/redbus-data_scraping_and_filtering_with_streamlit_app

A Streamlit-based application leveraging Selenium to automate data scraping from Redbus, enabling efficient collection, analysis, and visualization of bus travel data for improved operational efficiency and strategic planning in the transportation industry.

automation dashboard data-analysis data-visualization datadrivendecisions python3 redbus selenium-python streamlit-application webscrapping

Last synced: 15 Mar 2025

https://github.com/geetisha/sales_insight_data_analysis_using_sql_and_tableau-etl-

Sales Insights - A Data Analysis Project performed on Tableau & SQL Topics

dashboard data-analysis data-visualization mysql project sales-analysis sql tableau

Last synced: 07 Jan 2026

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/mirokeimioniemi/classifying-software-pirates

Exploring the factors driving people into software piracy by training two machine learning models to predict whether a person with certain characteristics and sentiments is likely to possess any pirated software or not using a dataset collected via a survey targeting users of music production software.

data-analysis data-science decision-tree-classifier logistic-regression machine-learning piracy python software-piracy survey

Last synced: 06 May 2026

https://github.com/scarblase/homeless-animals-analysis

A data-driven exploration of homeless animal statistics 🐶🐱. Analyze age distribution, shelter dynamics, and adoption patterns using Python, Pandas, and Seaborn.

animals data-analysis data-mining data-science data-science-projects data-visualization matplotlib matplotlib-pyplot numpy pandas plotly python python3 ukraine

Last synced: 06 May 2026

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results/summary, including the trends of flagged provinces, from the raw Excel data file.

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/mrankitgupta/titanic-survival-prediction-93-xgboost

Titanic Survival Prediction Project (93% Accuracy) 🛳️ In this notebook, the goal is to correctly predict whether someone survived the Titanic shipwreck using different machine learning models and hyperparameter tuning.

classification data-analysis data-science data-visualization gradient-boosting kaggle-competition linear-regression logistic-regression machine-learning machine-learning-algorithms ml ml-models nlp prediction predictive-modeling random-forest titanic titanic-kaggle titanic-survival-prediction xgboost

Last synced: 06 May 2026

https://github.com/scarblase/portfolioprojects

A collection of data analysis and business intelligence projects using SQL, Python, and visualization tools to uncover insights from real-world datasets. 🚀📊

csv data-analysis data-engineering data-mining data-science data-visualization matplotlib matplotlib-pyplot pandas python python3 seaborn sql

Last synced: 06 May 2026

https://github.com/agustinmusanti/delitosencaba-proyectofinal-dataanalytics-coderhouse

In this repository I present my final project for the "Data Analytics" course at Coderhouse.

data-analysis excel powerbi

Last synced: 22 Jan 2026

https://github.com/rkolehov/retail-sales-analysis-project

End-to-end e-commerce analysis showcasing SQL and data visualization skills. Tracks sales, customer behavior, product performance, and delivery efficiency. Interactive dashboards provide actionable insights for business decision-making

analytics dashboard data-analysis ecommerce jupyter-notebook postgresql python sql tableau vscode

Last synced: 19 Apr 2026

https://github.com/md-emon-hasan/data-science

Data science tutorials, including data preprocessing, analysis, visualization, project deployment, machine learning and deep learning algorithms.

artificial-intelligence data-analysis data-engineering data-science deep-learning machine-learning-algorithms python

Last synced: 07 May 2026

https://github.com/nafisalawalidris/springforth-university-foodbank

Springforth University Food Bank: A collaborative initiative with UNESCO to address student food insecurity. Contains code and resources for the web application, data analysis, and insights into the prevalence and impact of food insecurity on academic performance.

academic-performance collaborative-initiative data-analysis data-visualization excel pivot-tables powerbi springforth-university-food-bank student-food-insecurity unesco

Last synced: 17 Feb 2026

https://github.com/geo-y20/loan-approval-automation-using-mongodb-and-pymongo

This project demonstrates the implementation of a loan approval system that utilizes MongoDB for distributed data storage and management, and PyMongo for database operations. The project aims to automate the assessment of loan eligibility using customer details from online applications.

crud-application data data-analysis data-science data-visualization deployment jupyter-notebook loan-default-prediction loan-prediction-analysis machine-learning machine-learning-algorithms matplotlib mongodb pymongo streamlit web

Last synced: 08 May 2026

https://github.com/shridhar1504/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization datanalysis eda kmeans-clustering machine-learning python sql sql-server unsupervised-learning

Last synced: 08 May 2026

https://github.com/jethronap/jstat-gui

Web-based GUI application for data analysis

data-analysis data-visualization java jstat mongodb

Last synced: 08 May 2026

https://github.com/aekanshd/crazytics-suicidesindia

Basic interpretation of the Suicides in India data-set using R.

data-analysis data-science graph india r suicides

Last synced: 10 Jun 2026

https://github.com/akshat0427/python_youtube_history

A set of data science operations performed on YouTube history data.

data-analysis data-science extracting-features

Last synced: 10 Jun 2026

https://github.com/rubinlake/rl-academy-data-analytics

Educational data analysis project demonstrating BMW sales data analysis with AI-powered code assistance using Cursor IDE and Jupyter notebooks

cursor-ide data-analysis educational-project jupyter langchain matplotlib numpy pandas python scipy seaborn

Last synced: 09 May 2026

https://github.com/jinkogule/multi-analyst

Multi Analyst is an easy-to-use data analysis tool that uses artificial intelligence to interpret the results of the analyses performed, returning useful insights to users.

apriori-algorithm bootstrap css data-analysis django html numpy open-ai pandas python web-application

Last synced: 12 Apr 2026

https://github.com/lightbridge-ks/zoominterface

A Shiny app for analyzing Zoom report files.

data-analysis r shiny-apps zoom-class zoom-meetings

Last synced: 01 Jun 2026

https://github.com/zpreisler/modules

Python libraries and modules for processing simulation outputs

data-analysis python scripts tensorflow

Last synced: 13 May 2026

https://github.com/pferreirafabricio/data-immersion

🏊🏻‍♂️ Activities and exercises from 'Imersão Dados' event

data data-analysis data-science dataset jupiter-notebook python

Last synced: 14 May 2026

https://github.com/atymri/linqsimulator

LINQ Simulator is an interactive C# console application designed to let you experiment with LINQ queries in real time.

console csharp data data-analysis linq query sql

Last synced: 23 Oct 2025

https://github.com/sunnybibyan/random_data_generation

A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.

data-analysis data-visualization python random-data-generation statistics streamlit-webapp

Last synced: 13 Jun 2026

https://github.com/viper373/163-buff

Scrapes CS:GO weapon skin trading data from the NetEase BUFF platform.

163 arima crawler-python csgo data-analysis prediction python

Last synced: 24 Oct 2025

https://github.com/ensinho/data-analysis

My repository for data analysis studies in Python.

csv data-analysis graphs python python-documentation

Last synced: 15 Jun 2026

https://github.com/tunjis/global-superstore_dashboard_tableau

Tableau dashboard with 4 different types of visualisations

charts dashboard data-analysis data-visualisation excel tableau

Last synced: 23 Jan 2026

https://github.com/as16082023/atliq-hospitality-analysis

This project presents an overview of AtliQ Grands' performance in the hospitality industry using Power BI.

atliqgrand codebasicsresumeprojectchallenge data-analysis data-visualization powerbi revenueinsights

Last synced: 23 Jan 2026

https://github.com/duoan/machine-learning-notebook

A repository of notebooks for tracking machine learning study.

data-analysis decision-tree ensemble-model gbdt machine-learning numpy pandas xgboost

Last synced: 18 Jun 2026

https://github.com/jmssnr/shuffle-kit

shuffle-kit: model and analyze playing card shuffles in Python

data-analysis playing-cards python shuffle statistics

Last synced: 19 Jun 2026

https://github.com/aaryan-agr/canadian-energy

This project analyzes Canada's energy trade, focusing on imports, exports, and market trends in the energy sector.

data-analysis data-cleaning data-manipulation data-processing data-science data-vizualisation energy-sector time-series-analysis

Last synced: 10 Jun 2025

https://github.com/ayenpure/stockmeup

This is a class project for 'CIS 610 : Data Science' where I try and validate Stock Market recommendations.

data-analysis data-mining data-science java mapreduce mapreduce-java

Last synced: 24 Oct 2025

https://github.com/alicankaya192/world-happiness-report-2025

Comprehensive exploratory data analysis (EDA) and visualization of the World Happiness Report 2025. Analyzes global rankings, regional distributions, key happiness factors, and detects wealth-happiness paradox outliers using Python (Pandas, Matplotlib, SciPy).

correlation-analysis data-analysis data-science data-visualization eda exploratory-data-analysis global-happiness happiness-index matplotlib pandas python scipy statistics whr-2025 world-happiness-report

Last synced: 21 Jun 2026

https://github.com/ryanfranklin237/data-cleansing

A group of Python scripts that clean large data sets by removing duplicate data, putting data in correct formats, and removing redundant cells.

data-analysis data-cleaning data-science extract-transform-load pandas-dataframe python

Last synced: 23 Jun 2026

https://github.com/jabhij/fbi_nics-firearm-background-checks

This project is an attempt to showcase the use of guns across the US.

data-analysis data-analytics data-science data-visualization tableau

Last synced: 23 Feb 2026

https://github.com/infinitode/duplipy

DupliPy is a quick and easy-to-use package that can handle text formatting and data augmentation tasks for NLP in Python. It now offers support for image augmentation tasks as well.

ai augmentation data-analysis data-preprocessing data-science images language-models nlp preprocessing text-data text-datasets text-formatting

Last synced: 28 Jun 2026

https://github.com/maddieemihle/python-challenge

Creating a Python script that analyzes financial records and election results

data-analysis python

Last synced: 09 Jun 2026

https://github.com/ankitgmishra/machinelearning

Continuously deepening my understanding of and expertise in Machine Learning through ongoing education and hands-on practical experience.

artificial-intelligence data-analysis data-cleaning data-gathering machine-learning machinel-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 03 May 2026

https://github.com/matteospanio/speed-analysis

A project to analyze the internet speed

bash-script data-analysis

Last synced: 03 May 2026

https://github.com/bpkaur/whats-in-a-name

Exploring a dataset of first names of babies born in the US to uncover interesting stories.

data-analysis datacamp numpy pandas python3

Last synced: 04 May 2026

https://github.com/mindlessmuse666/titanic-data-visualization

A project visualizing Titanic passenger data using the Python libraries Matplotlib, Seaborn, and Plotly.

data-analysis data-visualization matplotlib pandas plotly python seaborn titanic

Last synced: 04 May 2026

https://github.com/arv-anshul/ipl-api

IPL API built with the Flask framework and an IPL dataset.

api data-analysis fast-api flask flask-api ipl ipl-api python3

Last synced: 04 May 2026

https://github.com/ibrahimm7004/supermarket-sales-analysis

This project focuses on Data Mining techniques to gather insights about customer behavior regarding Supermarket Sales. Includes: Association Rule Mining, Temporal Patterns in customer behavior, Sequential Pattern Mining, Classification, Regression, and Outlier Detection.

apriori association-rules data-analysis data-mining data-science data-visualization fpgrowth python sales-analysis supermarket-sales

Last synced: 04 May 2026

https://github.com/jendives2000/regressions

A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.

data-analysis linear-regression pearson-correlation-coefficient regression

Last synced: 04 May 2026

https://github.com/matt-ags/jornada-python

Repository with the projects completed during the "Jornada Python" week - 01/2025

artificial-intelligence automation data-analysis jupyter-notebook machine-learning python

Last synced: 05 May 2026

https://github.com/rtlich/sap-sustainable-management

Project for the ERP & BI course at Esprit School of Engineering. It optimizes resource and operations management in an agri-food company using SAP MM & PM, focusing on sustainability, CO₂ reduction, and predictive maintenance.

angular business-intelligence data-analysis flask machine-learning ocr powerbi python sql-server talend

Last synced: 05 May 2026

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/aryar-06/linear-regression

A Python project demonstrating basic linear regression with gradient descent and matrix operations, alongside scikit-learn comparison.

data-analysis data-preprocessing educational-project gradient-descent linear-regression machine-learning python regression-algorithms scikit-learn

Last synced: 05 May 2026

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 05 May 2026

https://github.com/hms75/movie_rating_analysis

A movie rating analysis which identifies trends amongst a dataset of 5000 movies.

data-analysis data-visualization matplotlib-pyplot numpy pandas python

Last synced: 05 May 2026

https://github.com/superpandas-ai/superpandas

Adding LLM integration to Pandas library

ai data-analysis llm pandas

Last synced: 06 May 2026

https://github.com/edanur-y/variable-analysis-of-banks-ratio-data

Testing variables for multicollinearity, multivariate normality and analyzing outliers and missing values. ⭕SPSS 🔵R

data-analysis log-transformation missing-values-analysis multicollinearity normality-test r spss

Last synced: 10 Jun 2026

https://github.com/suhas-005/jovian-data-analysis-course-assignment

These are my assignments for Data Analysis : Zero to Pandas course by Jovian.ai

data-analysis data-analytics numpy pandas python

Last synced: 07 May 2026

https://github.com/rohansoni45/whatsapp-chat-analysis

This project involves analyzing WhatsApp chat data to extract valuable insights. Using Python and various libraries like Pandas and Matplotlib, the project processes and visualizes chat statistics such as message frequency, most active participants, and sentiment analysis.

chat-analysis data-analysis data-science matplotlib pandas python sentiment-analysis streamlit visualization web-app word-cloud

Last synced: 07 May 2026

https://github.com/jjkay03/discord-call-extractor

Collect HTML data from a Discord group/DM to create a database of calls

data-analysis database discord discord-tool

Last synced: 07 May 2026

https://github.com/biginformatics/git-basics

Hands-on Git and GitHub lessons for analysts and statisticians

data-analysis git github public-health training

Last synced: 10 Jun 2026