An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 25 Apr 2026

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 25 Apr 2026

https://github.com/devexpress-examples/winforms-create-a-custom-exporter-for-pivotgridcontrol-with-xtrareport

This example illustrates how to dynamically create a custom report based on PivotGridControl content in WinForms.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms

Last synced: 26 Apr 2026

https://github.com/devexpress-examples/wpf-pivotgrid-customize-the-cell-template

This example demonstrates how to customize the cell appearance in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 26 Apr 2026

https://github.com/sambit-mondal/stockx

StockX is a full-stack application designed to help store owners efficiently manage their inventory, track purchases, and analyze stock levels. The system integrates MongoDB, Express, React, and Flask (Python) to provide a seamless experience.

artificial-intelligence data-analysis inventory-management-system machine-learning mern-stack

Last synced: 12 Jun 2026

https://github.com/rociobenitez/happiness-index-data-processing

Repository for Big Data Processing - Contains Jupyter Notebooks and Datasets for data analysis and processing tasks related to Big Data.

big-data big-data-processing data-analysis data-processing happiness-index happiness-report jupyter-notebook matplotlib pandas seaborn

Last synced: 15 May 2026

https://github.com/warazkhan/airplane-crashes-and-fatalities-since-1908-

This project analyzes airplane crash data (1908-2008) ✈️📊 to uncover trends in aviation accidents, fatalities, and safety improvements. Using exploratory data analysis (EDA) and data visualization, we examine key factors influencing crashes, identify high-risk regions, and explore advancements in aviation safety.

data-analysis data-visualization exploratory-data-analysis

Last synced: 10 Jun 2026

https://github.com/deliprofesor/cinematic-data-analytics-and-recommendation-platform

This project analyzes a movie dataset using machine learning algorithms to predict success, explore revenue-popularity relationships, and develop recommendation systems. It employs techniques like K-Means, DBSCAN, GMM, decision trees, PCA, and NLP for insights and personalized suggestions.

clustering content-based-recommendation data-analysis data-visualization decision-tree gmm k-means machine-learning natural-language-processing nlp pca predictive-modeling python recommendation-system scikit-learn user-based-recommendation

Last synced: 26 Apr 2026

https://github.com/moshora99/sql-data-warehouse-project

Build a modern data warehouse with MySQL, including ETL processes, data modeling, and analytics.

data-analysis data-engineering data-science database datawarehouse datawarehousing etl scheme sql sql-query sql-server

Last synced: 27 Apr 2026

https://github.com/akashvarma26/data-analysis-on-imbd-using-sqlite3

Data Analysis on IMDb dataset using sqlite3 and Pandas in Jupyter notebook.

data-analysis jupyter-notebook pandas-dataframe sqlite

Last synced: 27 Apr 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025
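
The running totals mentioned in the entry above are cumulative sums over an ordered series of summary values. The following is a minimal, language-neutral sketch of that idea in Python, not the DevExpress Pivot Grid API; the `monthly_sales` data and all variable names are made up for illustration.

```python
from itertools import accumulate

# Hypothetical summary values, already ordered by month.
monthly_sales = [("Jan", 120), ("Feb", 80), ("Mar", 150), ("Apr", 50)]

# Running total: each value is the sum of all amounts up to and including that row.
running = list(accumulate(amount for _, amount in monthly_sales))

# Pair each row with its running total: (month, amount, running_total).
rows = [(month, amount, total)
        for (month, amount), total in zip(monthly_sales, running)]

for month, amount, total in rows:
    print(f"{month}: amount={amount}, running total={total}")
```

The last running total always equals the grand total of the series, which is a quick sanity check when building such a column.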

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 27 Apr 2026

https://github.com/jjkay03/discord-call-extractor

Collects HTML data from a Discord group or DM to create a database of calls.

data-analysis database discord discord-tool

Last synced: 07 May 2026

https://github.com/busesimsek/sql-projects

A collection of my SQL projects with insights into real-world datasets.

data-analysis data-analytics mysql sql

Last synced: 07 Jun 2026

https://github.com/garcane/exodus_analysis

This project analyses cryptocurrency transaction data exported from the Exodus wallet. The goal is to explore and visualize the inflows and outflows of assets, the types of transactions, and other key metrics over time.

bitcoin btc crypto cryptocurrencies cryptocurrency data-analysis data-visualization eth ethereum pandas seaborn

Last synced: 27 Apr 2026

https://github.com/caesaredia/food-app-user-behavior-analysis

Analyze user behavior and optimize app experience in a food-tech startup through funnel analysis and A/A/B testing. Includes data prep, visualization, and statistical testing in Python.

a-b-testing chi-square data-analysis data-visualization funnel-analysis python statistical-testing user-behavior

Last synced: 27 Apr 2026

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026
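
The entry above describes fitting a simple linear regression of insurance premium on age and evaluating it with MSE and RMSE. As a rough sketch of that workflow (not the repository's code, which is tagged scikit-learn), the example below fits the line by ordinary least squares in plain Python. The `ages` and `premiums` values are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt

def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up data: age in years, premium in arbitrary currency units.
ages = [20, 30, 40, 50, 60]
premiums = [110.0, 150.0, 210.0, 240.0, 290.0]

slope, intercept = fit_simple_linear_regression(ages, premiums)
predictions = [intercept + slope * age for age in ages]

error_mse = mean_squared_error(premiums, predictions)
error_rmse = sqrt(error_mse)  # RMSE is in the same units as the premium

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"MSE={error_mse:.2f}, RMSE={error_rmse:.2f}")
```

This sketch evaluates on the training data for brevity; a real analysis would typically hold out a test set before computing MSE and RMSE.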

https://github.com/josedanielchg/1990s-netflix-movie-insight

Small exploratory analysis of Netflix movie data from the 1990s. This project is part of the DataCamp Associate Data Scientist in Python program and focuses on filtering, visualizing, and extracting insights from a dataset using Python. Analyze trends in movie durations and count short action films to practice key data science skills!

beginner data-analysis python

Last synced: 27 Apr 2026

https://github.com/tillscode/personal-finance-ml-analysis

Machine learning analysis of personal financial data with predictive modeling and interactive dashboard

dashboard data-analysis finance machine-learning python scikit-learn

Last synced: 28 Apr 2026

https://github.com/bryanhe24/data_analysis_app

A full-stack web application that allows users to upload CSV datasets, analyze the data with statistical summaries and visualizations, and interact with an AI-powered assistant for querying the dataset.

ai data data-analysis data-visualization fullstack-development javascript math python reactjs

Last synced: 07 May 2026

https://github.com/mahmoudnamnam/fc-barcelona-reports

FC Barcelona Reports: An interactive web application to analyze and visualize FC Barcelona's match data. Built with Streamlit, it scrapes match data from WhoScored, stores it in MongoDB, and presents insights through interactive visualizations like pass networks, shot maps, and player statistics.

data-analysis data-visualization football-analytics mplsoccer pandas streamlit web-scraping

Last synced: 07 May 2026

https://github.com/hadson0/chess-live-ratings-data

A study project focused on web scraping the live chess ratings from chess.com, with data analysis and visualization on nearly 5000 players in the classical world ranking.

beautifulsoup chess data-analysis data-visualization numpy pandas python seaborn web-scraping

Last synced: 28 Apr 2026

https://github.com/emmanuelletocs/steam-game-recommender

A powerful recommendation system for Steam games, combining Content-Based and Collaborative Filtering techniques. Built with Python, Scikit-learn, and Streamlit to deliver accurate, real-time game recommendations. Perfect for gamers and data scientists interested in building intelligent recommendation engines.

als-algorithm data-analysis gaming-industry knn machine-learning mds mysql ncf neural-network pyspark recommendation-engine recommendation-system scikit-learn spark

Last synced: 28 Apr 2026

https://github.com/elmezianech/autoinventory

This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. From real-time data ingestion and predictive analytics to interactive dashboards, this project combines cutting-edge technologies and an event-driven architecture to simulate a business-ready system.

automation dashboard data-analysis data-engineering-pipeline docker etl glue-job inventory-management kafka kpis lambda-functions lstm ml-pipeline mlflow power-bi pytorch redshift s3 streamlit warehouse-management

Last synced: 28 Apr 2026

https://github.com/abdeldjalilchafai/us-flight-delay-eda

Structured EDA on 2015 US flight delay data. Clean, reproducible notebook using a 6-step data analysis framework for real-world datasets.

data-analysis data-cleaning eda exploratory-data-analysis flight-delays kaggle matplotlib numpy pandas python seaborn

Last synced: 28 Apr 2026

https://github.com/sufyan14/weather-data-analysis

A Streamlit dashboard that forecasts 30-day weather trends using uploaded CSV data and Facebook Prophet.

data-analysis python streamlit

Last synced: 28 Apr 2026

https://github.com/shreeparab1890/indian-elections-2019-analysis-eda

This IPython notebook presents an exploratory data analysis (EDA) of the 2019 Indian Lok Sabha elections.

data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization

Last synced: 28 Apr 2026

https://github.com/matheusafonseca/python-data-visualization-matplotlib-seaborn-masterclass-udemy

This repository is dedicated to storing the code developed during the "Python Data Visualization: Matplotlib & Seaborn Masterclass" course on Udemy.

charts data-analysis data-analysis-python data-science data-visualization database graphics graphics-programming jupyter-notebook matplotlib matplotlib-plots python python3 seaborn seaborn-plots

Last synced: 28 Apr 2026

https://github.com/rorrell/coviddeaths

A Jupyter Notebook where I create several visualizations based on data about COVID-19 deaths from 2020 to 2024

data-analysis data-visualization jupyter-notebook python3

Last synced: 28 Apr 2026

https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

api data data-analysis data-visualization matplotlib numpy pandas python sakila-db seaborn

Last synced: 09 Apr 2026

https://github.com/omdoshi13/pricing-of-laptops-using-ml

Data Analysis, training Machine Learning models, and Model Evaluation and Refinement for Pricing of Laptops dataset.

data-analysis data-analysis-project datascience google-colab jupyter-notebook machine-learning matplotlib model-evaluation model-refinement numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/fortunewalla/birdstrikes

Birdstrikes database created for PostgreSQL, with simple sample queries.

birdstrikes csv data-analysis data-science database dataset pgsql postgresql practice sample sql sql-query workshop

Last synced: 02 Oct 2025

https://github.com/ericdataplus/kaggle-airbnb-nyc

NYC Airbnb Market Analysis: Multi-source from 2 Kaggle datasets (151K listings)

airbnb data-analysis kaggle nyc python visualization

Last synced: 28 Apr 2026

https://github.com/wei-rongrong2/openfoodfactclustering

A project that explores clustering food products based on nutritional attributes using K-Means, Fuzzy C-Means, and DBSCAN algorithms, with a Streamlit dashboard for visualizing results.

clustering dashboard data-analysis dbscan food-products fuzzy-cmeans k-means machine-learning nutrition nutrition-clustering open-food-facts streamlit

Last synced: 28 Apr 2026

https://github.com/ozep/genshincharacteranalysis

Uses a spreadsheet with Character Data and organizes it into readable graphs.

data-analysis jypyternotebook python

Last synced: 18 Apr 2026

https://github.com/joseph-pabian/life-expectancy-

Statistical analysis of life expectancy in developed vs developing countries using SQL and Python

data-analysis duckdb public-health python sql statistics

Last synced: 07 May 2026

https://github.com/gaurav-van/optimizing-rate-of-penetration-in-geothermal-drilling-a-digital-twin-approach

In this project, we developed a machine learning digital twin using Intel-optimized XGBoost and daal4py to simulate and optimize the Rate of Penetration (ROP) in geothermal drilling. We leveraged SHAP for Explainable AI (XAI) to interpret model predictions.

data-analysis data-science digital-twin explainable-ai geothermal geothermal-energy jupyter-notebook machine-learning python shap xai xgboost

Last synced: 28 Apr 2026

https://github.com/josedanielchg/efficient-data-storage-for-predictive-modeling

DataCamp project from the Associate Data Scientist track, focusing on optimizing dataset storage by transforming data types and filtering. Prepares data for efficient machine learning workflows

cleaning-dataset data-analysis jupyter-notebook python

Last synced: 28 Apr 2026

https://github.com/kisaa-fatima/data-visualization-with-tableauleu

Conducted Exploratory Data Analysis (EDA) on the Berkeley Earth Dataset (large scale dataset), which features high-resolution land and ocean time series data. Created interactive dashboards using Tableau to effectively visualize and highlight trends and patterns within the data.

data-analysis data-science exploratory-data-analysis insights python tableau visualizations

Last synced: 29 Apr 2026

https://github.com/prady2309/sales-prediction-using-python

Implemented using Multiple Linear Regression

data-analysis data-science machine-learning python

Last synced: 29 Apr 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-change-summary-display-mode

This example shows how to use different summary display modes in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 29 Apr 2026

https://github.com/chrispsang/healthcare-dataanalysis

Analyze synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes using machine learning models. Includes data exploration, preprocessing, model building, and visualizations.

data-analysis data-science data-visualization healthcare jupyter-notebook machine-learning python

Last synced: 29 Apr 2026

https://github.com/marcinz20/anomaly-detection-in-credo-dataset

University project whose goal is to build a system that detects anomalies in the CREDO dataset.

credo data-analysis data-science encoder-decoder-model jupiter-notebook pca-analysis python3

Last synced: 29 Apr 2026

https://github.com/prateek5525/yt-analysis-project

This project utilizes the YouTube Data API to analyze channel and video performance, offering insights into subscriber counts, views, video metrics, and monthly trends. It generates visual reports and exports data in CSV format, aiding in effective decision-making and performance tracking.

data-analysis jupyter-notebook python3 seaborn-plots youtube-api

Last synced: 29 Apr 2026

https://github.com/vanshuchaudhary/zomato

This Jupyter Notebook contains an exploratory data analysis (EDA) of Zomato restaurant data. It includes data cleaning, visualization, and insights into restaurant ratings, pricing, cuisine distribution, and location-based trends.

business-analytics data-analysis data-mining data-science data-visualization datascience matplotlib pandas-dataframe pandas-python python python-3 python-library

Last synced: 29 Apr 2026

https://github.com/kasraskari/learn-r-codes

A learning repository for R programming, covering data manipulation, visualization, and statistical analysis. (Work in progress!) 🚧

data-analysis data-analysis-r data-visualization r r-examples r-graphics r-statistics statistics

Last synced: 08 Jun 2026

https://github.com/satyacoder29/crowdfunding-in-sql

Crowdfunding is a method of raising funds for projects or causes by collecting small contributions from a large group of people, usually through online platforms. It enables individuals, startups, and nonprofits to secure funding, offering rewards or recognition in exchange, and helps bring ideas to life without traditional financing.

data-analysis data-cleaning database-management mysql-database quries sql sql-functions sql-server views

Last synced: 29 Apr 2026

https://github.com/anilyigitsel/istanbul-rental-apartments-analysis

This project analyzes the Istanbul Rental Apartments Dataset (2025), which includes rental apartment listings from Istanbul, Turkey.

data-analysis data-visualization jupyter-notebook matplotlib pandas python rental-housing

Last synced: 29 Apr 2026

https://github.com/ddihora1604/advanced_business_analytics_on_world_bank_global_financial_inclusion_data_2021

Bridging the Gaps in Financial Inclusion: Understanding the Cash-Credit Paradox, Divide between Cash and Digital Payments, and Financial Resilience.

advanced-excel business-analytics data-analysis data-engineering data-mining data-visualization database exploratory-data-analysis machine-learning preprocessing-data python

Last synced: 07 May 2026

https://github.com/dcs-training/network-analyisis-python

Course material introducing data visualization with Altair and network analysis with NetworkX (in Python). See the README file for details.

data-analysis data-visualisation network-analysis python text-analysis

Last synced: 29 Apr 2026

https://github.com/marialuizaleitao/walmartsalesanalysis

This project explored data collection and preprocessing, advanced application of SQL queries, and feature engineering. Key calculations, such as COGS (Cost of Goods Sold) and VAT (Value Added Tax), were performed to assess the profitability and financial efficiency of the branches.

business-analytics data-analysis mysql-database sql

Last synced: 13 Jun 2026

https://github.com/i7t5/sentimentnlp

Sentiment analysis for COMP 435 Introduction to Machine Learning, Spring 2025

data-analysis jupyter-notebook machine-learning nlp python sentiment-analysis

Last synced: 29 Apr 2026

https://github.com/findmyway/dataframe-in-julia

A quick introduction to DataFrames in Julia for users coming from Python.

data-analysis dataframe julia jupyter-notebook

Last synced: 29 Apr 2026

https://github.com/gmalbert/immigration

Immigration Data Analysis

data-analysis immigration

Last synced: 14 Jun 2026

https://github.com/fatihilhan42/starbucks_analysis_turkey_and_world_with_python

This project first examines coffee brands worldwide and then looks at the same brands in Turkey. The dataset, which you can find in the repo, was first cleaned and then presented graphically using data visualization techniques.

data-analysis data-cleaning data-science data-visualization jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/acerbilab/svbmc

Stacking Variational Bayesian Monte Carlo (S-VBMC) algorithm for combining Variational Bayesian Monte Carlo (VBMC) posteriors to boost inference performance.

bayesian-inference data-analysis machine-learning model-fitting python stacking variational-inference

Last synced: 20 Jan 2026

https://github.com/mfakhriazhar/python-data-analyst-tutorial

A collection of my Python learning files for data analysis. Covers fundamental to advanced topics such as data exploration, visualization, statistical analysis, and the use of popular libraries like Pandas, NumPy, Matplotlib, and Seaborn. Suitable for personal documentation or shared learning references.

data-analysis data-science data-visualization exploratory-data-analysis portfolio python

Last synced: 29 Apr 2026

https://github.com/farhad-here/textprepx

A Multilingual Text Preprocessing Tool for English and Persian.

cleantext contractions data-analysis deep-learning emoji nlp nltk opp parsivar regex streamlit text-preprocessing textblob

Last synced: 29 Apr 2026

https://github.com/chandantech2023/sales-trend-analysis

This repository features the Superstore Sales Analysis project, demonstrating data cleaning and analysis using Python and SQL, along with interactive visualization in Power BI.

data-analysis data-science dax kaggle powerbi-desktop python3 sql

Last synced: 29 Apr 2026

https://github.com/valikmorinko/ecommerce-sales-analysis

E-commerce sales analysis: data, visualizations, and analytical findings.

data-analysis e-commerce jupyter matplotlib pandas python seaborn

Last synced: 29 Apr 2026

https://github.com/sdley/cas_pratique-del_annuel

Del-Annuel is software for the annual deliberations of higher-education schools or universities.

data-analysis pandas python tkinter-gui

Last synced: 29 Apr 2026

https://github.com/alunera-data/sql-use-cases

Practical SQL use cases for Business Intelligence and IT Service Management (BI & ITSM)

business-intelligence dashboards data-analysis data-quality eda itsm kpis postgresql process-monitoring query reporting sql sqlserver

Last synced: 29 Apr 2026

https://github.com/meinhere/ta-pendat

Final project for the Data Mining course: patient trauma classification using the Naive Bayes method.

data-analysis data-mining naive-bayes-classifier python trauma

Last synced: 29 Apr 2026

https://github.com/PanosChatzi/Healthcare_and_Bioinformatics_Analyses

This repo contains the final assignments of the Data Analyst bootcamp by Workearly. Python and SQL were used to complete the assignments.

data-analysis data-cleaning data-visualisation jupyter matplotlib pandas python seaborn

Last synced: 05 Aug 2025

https://github.com/brevex/code-complexity-data-analisis

Data collection that shows different complexity scores in an algorithmic dataframe.

code-analysis data-analysis data-science python

Last synced: 29 Apr 2026

https://github.com/al-ghaly/e-commerce-a-b-testing

A statistical analysis project in which I performed an A/B test to analyze the effect of changing the user interface of an e-commerce company's website.

data-analysis matplotlib numpy pandas python python-data-analysis seaborn statistical-analysis statistics

Last synced: 29 Apr 2026

https://github.com/jpgiant/gujaratrainfallanalysis_2021

Analysis of the rainfall in the districts of Gujarat state in 2021.

data-analysis exploratory-data-analysis exploratory-data-visualizations matplotlib numpy pandas-python python

Last synced: 07 May 2026

https://github.com/yimethan/basics-of-data-analysis

2023-2 Basics of Data Analysis

data-analysis numpy pandas python

Last synced: 29 Apr 2026

https://github.com/muthukumar0908/-singapore-resale-flat-prices-predicting

This project is to develop a machine learning model and deploy it as a user-friendly web application that predicts the resale prices of flats in Singapore.

data-analysis data-visualization mechine-learing plotly python streamlit

Last synced: 07 May 2026

https://github.com/theoplayz2/eda-explorer

A Python tool for exploratory data analysis (EDA) and visualization that supports loading CSV and JSON data, with a modular OOP architecture. Practical assignment on the topic "Discovering and visualizing data to understand its essence" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

analysis battery-life cqrs csharp data-analysis eeg-analysis exploratorydataanalysis json-visualization matplotlib messaging profile-report python verilog visualization

Last synced: 29 Apr 2026