An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/singhs05/global-youtube-trends

Understand the impact of likes, comments, and dislikes on video consumption for trending videos.

data-analysis mssqlserver query sql

Last synced: 18 Mar 2026

https://github.com/xre22zax/roller-coaster

Explore award-winning wood and steel coasters from 2013-2018 Golden Ticket Awards & Captain Coaster, all powered by Python and interactive visualizations.

analytics data-analysis data-visualization pandas python python-lambda python3 visualization

Last synced: 20 Apr 2026

https://github.com/soypete/example-go-dataframes-parser

Example of https://godoc.org/github.com/kniren/gota/dataframe

data-analysis data-science datastructures golang-examples ml

Last synced: 12 Sep 2025

https://github.com/sarthakmishraa/bike_rental_predictor

Bike Sharing Dataset: This dataset contains the hourly and daily count of rental bikes between 2011 and 2012 in the Capital Bikeshare system, with the corresponding weather and seasonal information.

data-analysis machine-learning python xgboost

Last synced: 20 Apr 2026

https://github.com/wtbates99/pandas-monday

Python library that provides seamless integration between pandas DataFrames and Monday.com boards. Easily read Monday.com board data into pandas DataFrames with support for subitems, pagination, and column filtering. Built with the Monday.com GraphQL API.

api-wrapper data-analysis data-integration dataframe graphql monday pandas productivity-tools python

Last synced: 20 Apr 2026

https://github.com/abinashsahoo007/project-bankruptcy-prevention

The project creates a classification model that predicts the chances of a business facing bankruptcy based on key features such as Industrial Risk, Management Risk, Financial Flexibility, Credibility, Competitiveness, and Operating Risk.

data-analysis data-mining data-visualization deployments eda machine-learning pickle python statistics streamlit

Last synced: 20 Apr 2026

https://github.com/ak-pydev/python_practice

Documenting my learning journey from Python -> ML -> DL -> LLM/GenAI -> Agents, with exercises solved daily from Udemy/Kaggle/YouTube.

data-analysis data-science feature-engineering llms machine-learning mlflow mlops-workflow modeling python3 streamlit uvicorn

Last synced: 20 Apr 2026

https://github.com/inevolin/multivariate-data-analysis

Showcases of modern multivariate & multidimensional data analysis in industrial and high-tech settings.

analytics data-analysis data-science data-visualization javascript

Last synced: 09 Jun 2026

https://github.com/devanshsahu47/talentscape-glassdoor-analysis

TalentScape is an end-to-end Python project that cleans and analyzes a comprehensive Glassdoor Jobs dataset. It features robust data wrangling and 20 insightful visualizations to uncover trends in job titles, salary ranges, company ratings, and more—providing actionable recommendations to optimize recruitment and compensation strategies.

business-intelligence data-analysis data-vizualisation jupyter-notebook python3

Last synced: 15 May 2026

https://github.com/profasem/logistics-performance-analysis

Power BI dashboard analyzing logistics performance, delivery delays, carrier efficiency, and regional risk.

business-intelligence dashboard data-analysis logistics powerbi python supply-chain

Last synced: 21 Apr 2026

https://github.com/docuvesta/la-mer-skincare-chicago-duty-free-analysis

Comparing La Mer product selection, availability and pricing from 3 different purchase locations ✈️

analytics cremedelamer data-analysis data-analytics data-science data-visualization lamer luxury plotly python seaborn skincare

Last synced: 21 Apr 2026

https://github.com/nischay002/us-honey-production-analysis

Analysis of US honey production (1995–2021) using Python & data visualization. Identifies trends in honey yield, pricing, and colony distribution across states.

data-analysis data-visualization exploratory-data-analysis honey-production matplotlib pandas python seaborn us-agriculture

Last synced: 26 Feb 2025

https://github.com/nxion/sql-data-warehouse-project

Building a modern data warehouse with MS SQL Server, ETL processes, data modeling, and analytics.

data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server

Last synced: 05 Jun 2026

https://github.com/poglolopez/prueba_tecnica_inlaze

This repository showcases my data analysis skills through a technical test for Inlaze. It includes workflows with Python, SQLite, and Power BI to analyze player behavior, deposits, and traffic source performance, highlighting operational efficiency and strategic insights.

data-analysis data-v etl jupyter powerbi python sqlite

Last synced: 26 Feb 2025

https://github.com/shubhammittal-data/hr_dashboard_tableau

An interactive HR Analytics Dashboard built using Tableau. Provides insights into workforce demographics, hiring trends, salary analysis, and employee records for data-driven decision-making.

chatgpt4 data data-analysis data-visualization drawio-tools faker-generator hr-analytics hr-analytics-dashboard human-resources numpy python tableau tableau-public

Last synced: 17 May 2026

https://github.com/maddieemihle/home_sales

A PySpark-powered analysis of real estate trends using home sales data. This project explores average prices by year, room configuration, and property features, while demonstrating SparkSQL, caching, and partitioning techniques in a scalable data pipeline, all within Google Colab.

apache-spark caching data-analysis googlecolab parquet pyspark sparksql

Last synced: 21 Apr 2026

https://github.com/nikhilfuke1/a-b-testing-and-regression-analysis-python

Python Statistical Project involves data analysis, visualization, A/B testing, and regression analysis to determine the best-performing platform.

ab-testing data-analysis hypothesis-testing libraries python regression-analysis statistics visualization

Last synced: 21 Apr 2026

https://github.com/svetlanam/pt-data-analyse

Data analysis of Czech parcel tracking providers.

data-analysis matplotlib pandas parcel-tracking python3 visualisation

Last synced: 21 Aug 2025

https://github.com/luminati-io/walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 04 Jan 2026

https://github.com/danpoynor/data-analysis-spotify-songs-2010-2019

Spotify data analysis for songs between 2010 and 2019 using Jupyter Notebooks including pandas and Seaborn plots.

data-analysis jupyter-notebook matplotlib pandas-dataframe python3 seaborn-plots spotify

Last synced: 22 Apr 2026

https://github.com/tmmvn/analytics-notebooks

A collection of data analytics notebooks created while testing out JetBrains Datalore.

ai algorithms data-analysis datalore elements-of-ai helsinki-university-mooc python

Last synced: 22 Apr 2026

https://github.com/kgelli/apple-data-analysis---apache-spark

Modular ETL pipeline for analyzing Apple product purchase patterns using Apache Spark on Databricks with factory design patterns.

apache-spark data-analysis databricks delta-lake etl-pipeline factory-pattern pyspark

Last synced: 22 Apr 2026

https://github.com/prgermux/yield-reporter

This Python application provides a graphical user interface (GUI) for analyzing and visualizing production data from various machines. It uses the PyQt5 framework for the GUI and Matplotlib for plotting data.

automation data-analysis python reporting

Last synced: 22 Apr 2026

https://github.com/rajesh9943/sentiment-analysis-of-consumer-opinions-on-amazon-products

Developed a comprehensive Sentiment Analysis System aimed at classifying Amazon product reviews into positive, neutral, and negative sentiments. The project leveraged advanced Natural Language Processing (NLP) techniques alongside machine learning algorithms to deliver accurate and actionable insights from customer feedback

amazon data-analysis data-manipulation data-preprocessing data-presentation data-visualization machine-learning nlp nlp-library nltk product-reviews-analysis sentiment-analysis sklearn-library word-cloud-generator-in-python-3

Last synced: 05 Jun 2026

https://github.com/ragedunicorn/mantisx-notebook

A repository for Jupyter notebooks analysing mantisx data

data-analysis data-visualization mantis mantisx shooting training

Last synced: 24 Jul 2025

https://github.com/rorrell/lifeexpectancy

A Jupyter Notebook where I create a chart with two line plots on it to check out the life expectancy of men vs. women from 1900-2018

data-analysis data-visualization jupyter-notebook python3

Last synced: 22 Apr 2026

https://github.com/leabrodyheine/california-schools-data-visualization

This front-end project provides interactive visualizations of learning models adopted by California schools during the pandemic. Using D3.js and Mapbox, it dynamically presents data through bar charts, bubble charts, heatmaps, and geographic maps, allowing users to explore trends across school types, sizes, and districts.

d3-visualization d3js data-analysis data-visualization mapbox openai plotly

Last synced: 22 Apr 2026

https://github.com/ayushi-gajendra/restaurant-order-analysis-sql

End-to-end SQL analysis of 12,266 restaurant transactions to identify high-performing menu items, revenue concentration, bulk ordering behavior, and strategic growth opportunities.

analytics-portfolio business-intelligence case-study customer-segmentation data-analysis data-analytics database-analysis menu-engineering mysql revenue-analysis sql sql-project

Last synced: 05 Jun 2026

https://github.com/ayushi-gajendra/buenos-aires-subway-statistics

A comprehensive data analysis of the Buenos Aires subway system ridership using Python and Pandas. This project identifies peak-hour congestion patterns, explores hourly passenger distributions, and utilizes the 95th percentile to isolate extreme traffic conditions for urban mobility insights.

95th-percentile buenos-aires data-analysis data-science-portfolio data-visualization matplotlib pandas python statistical-analysis subway-ridership transit-data urban-mobility

Last synced: 05 Jun 2026
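
As an illustration of the 95th-percentile idea mentioned in the buenos-aires-subway-statistics entry above, here is a minimal sketch using only the Python standard library. It is not taken from that repository (which uses Pandas). The function names, the nearest-rank percentile definition, and the sample hourly counts are all assumptions made for this example.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method.

    Hypothetical helper: other percentile definitions (for example,
    linear interpolation) can give slightly different thresholds.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the smallest value that covers pct percent of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def extreme_hours(ridership_by_hour, pct=95):
    """Keep only the hours whose passenger count is at or above the threshold."""
    threshold = percentile_nearest_rank(list(ridership_by_hour.values()), pct)
    return {hour: count for hour, count in ridership_by_hour.items()
            if count >= threshold}


if __name__ == "__main__":
    # Made-up hourly passenger counts, for illustration only.
    sample = {hour: 100 * hour for hour in range(24)}
    print(extreme_hours(sample))
```

With real ridership data, the same filter would be applied to the observed hourly counts to single out the busiest hours.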

https://github.com/agdturner/ccg-data

A modularised Java library for processing data sets with classes for: data records; collections of data records; and identifiers.

data data-analysis

Last synced: 12 Jan 2026

https://github.com/maddieemihle/pandas-challenge

Python analysis to create and manipulate school and standardized test data. Scores are calculated, grouped, aggregated, summarized, and organized using pandas.

data-analysis pandas-python

Last synced: 09 Jun 2026

https://github.com/al-ogr/sf_pr1_job_analysis_hh

SkillFactory DataScience PROJECT-1. Analysis of résumés from HeadHunter.

data-analysis data-science ipynb plotly python

Last synced: 23 Apr 2026

https://github.com/syed-nihaal/car-price-prediction-and-performance-analysis

A data science notebook project focused on analyzing car features and building a model for car price prediction.

data data-analysis data-visualization jupyter-notebook python

Last synced: 23 Apr 2026

https://github.com/thc1006/nycu_timtable_crawler

🎓 NYCU Course Data Crawler & Timetable System | 國立陽明交通大學課程爬蟲與選課系統 - Python web scraper for course schedules, syllabi & educational data analysis. Crawls 18K+ courses with 98% success rate. Features: interactive timetable, JSON API, Google Colab support, batch processing, resume capability.

academic course course-selection crawler data-analysis education educational-data google-colab json-api nycu open-data python schedule student-tools syllabus taiwan timetable university web-automation web-scraping

Last synced: 24 Apr 2026

https://github.com/hyperplasma/olympic-visualization-analysis

Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.

data-analysis data-visualization matplotlib numpy pandas python wordcloud

Last synced: 04 May 2026

https://github.com/badranalyst/movie-correlation-analysis-in-python

This project analyzes movie data correlations using Python libraries like Pandas, NumPy, Seaborn, and Matplotlib. It examines relationships between attributes such as ratings, genres, and box office performance to uncover trends that inform recommendations and enhance understanding of movie success factors.

data data-analysis dataset jupyter jupyter-notebook matplotlib matplotlib-pyplot numpy pandas python seaborn

Last synced: 03 May 2026

https://github.com/arunabhagit/inventory-misalignment-and-revenue-loss-in-multi-store-bike-retail

This project focuses on identifying the inventory and demand mismatch causing stagnant sales and lost revenue in a bike retail chain. By analyzing store-level performance and regional customer preferences, the project aims to detect underperforming products.

data-analysis data-visualization powerbi python

Last synced: 24 Apr 2026

https://github.com/datalopes1/bank_marketing

This project is based on the Bank Marketing dataset found in the UC Irvine Machine Learning Repository and made available by S. Moro, R. Laureano, and P. Cortez.

data-analysis data-science data-visualization eda python

Last synced: 24 Apr 2026

https://github.com/bhavna-kale/cars-eda-project

Project analyzing used car market data to identify high-impact price drivers and depreciation curves, presented through an interactive web application.

data-analysis excel matplotlib numpy pandas python3 searborn streamlit

Last synced: 03 May 2026

https://github.com/yxuco/ethdecoder

This CLI decodes Ethereum transactions and events, stores results in CouchDB, and then exports customized views to CSV files for data visualization and analysis.

data-analysis decoding ethereum

Last synced: 24 Apr 2026

https://github.com/muthukumar0908/youtube-data-harvesting-and-warehousing-using-sql-mongodb-and-streamlit

Creates a simple and intuitive user interface using Streamlit. Data is fetched and extracted from YouTube using an API key and then stored in a database.

data-analysis mongodb-atlas python sqldatabase streamlit-webapp youtube-api

Last synced: 24 Apr 2026

https://github.com/manisharora96/data-analysis-of-smartwatch

The project is structured with sample data, step-by-step Jupyter notebooks, and modular Python scripts for automated analysis

data-analysis data-visualization jupyter-notebook python smartwatch-analysis

Last synced: 24 Apr 2026

https://github.com/obirikan/u.s.-county-commute-data-analysis

This project extracts and analyzes U.S. county-level commuting data from the 2020 American Community Survey (ACS 5-Year Estimates) via the U.S. Census Bureau API.

data-analysis

Last synced: 28 Jun 2025

https://github.com/cyberoctane29/python-for-data-analysis

A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.

data data-analysis data-analytics data-science python

Last synced: 24 Apr 2026

https://github.com/edwinrlambert/emomap-sentiment-analysis

To analyze public sentiment related to specific locations in a city (e.g., parks, transit stations, restaurants, neighborhoods) using geo-tagged social media posts, reviews, and comments. The goal is to visualize how people feel across different areas and times.

data-analysis jupyter-notebook python sentiment-analysis

Last synced: 24 Apr 2026

https://github.com/puspacempaka/superstore-analysis-with-sql

This repository showcases various data analyses on the popular Superstore dataset using SQL queries. The analyses cover a range of business insights, including sales performance, customer segmentation, and product profitability. Each analysis is documented with the SQL queries used and explanations of the steps involved.

business-intelligence data-analysis sales-analysis sql superstore-dataset

Last synced: 09 Mar 2026

https://github.com/dimamirana/finding-correlation-among-social-media-usage-depression-sleep

In our project, we tried to analyze whether there is a link between depression and social media usage time.

anaconda data-analysis jupiter-notebook matplotlib-pyplot patternlab python

Last synced: 03 May 2026

https://github.com/ahmedhosssam/lesser_pandas

Pandas-like Data Analysis library in C++

cpp data-analysis data-science pandas

Last synced: 03 May 2026

https://github.com/monteirooscar98/tarifas-publicas-sp-dieese

Data extraction via web scraping from the Dieese website, and analysis of public tariffs in the municipality of São Paulo.

data-analysis data-visualization python webscraping

Last synced: 03 May 2026

https://github.com/pedrohdosanjos/economic-data-analysis

This project aims to analyze the export data from various states in the United States to Brazil over time. The data is sourced from the FRED (Federal Reserve Economic Data) API and processed to identify the top 5 exporting states for each year, as well as the states with the highest total export value across all years.

api data-analysis data-visualization jupyter-notebook python

Last synced: 24 Apr 2026
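
The ranking step described in the economic-data-analysis entry above (top 5 exporting states per year, plus the highest totals across all years) can be sketched as follows. This is an assumption-laden illustration, not the repository's code. It skips the FRED API call entirely and assumes the data has already been reduced to hypothetical `(year, state, value)` tuples. The function names and the sample figures are invented for the example.

```python
from collections import defaultdict


def top_states_per_year(records, n=5):
    """Map each year to its n highest-value states, best first.

    records: iterable of (year, state, export_value) tuples (assumed shape).
    """
    by_year = defaultdict(list)
    for year, state, value in records:
        by_year[year].append((state, value))
    return {
        year: [state for state, _ in
               sorted(rows, key=lambda row: row[1], reverse=True)[:n]]
        for year, rows in sorted(by_year.items())
    }


def total_by_state(records):
    """Return (state, total_value) pairs sorted from highest to lowest total."""
    totals = defaultdict(float)
    for _, state, value in records:
        totals[state] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    records = [
        (2020, "TX", 50.0), (2020, "CA", 30.0), (2020, "FL", 40.0),
        (2021, "CA", 70.0), (2021, "TX", 60.0),
    ]
    print(top_states_per_year(records, n=2))
    print(total_by_state(records))
```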

https://github.com/fatihilhan42/tourist_analysis_in_turkey_with_python

In this project, the number of tourists visiting Turkey between 2008 and 2021 was analyzed. The data set, which you can find in the repository, was first organized using data cleaning techniques. The cleaned data was then presented graphically using data visualization techniques.

data-analysis data-cleaning data-science data-visualization jupyter-notebook python

Last synced: 03 May 2026

https://github.com/ismielabir/pycsvsummarizer

A lightweight tool to summarize CSV files using various features.

csv data-analysis data-summary python

Last synced: 25 Apr 2026

https://github.com/mariann95/sql_data_warehouse_and_analytics_project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics. This repository also contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis.

data-analysis data-analytics data-cleaning data-engineering data-lakehouse data-science data-science-portfolio data-warehouse data-warehousing datalake datawarehouse datawarehousing etl etl-job etl-pipeline medallion-architecture sql sql-query sql-server sqlserver

Last synced: 06 Jun 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Gallstone Disease (Gallstone-1) Dataset Analysis (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/fbarffmann/belly-button-challenge

Built an interactive JavaScript dashboard to visualize bacterial biodiversity from belly button samples. Analyzed data from 153 participants and identified OTU 1167 as the most common bacteria.

biodiversity dashboard data-analysis data-visualization interactive-charts javascript json plotly

Last synced: 25 Apr 2026

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017 on the topic of "Product Title Quality". The goal is to take SKUs and rank their titles' clarity and conciseness. Referenced papers are attached to this repository. The aim is to craft ensemble models that either try to replicate published results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026
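
The kind of analysis described in the hacker-news-engagement entry above (which post types and posting hours draw the most discussion) might look roughly like the sketch below. It is a hedged illustration, not the notebook's code. Classifying posts by "Ask HN" / "Show HN" title prefixes, the `(title, created_at, num_comments)` tuple shape, and the `%Y-%m-%d %H:%M` timestamp format are all assumptions, as are the function names and sample posts.

```python
from collections import defaultdict
from datetime import datetime


def post_type(title):
    """Classify a post by its title prefix (assumed convention)."""
    lowered = title.lower()
    if lowered.startswith("ask hn"):
        return "ask"
    if lowered.startswith("show hn"):
        return "show"
    return "other"


def avg_comments_by_hour(posts, wanted_type):
    """Average comment count per posting hour for one post type.

    posts: iterable of (title, created_at, num_comments) tuples, where
    created_at is a string such as "2016-08-16 09:55" (assumed format).
    """
    totals = defaultdict(lambda: [0, 0])  # hour -> [comment_sum, post_count]
    for title, created_at, num_comments in posts:
        if post_type(title) != wanted_type:
            continue
        hour = datetime.strptime(created_at, "%Y-%m-%d %H:%M").hour
        totals[hour][0] += num_comments
        totals[hour][1] += 1
    return {hour: comment_sum / count
            for hour, (comment_sum, count) in sorted(totals.items())}


if __name__ == "__main__":
    # Made-up posts, for illustration only.
    posts = [
        ("Ask HN: How do you learn?", "2016-08-16 09:55", 6),
        ("Ask HN: Best books?", "2016-08-16 09:10", 4),
        ("Show HN: My tool", "2016-08-16 15:00", 3),
    ]
    print(avg_comments_by_hour(posts, "ask"))
```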

https://github.com/myriamba/neuraview

AI-Powered Data Insights and Visualization Generator

data-analysis data-engineering data-insights data-visualization generative-ai llm

Last synced: 21 Aug 2025

https://github.com/ddihora1604/iit_patna

A multifaceted project involving applying ML models like Ridge Classifier, RNN, RIDOR, Rotation Forest and RUSBoost, integrating SMOTE for class balancing, and handling diverse datasets including those for seating arrangement tasks.

data-analysis data-visualization datamodelling machine-learning-algorithms python

Last synced: 25 Apr 2026

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 25 Apr 2026

https://github.com/karlyndiary/adidas-sales-analysis

Analyzed Adidas' product sales performance, top retailers, monthly trends, yearly growth, regional distribution, and pricing insights. Performed ETL from Python (Pandas) to SQL Server, extracted data with SQL, and visualized key insights in Excel.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 10 Feb 2026

https://github.com/aastopher/mma_outcome

Simple exploratory analysis of UFC Fights and Vegas fight odds from 1993 to 2021

data-analysis data-visualization

Last synced: 06 Jun 2026

https://github.com/marielachirinosr/hotel-data-analysis

Pandas & Matplotlib Learning Analysis. Repository featuring data analysis projects using Pandas and Matplotlib libraries

data data-analysis matplotlib pandas python

Last synced: 25 Apr 2026

https://github.com/devexpress-examples/winforms-create-a-custom-exporter-for-pivotgridcontrol-with-xtrareport

This example illustrates how to dynamically create a custom report based on PivotGridControl content in WinForms.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms

Last synced: 26 Apr 2026

https://github.com/devexpress-examples/wpf-pivotgrid-customize-the-cell-template

This example demonstrates how to customize the cell appearance in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 26 Apr 2026

https://github.com/zients/tw-lottery-recommandation

Taiwan lottery draw analyzer & number recommender with Transformer ML model. Supports 539, 649, 638, 3D, and 4D lotteries.

cli data-analysis lottery machine-learning python pytorch taiwan transformer

Last synced: 03 May 2026

https://github.com/halyusa16/e-commerce-analysis

This project analyzes a public e-commerce dataset to uncover valuable insights and answer critical business questions. The dataset contains customer, product, order, and transaction details, providing a comprehensive view of the e-commerce platform's operations.

data-analysis data-cleaning data-exploration data-visualization self-project

Last synced: 09 Jun 2026

https://github.com/rociobenitez/happiness-index-data-processing

Repository for Big Data Processing - Contains Jupyter Notebooks and Datasets for data analysis and processing tasks related to Big Data.

big-data big-data-processing data-analysis data-processing happiness-index happiness-report jupyter-notebook matplotlib pandas seaborn

Last synced: 15 May 2026

https://github.com/roggersanguzu/weather-medical-expense-prediction-ml-models

This repo contains one model for determining rainfall patterns and another for predicting medical expenses.

data data-analysis data-science datasets joblib machine-learning machine-learning-algorithms scikitlearn-machine-learning

Last synced: 30 Aug 2025

https://github.com/rohitinu6/tesla-price-prediction

A machine learning project that predicts future stock price movements using Logistic Regression, SVC, and XGBoost with engineered financial features.

data-analysis data-visualization feature-engineering financial-analysis logistic-regression machine-learning matplotlib python scikit-learn seaborn stock-market stock-price-prediction support-vector-machine time-series xgboost

Last synced: 03 May 2026

https://github.com/moshora99/sql-data-warehouse-project

Build a modern data warehouse with MySQL, including ETL processes, data modeling, and analytics.

data-analysis data-engineering data-science database datawarehouse datawarehousing etl scheme sql sql-query sql-server

Last synced: 27 Apr 2026

https://github.com/pararang/nams-thesis-fuzzy

A specialized data processing tool designed to help with Fuzzy Delphi Method calculations for thesis research data analysis. It was later extended with new features for processing data with other methods.

data-analysis dematel hacktoberfest hacktoberfest-accepted house-of-quality python sustainability vibecoding

Last synced: 27 Apr 2026

https://github.com/arush-codes/paris-olympic-de

Data engineering project on the Paris 2024 Olympics.

azure data-analysis data-engineering microsoft-azure olympics2024 pipeline

Last synced: 27 Apr 2026

https://github.com/ahnaf19/rokomari_price_analysis

This was a hiring assignment given by rokomari.com. The data was small and evidently generated for testing purposes. I tried to describe myself while diving as deep as possible.

data-analysis data-cleaning data-visualization etl

Last synced: 30 Aug 2025

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 27 Apr 2026

https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido

Technical analysis of confirmed COVID-19 cases in Peru to estimate the medical oxygen required.

covid-19 data-analysis data-science peru python

Last synced: 27 Apr 2026

https://github.com/sohamb21/analysis-of-superstore-dataset

I completed the IBM SkillsBuild Data Analytics Internship Program to develop my Data Analytics skills and apply them to a real-world problem by working on this project.

data-analysis python

Last synced: 27 Apr 2026