An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/svetlanam/pt-data-analyse

Data analysis of Czech parcel tracking providers

data-analysis matplotlib pandas parcel-tracking python3 visualisation

Last synced: 21 Aug 2025

https://github.com/beyzabasarir/northwind-traders-analysis

Northwind dataset analysis using PostgreSQL, Python, and Power BI. Focused on sales, customers, shipping, and performance insights.

dashboard data-analysis data-visualization jupyter-notebook matplotlib numpy pandas postgresql powerbi python seaborn

Last synced: 10 Apr 2026

https://github.com/aidan-zamfir/advt-analysis

Web scraping project. Will eventually use character/episode data for NLP and networking/data analysis.

data-analysis nlp python selen webscraping

Last synced: 23 Aug 2025

https://github.com/prince-pastakiya/human-resources-tableau-project

👥 Interactive Tableau dashboard for HR analytics — includes workforce overview, demographics, income analysis, and detailed employee records with full filtering.

chatgpt data-analysis data-visualization human-resources numpy python python-faker tableau-dashboards tableau-public

Last synced: 18 Apr 2026

https://github.com/shridhar1504/sql-projects

This repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and deriving insights through SQL queries.

data-analysis data-mining data-science data-transformation eda etl-framework microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 09 Mar 2026

https://github.com/pedramjlo/uae_cars_analysis

Analysis of the UAE second-hand car data

data-analysis jupyter-notebook pandas python sql sqlite3

Last synced: 11 May 2026

https://github.com/0xnu/data-analyst-training

The repository contains training materials for data analysts.

data data-analysis data-analyst

Last synced: 25 Aug 2025

https://github.com/debjyotisaha/tableau-projects-phase-2

Published interactive dashboards on Tableau Public, highlighting expertise in data visualization and storytelling through analyses of transportation patterns, sales trends, and demographic studies. These projects showcase the ability to transform complex datasets into actionable, intuitive visuals for decision-making.

dashboards data data-analysis data-visualisation tableau

Last synced: 26 Aug 2025

https://github.com/putuwaw/dashboard-ecommerce

Dashboard for E-Commerce Public Dataset using Streamlit and Plotly

dashboard data-analysis dicoding plotly streamlit

Last synced: 20 Feb 2026

https://github.com/sarathchandranpm/walmart-sales-analysis

Analysis of Walmart Myanmar's Q1 2019 sales data covering customer behavior, product performance, general operations, and sales patterns.

data-analysis mysql sql

Last synced: 29 Aug 2025

https://github.com/lauratrigo/fft_matlab

📡 Fourier Analysis for Ionospheric Data is a MATLAB script that applies the FFT to generate one-sided and two-sided spectra of ionospheric parameters (hF, f0F2, hmF2), identifying periodicities and comparing spectral signatures at 15-minute resolution. It is useful for studying ionospheric variations and disturbances.

data-analysis fast-fourier-transform fft fourier ionosphere matlab scientific scientific-initiation

Last synced: 29 Aug 2025

https://github.com/roggersanguzu/weather-medical-expense-prediction-ml-models

This repo contains one model for determining rainfall patterns and another for predicting medical expenses.

data data-analysis data-science datasets joblib machine-learning machine-learning-algorithms scikitlearn-machine-learning

Last synced: 30 Aug 2025

https://github.com/karlyndiary/adidas-sales-analysis

Analyzed Adidas' product sales performance, top retailers, monthly trends, yearly growth, regional distribution, and pricing insights. Performed ETL from Python (Pandas) to SQL Server, extracted data with SQL, and visualized key insights in Excel.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 10 Feb 2026

https://github.com/obirikan/u.s.-county-commute-data-analysis

This project extracts and analyzes U.S. county-level commuting data from the 2020 American Community Survey (ACS 5-Year Estimates) via the U.S. Census Bureau API.

data-analysis

Last synced: 28 Jun 2025

https://github.com/luminati-io/walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 04 Jan 2026

https://github.com/shubhammittal-data/hr_dashboard_tableau

An interactive HR Analytics Dashboard built using Tableau. Provides insights into workforce demographics, hiring trends, salary analysis, and employee records for data-driven decision-making.

chatgpt4 data data-analysis data-visualization drawio-tools faker-generator hr-analytics hr-analytics-dashboard human-resources numpy python tableau tableau-public

Last synced: 17 May 2026

https://github.com/nischay002/us-honey-production-analysis

Analysis of US honey production (1995–2021) using Python & data visualization. Identifies trends in honey yield, pricing, and colony distribution across states.

data-analysis data-visualization exploratory-data-analysis honey-production matplotlib pandas python seaborn us-agriculture

Last synced: 26 Feb 2025

https://github.com/singhs05/global-youtube-trends

Understand the impact of likes, comments, and dislikes on video consumption for trending videos.

data-analysis mssqlserver query sql

Last synced: 18 Mar 2026

https://github.com/luminati-io/target-dataset-samples

A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 04 Jan 2026

https://github.com/mehrab-kalantari/olympics-data-analysis

A Streamlit application to analyze the Olympics dataset from several views

data-analysis streamlit-dashboard streamlit-webapp

Last synced: 20 Apr 2026

https://github.com/leandrocollares/home-team-advantage-in-epl

Home team advantage in the English Premier League: an exploratory data analysis

data-analysis matplotlib pandas plotly

Last synced: 11 Jun 2026

https://github.com/mysftz/statistical-analysis

An in-depth review of statistical analysis in Python from datasets.

data-analysis python python3 statistics university university-project

Last synced: 14 May 2025

https://github.com/als8446/tripleten-data-science-projects

Projects made in the Data Scientist course from TripleTen LatAm

data data-analysis hypothesis-tests machine matplotlib numpy pandas python scipy sklearn

Last synced: 10 Apr 2026

https://github.com/iness000/online-retail-customer-segmentation

This project performs comprehensive customer segmentation analysis on an online retail dataset using machine learning clustering techniques and RFM (Recency, Frequency, Monetary) analysis. The goal is to identify distinct customer segments to drive better customer relationship management strategies and business insights.

customer-segmentation data-analysis k-means

Last synced: 31 Aug 2025

https://github.com/rdrahul123/ecommerce-sales-dashboard

This project focuses on analyzing e-commerce sales data to uncover actionable insights and improve business decision-making. Using interactive dashboards and data analysis techniques, the project evaluates key performance metrics, customer behavior, sales trends, and payment modes across different categories and regions.

data-analysis data-science excel powerbi

Last synced: 22 Mar 2025

https://github.com/jayqi/data-analysis-tools

Presentation on Data Analysis Tools

data-analysis presentation-slides

Last synced: 06 Jan 2026

https://github.com/pranjalya/hand-washing-data-visualisation

A small data visualization project in which we analyze the effect of hand washing, introduced by Dr. Semmelweis to nurses and midwives, on women after giving birth.

data-analysis data-visualization jupyter-notebook pandas python3

Last synced: 06 May 2026

https://github.com/amoneva/cacc

An R Package to compute Conjunctive Analysis of Case Configurations (CACC), Situational Clustering Tests, and Main Effects

criminology data-analysis r social-science

Last synced: 15 May 2025

https://github.com/virajbhutada/diamond-price-estimator

This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.

cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface

Last synced: 14 Apr 2026

https://github.com/lanzafame/polycarp

[WIP] Subset operations on latlon data read from CSVs

data-analysis geospatial wip

Last synced: 12 Jan 2026

https://github.com/anuppm9917/super-store-sales-analysis-power-bi-project

Driven to learn which products, regions, categories, and customer segments a company should target or avoid, I searched for and selected a Kaggle dataset that matches a standard superstore's requirements.

data data-analysis data-visualization datacleansing excel exploratory-data-analysis jupyter-notebook numpy pandas plotly powerbi python3

Last synced: 10 Apr 2026

https://github.com/hi-jin2/data-analysis-basics

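A collection of source code written during the Data Analysis Basics (R) class. I learned the R language using the textbook 『모두를 위한 R 데이터 분석 입문』 (Introduction to R Data Analysis for Everyone).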

data-analysis r r-studio

Last synced: 19 Jul 2025

https://github.com/moenessgannouni/englandweather

A mini-project that analyzes weather data in England using Linear Regression and Multiple Linear Regression. Ideal for learning and applying statistical analysis and predictive modeling.

data-analysis data-visualization linear-regression multiple-linear-regression rprogramming

Last synced: 22 Mar 2025

https://github.com/akmj1011/hill-and-valley-prediction-using-logistic-regression

Created a prediction system using logistic regression to identify hills and valleys in the given datasets

cloud-computing data-analysis data-manipulation data-preprocessing data-transformation data-visualization google-colab

Last synced: 13 May 2026

https://github.com/haideratgh/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-query sql-server window-functions-in-sql

Last synced: 29 Jun 2025

https://github.com/satyam4229/omnify-dataanalysis

Our assessment of Omnify focused on data-driven strategies to maximize profitability. We identified "Product X" as the most profitable product and recommended leveraging the "Wellness Solutions" keyword category for optimal keyword strategy.

data-analysis data-science data-visualization excel omnify

Last synced: 04 Jan 2026

https://github.com/serlo/data-pipeline-interactive-exercises

processing pipeline for exercise dashboards

data-analysis serlo

Last synced: 26 Feb 2025

https://github.com/okwilkins/retailanalysis

A comprehensive exploratory analysis and implementation of k-means/hierarchical clustering on online retail data.

data-analysis data-science machine-learning statistics

Last synced: 18 Oct 2025

https://github.com/badranalyst/titanic-survival-prediction-full-data-science-project-classification

This project predicts Titanic survivors using classification models. It includes data cleaning, pre-processing, exploratory data analysis (EDA), categorical feature conversion, model building, and evaluation. Python libraries like Pandas, NumPy, Matplotlib, and Seaborn are used to analyze and predict survival outcomes.

classification data-analysis data-science eda exploratory-data-analysis machine-learning matplo matplotlib-pyplot ml model numpy pandas predictive-modeling python seaborn

Last synced: 06 May 2026

https://github.com/atanikan/data-mining-projects

Data Mining Homework

data-analysis iub

Last synced: 14 Mar 2025

https://github.com/azaz9026/data_cleaning

Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.

data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/sanchittechnogeek/overscripted-analysis

Geolocation and user language extraction analysis from Mozilla Overscripted dataset

analysis data data-analysis mozilla

Last synced: 23 Mar 2025

https://github.com/bibymaths/python_snippets

A collection of Python scripts for bioinformatics data analysis, including tools for transcription counts, nucleotide composition, and protein sequence evaluation.

amino-acid-scoring bioinformatics data-analysis fasta-generation mathematical-evaluation nucleotide-analysis protein-sequence-analysis transcription-counts

Last synced: 29 Jul 2025

https://github.com/nevermendel/revolut-analysis

Python script to analyse Revolut transactions

data-analysis revolut revolut-analysis

Last synced: 12 Apr 2025

https://github.com/ssoehdata/sql_for_data_science_specialization_course

Materials and certifications from the SQL for Data Science course

data-analysis data-science database databricks postgresql sql sqlite

Last synced: 10 Apr 2026

https://github.com/lopez86/rust-mlearn

Machine Learning Tools in Rust

data-analysis data-science machine-learning rust

Last synced: 15 May 2025

https://github.com/farhad-here/data-visualization-analysis-dva

This is my data analysis project. Users can use it to clean and preprocess data or to visualize it. They can also impute or encode their dataset.

altair bokeh data-analysis data-analysis-python io matplotlib numpy pandas plotly python sklearn streamlit

Last synced: 11 Apr 2026

https://github.com/antoniszks/music-category-identifier

A 'Data-Science & Machine Learning' project where we are training a neural network to identify what kind of music we give to it. Based on a university project.

ai artificial-intelligence data-analysis data-science jupyter-notebook machine-learning ml notebook python

Last synced: 25 Feb 2025

https://github.com/motapinto/agent-based-simulation-conquest

Agent-based simulation modeling of the Battlefield Conquest game mode

agent-based-simulation data-analysis jade java sajas swing

Last synced: 24 Jan 2026

https://github.com/shubham200137/cyclistic-case-study

This repository contains a case study for Google's Data Analytics Professional Certificate, focusing on Cyclistic, a fictional bike sharing company in Chicago. The case study aims to drive growth by converting casual riders into members through a marketing strategy.

data-analysis data-visualization numpy-python pandas-python presentation-slides sql tableau

Last synced: 11 Jun 2026

https://github.com/grlyntng/rpims

Django Code and documentation for the Retail Pharmacy Inventory Management System (best final year project award)

data-analysis django erp forecasting-models lstm-neural-networks reporting

Last synced: 26 May 2026

https://github.com/jayita11/healthcare-management-optimization-analysis-and-visualization

This project analyzes healthcare data from 2019 to May 2024, optimizing patient care, resource allocation, and financial management. Insights include billing trends, blood bank management, doctor performance, and medication demand, supported by Excel, interactive Tableau dashboards, and SQL analysis.

data-analysis excel healthcare interactive-dashboards mysql sql tableau-dashboards

Last synced: 23 Mar 2025

https://github.com/shimaa83/eda_v2

Automatic EDA library

data-analysis data-science python

Last synced: 20 Apr 2026

https://github.com/tashi-2004/apache-hadoop-spark-hive-cyberanalytics

This project utilizes Apache Hadoop, Hive, and PySpark to process and analyze the UNSW-NB15 dataset, enabling advanced query analysis, machine learning modeling, and visualization. The project demonstrates efficient data ingestion, processing, and predictive analytics for network security insights.

ai apache-hadoop apache-hive big-data-analytics big-data-processing data-analysis data-engineering data-science data-security data-visualization hdfs machine-learning network-analysis network-security pyspark python3 threat-detection unsw-nb15-dataset

Last synced: 02 May 2026

https://github.com/ttwag/p9_pandas

Problems that Introduce the DataFrame Object in Python's Pandas Library

data-analysis pandas-dataframe python

Last synced: 10 Jun 2025

https://github.com/tenifayo/analysis-of-fordgobike-trip-data

Data Visualization using Ford GoBike Trip Data

data-analysis matplotlib pandas

Last synced: 11 Jul 2025

https://github.com/jasontan22/aefes-time-series-forecasting

This project uses deep learning methods (LSTM, BiLSTM, CNN+LSTM) to forecast future closing prices from market data for Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES). It details the data preprocessing, model training, and evaluation steps.

bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow

Last synced: 09 Aug 2025

https://github.com/dug22/jjournal

Jupyter-like notebook software for Java

data data-analysis data-science java jshell jshell-repl notebook swing swing-application

Last synced: 11 Apr 2026

https://github.com/bala-1409/power-bi-visualization-project

This repository contains visualization projects built with Power BI. The visualizations yield insights and strategies that help a business grow its profit margins and reduce damage from accidents and calamities.

dashboard data-analysis data-science data-visualization exploratory-data-analysis microsoft-excel microsoft-power-bi microsoft-powerpoint power-bi powerbi powerbi-reports powerbi-visuals visualization

Last synced: 04 Jan 2026

https://github.com/neha-adnani/sql_music-store-analysis

SQL-based data analysis of a digital music store's sales and customer data.

business-analysis data data-analysis database follow-along-projects pgadmin4 portfolio-project postgres queries sql

Last synced: 18 Jun 2025

https://github.com/abhay-sinha-0/carpricepredictionproject

A machine learning project that predicts the selling price of a car based on its features such as year, mileage, fuel type, transmission, and more. This model can assist individuals and dealerships in estimating fair market prices for used cars.

artificial-intelligence data-analysis data-science data-visualization exploratory-data-analysis machine-learning-algorithms matplotlib-pyplot mysql-database numpy-library pandas-library python skit-learn sklearn-library

Last synced: 15 May 2025

https://github.com/emcramer/clockplot

Plotting utility for a "clockplot" that puts groups into a time-ordered heterogeneity visualization

biology data-analysis data-visualization heterogeneity pseudotemporal-ordering

Last synced: 10 Mar 2026

https://github.com/regmibijay/opencarp-analyzer

Reads trace files created by OpenCARP models and exports data for easy plotting with the built-in plotter script.

bioinformatics data-analysis opencarp

Last synced: 16 Jan 2026

https://github.com/27ahmad/netflix_sql_project

The Netflix SQL Project analyzes the Netflix dataset using SQL queries to gain insights into its content, identify trends, and address business problems related to movies and TV shows.

data-analysis postgresql-database sql

Last synced: 03 Feb 2026

https://github.com/27ahmad/ibm-data-science-capstone

The Capstone is the final course in the IBM Data Science Professional Certificate program. It's a project that combines all the skills and knowledge you've gained throughout the specialization.

data-analysis data-science folium-maps machine-learning plotly-dash python sql

Last synced: 26 May 2026

https://github.com/zulfachafidz/titanic_explorer_predicting_survival_with_classification_using_knn_algorithm

Tracking Life Safety with the KNN Predictive Analysis Approach. Leveraging the Titanic Dataset, we apply classification analysis to predict the fate of passengers based on a variety of features.

algorithm algorithms data data-analysis data-mining data-science datamodeling datapreprocessing dataset knn-algorithm knn-classification machine-learning machine-learning-algorithms prediction-model

Last synced: 01 Sep 2025

https://github.com/malucor/livros

Python program that performs a data analysis of books, based on an Excel file.

analise-de-dados book books bookshelf data-analysis ipynb jupyter-notebook livro livros python

Last synced: 16 May 2026

https://github.com/gaaniruddha/mphil

This repository contains a copy of my final MPhil presentation and panel report.

data-analysis gpu-imager radio-astronomy

Last synced: 03 Mar 2026

https://github.com/jatin-s16/hr_mysql_powerbi

This repository contains raw HR data along with key business questions. I performed data cleaning using MySQL queries and wrote analytical queries to extract meaningful insights. The results were then visualised using Power BI to enhance business understanding.

data-analysis data-science data-visualization mysql powerbi

Last synced: 29 May 2026

https://github.com/jedrzej-wydra/data-analysis-associate

Associate Data Analyst Exam by DataCamp

data-analysis datacamp r

Last synced: 23 Mar 2025

https://github.com/chanmeng666/douban-review-scraper

【One star = One happy developer doing a little dance 💃⭐️】A robust Python scraper for collecting and analyzing movie reviews from Douban.com, featuring comprehensive data processing and analysis capabilities.

beautifulsoup4 data-analysis data-processing douban movie-reviews pandas python sentiment-analysis text-mining web-scraping

Last synced: 02 May 2026

https://github.com/apfirebolt/numpy-and-pandas-examples

Some examples and sample datasets to learn numpy, pandas and other data science libraries in Python

data-analysis jupyter-notebook numpy pandas python

Last synced: 17 Apr 2026

https://github.com/bhaskaracharjee/student-results-analysis

Analyzing student results to uncover insights

data-analysis student-results

Last synced: 16 May 2025

https://github.com/first-coding/smart_analysis

Smart Analysis is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow.

data-analysis llm openai prompt-engineering python

Last synced: 08 Mar 2025

https://github.com/sun-lab-nbb/sl-shared-assets

A Python library that stores assets shared between multiple Sun (NeuroAI) lab data acquisition and processing repositories.

data-analysis data-collection data-processing experiment sunlab

Last synced: 10 Mar 2026

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, providing insights and recommendations to grow its user base and improve its services.

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 19 May 2026

https://github.com/navp7/roadaccident_powerbi

An interactive Power BI dashboard designed to analyze road accident data

dashboards data-analysis data-visualization powerbi

Last synced: 19 Mar 2026

https://github.com/totonga/ods-exd-api-box

Helper package to build ASAM ODS EXD API gRPC plugins.

asam data-analysis grpc grpc-server ods plugin python

Last synced: 03 Feb 2026