An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/nischay002/us-honey-production-analysis

Analysis of US honey production (1995–2021) using Python & data visualization. Identifies trends in honey yield, pricing, and colony distribution across states.

data-analysis data-visualization exploratory-data-analysis honey-production matplotlib pandas python seaborn us-agriculture

Last synced: 26 Feb 2025

https://github.com/sadratehranian/data-collection-and-machine-learning

Creates a logistic regression model to predict whether a smoke detector's fire alarm should sound. Second, it predicts whether an electric drive in a production plant may be faulty.

data data-analysis data-science datacollection logistic-regression machine-learning ml nn

Last synced: 05 Jan 2026

https://github.com/saob007/tablero_subsidios_servicio_agua

A dashboard for analyzing the distribution and allocation of drinking water and sewerage subsidies granted by the Planning Secretariat of the Sincelejo Mayor's Office in 2020, with the aim of identifying patterns in coverage, consumption, billing, and subsidies to support public policy decision-making.

dashboard data-analysis data-visualization looker-studio

Last synced: 31 Jan 2026

https://github.com/soypete/example-go-dataframes-parser

example of https://godoc.org/github.com/kniren/gota/dataframe

data-analysis data-science datastructures golang-examples ml

Last synced: 12 Sep 2025

https://github.com/kathisnehith/austin-crime-report-analysis

Data analysis and visualization of crime trends in Austin

crime-reporting data-analysis data-visual database reporting sql tableau

Last synced: 25 Feb 2026

https://github.com/jofaval/boston-housing

Regression analysis of in-demand housing prices in Boston in 1978

boston-housing data-analysis data-science data-visualization machine-learning python regression

Last synced: 16 May 2026

https://github.com/leandrocollares/home-team-advantage-in-epl

Home team advantage in the English Premier League: an exploratory data analysis

data-analysis matplotlib pandas plotly

Last synced: 11 Jun 2026

https://github.com/nmelgar/birthday_sports_dataviz

We will analyze how the Matthew effect has influenced professional sports players.

analysis csv data data-analysis data-science data-visualization datavisualization dataviz probability research tableau

Last synced: 08 Jan 2026

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Glyconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 23 Jun 2025

https://github.com/robinmillford/cardiac-care-performance-dashboard

This project presents a comprehensive data analysis and interactive dashboard focused on Cardiac Surgery and Percutaneous Coronary Interventions (PCI) performance by hospital, spanning from 2008 onwards.

cardiac data-analysis data-visualization plotly-express streamlit-dashboard tableau tableau-public

Last synced: 07 Sep 2025

https://github.com/silvermete0r/sdu_hackathon_uss_db_analysis

Smart Data Ukimet Hackathon - "Data Modeling" case Solution - Topic: Store Analysis based on Unified Star Schema

data-analysis data-modeling postgresql python sql unified-star-schema

Last synced: 14 Apr 2026

https://github.com/rociobenitez/airbnb-data-mining

Detailed analysis and predictive modeling of accommodations in Madrid using Big Data techniques and statistics in R, focused on data optimization and prediction of property characteristics.

airbnb data-analysis data-mining estadistica prediction-model predictive-analytics predictive-modeling qmd r rstudio

Last synced: 23 Jun 2025

https://github.com/jayqi/data-analysis-tools

Presentation on Data Analysis Tools

data-analysis presentation-slides

Last synced: 06 Jan 2026

https://github.com/pranjalya/hand-washing-data-visualisation

A small data visualization project that analyzes the effect of hand washing after Dr. Semmelweis introduced it to the nurses and midwives attending births.

data-analysis data-visualization jupyter-notebook pandas python3

Last synced: 06 May 2026

https://github.com/jpgiant/training_project

Analyzing whether there is a difference between the average age at death of left-handers and right-handers, using Bayes' theorem on conditional probability.

bayesian-statistics data-analysis data-visualization numpy pandas-dataframe python

Last synced: 30 Apr 2026

https://github.com/zimmi48/nixpkgs-issues

Analysis of nixpkgs issue lifetimes.

data-analysis github-api nixpkgs

Last synced: 10 May 2026

https://github.com/auliannee/customer-analysis-with-tableau

This repository contains the data source and the Tableau workbook.

data-analysis data-visualization tableau

Last synced: 12 Mar 2026

https://github.com/virajbhutada/diamond-price-estimator

This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.

cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface

Last synced: 14 Apr 2026

https://github.com/as16082023/motor-vehicle-thefts

Using SQL to analyze vehicle theft patterns across New Zealand, focusing on trends related to specific times and locations.

data-analysis mysql sql

Last synced: 10 Apr 2025

https://github.com/satvikpraveen/pandasplayground

📊 A comprehensive pandas mastery project with 10 modular Jupyter notebooks covering data loading, cleaning, grouping, merging, time series, visualization, and performance profiling. Includes real-world workflows, Docker, Streamlit, and reusable utils. Ideal for data scientists and analysts to learn, practice, and refer. Practice-ready and modular.

analytics cheatsheet data-analysis data-cleaning data-pipeline data-science data-visualization docker etl exploratory-data-analysis jupyter-notebook jupyterlab learning-resource memory-profiling open-source pandas performance-tuning python streamlit time-series

Last synced: 10 Apr 2026

https://github.com/hi-jin2/data-analysis-basics

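A collection of source code written during the Data Analysis Basics (R) course. The R language was studied using the textbook 『모두를 위한 R 데이터 분석 입문』 (Introduction to R Data Analysis for Everyone).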

data-analysis r r-studio

Last synced: 19 Jul 2025

https://github.com/abidshafee/google.colaboratory_projects

This repository contains the collections of interactive python notebooks (ipynb) that are some of my projects on Data Science, Machine Learning (ML), and Natural Language Processing (NLP).

colaboratory data-analysis data-science lstm machine-learning nlp statistics time-series

Last synced: 09 Jul 2025

https://github.com/kailenroa/sleep-efficiency-project

This project focuses on analyzing sleep efficiency using wearable technology data. It explores patterns in sleep behavior and key factors impacting sleep quality. A dashboard was created using Python and data visualization tools to provide actionable insights and recommendations for improving sleep health.

dashboard data-analysis html phyton sleep-efficiency

Last synced: 06 Jan 2026

https://github.com/moenessgannouni/englandweather

A mini-project that analyzes weather data in England using linear regression and multiple linear regression. Ideal for learning and applying statistical analysis and predictive modeling.

data-analysis data-visualization linear-regression multiple-linear-regression rprogramming

Last synced: 22 Mar 2025

https://github.com/hevalhazalkurt/word_analyser

A web app developed in Python and Django that analyzes given text mathematically and sentimentally.

analyzer analyzes content data-analysis django emotion python python3 sentiment sentiment-analyser sentiment-analysis text text-analysis

Last synced: 19 May 2026

https://github.com/idb-devs/dataanalyticsairbnb

Build a price prediction model that lets an ordinary property owner know how much to charge per night for their property.

data-analysis data-science jupyter python

Last synced: 18 Apr 2026

https://github.com/juanmerino89/data-job-market-analysis-project

Complete analysis of the job market using open data, scraping, and visualizations. Project explained step by step on my YouTube channel.

career-insights data-analysis data-science job-data job-market jupyter-notebook machine-learning market-trends open-data portfolio-project python salary-analysis visualization web-scraping youtube-project

Last synced: 18 May 2026

https://github.com/farhad-here/median-performance-comparison

Benchmarking the performance of median calculation using vanilla Python vs NumPy.

data-analysis matplotlib numpy python

Last synced: 18 Apr 2026

https://github.com/iamsainikhil/us-births-analysis

Analysis of US births during 1994-2003 based on the CDC-NCHS data set.

data-analysis python

Last synced: 16 May 2026

https://github.com/mstovarh/analisis-de-bebidas-de-starbucks

This repository contains charts based on various characteristics of Starbucks beverages. I used technologies such as ChatGPT's Data Analysis tool, Excel, and PowerQuery.

chatgpt data-analysis excel powerquery

Last synced: 15 Apr 2025

https://github.com/satyam4229/omnify-dataanalysis

Our assessment of Omnify focused on data-driven strategies to maximize profitability. We identified "Product X" as the most profitable product and recommended leveraging the "Wellness Solutions" keyword category for optimal keyword strategy.

data-analysis data-science data-visualization excel omnify

Last synced: 04 Jan 2026

https://github.com/skysign/dat

A study group for learning data analysis together.

data data-analysis data-science

Last synced: 02 Jan 2026

https://github.com/ronitjariwala/prodigy_ds_02

Prodigy InfoTech Data Science Internship Task-2

data-analysis python

Last synced: 28 Apr 2026

https://github.com/vipulbunny/ml-learning_projects

A collection of machine learning projects implemented in Python, showcasing core concepts like regression, classification, clustering, and model evaluation techniques. Ideal for learners and data science enthusiasts.

classification clustering data-analysis data-science data-visualization decision-trees jupyter-notebook machine-learning model-evaluation random-forest regression supervised-learning unsupervised-learning

Last synced: 23 Jul 2025

https://github.com/kaoutarmi/analyse-des-ventes-pour-optimiser-la-performance

Analysis of sales data to identify opportunities for improving commercial performance. Uses Pandas for data processing and Matplotlib/Seaborn for visualizing trends and results.

business-intelligence data-analysis data-visualization jupyter-notebook matplotlib pandas sales-optimization seaborn

Last synced: 01 Jul 2026

https://github.com/priyanshubiswas-tech/deloitte-daikibo-telemetry-analysis-task-1

Tableau dashboard analyzing Daikibo telemetry data. Tracks downtime by factory/device with interactive filters. Deloitte task solution with JSON processing.

data-analysis data-visualization deloitte json tableau tableau-public

Last synced: 11 Oct 2025

https://github.com/wwgolay/hr1099-timelapse-vlbi

The repository for HR1099 timelapse VLBI.

astronomy astrophysics data-analysis website

Last synced: 03 Apr 2025

https://github.com/galal-pic/advanced_regression

A project to predict house prices using different machine learning techniques

data-analysis data-science deep-learning feature-engineering flask machine-learning python regression

Last synced: 08 Jul 2025

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 11 Jul 2025

https://github.com/firetyrant/sql-portfolio-projects

Documenting my SQL learning journey with hands-on projects focused on data cleaning, analysis, and optimization.

bigquery data-analysis databases etl learning portfolio query-optimization sql

Last synced: 19 Apr 2026

https://github.com/xre22zax/airline-analysis

You work for a travel agency and need to know the ins and outs of airline prices for your clients

data-analysis data-visualization python python3 visualization

Last synced: 13 Apr 2026

https://github.com/azaz9026/data_cleaning

Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.

data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/sanchittechnogeek/overscripted-analysis

Geolocation and user language extraction analysis from Mozilla Overscripted dataset

analysis data data-analysis mozilla

Last synced: 23 Mar 2025

https://github.com/bibymaths/python_snippets

A collection of Python scripts for bioinformatics data analysis, including tools for transcription counts, nucleotide composition, and protein sequence evaluation.

amino-acid-scoring bioinformatics data-analysis fasta-generation mathematical-evaluation nucleotide-analysis protein-sequence-analysis transcription-counts

Last synced: 29 Jul 2025

https://github.com/nevermendel/revolut-analysis

Python script to analyse Revolut transactions

data-analysis revolut revolut-analysis

Last synced: 12 Apr 2025

https://github.com/satyam4229/prediction-of-different-diseases

Predicts diseases in real time from the symptoms that express them. The dataset contains 132+ different symptoms on which the model is trained to give the best disease prediction.

data-analysis data-science data-visualization jupyter-notebook kaggle python

Last synced: 13 Apr 2026

https://github.com/rainbowatcher/simple

Make data work easier, saving your working time

bigdata data-analysis etl

Last synced: 10 Apr 2025

https://github.com/laudebugs/fec-data-analysis-2020

The project aimed to determine the total sum of contributions to the candidate committees as well as the number of contributions made by individuals.

data-analysis fec presidential-candidates

Last synced: 16 May 2026

https://github.com/lopez86/rust-mlearn

Machine Learning Tools in Rust

data-analysis data-science machine-learning rust

Last synced: 15 May 2025

https://github.com/antoniszks/music-category-identifier

A 'Data-Science & Machine Learning' project where we are training a neural network to identify what kind of music we give to it. Based on a university project.

ai artificial-intelligence data-analysis data-science jupyter-notebook machine-learning ml notebook python

Last synced: 25 Feb 2025

https://github.com/motapinto/agent-based-simulation-conquest

Agent-based simulation modeling of the Battlefield Conquest game mode

agent-based-simulation data-analysis jade java sajas swing

Last synced: 24 Jan 2026

https://github.com/shubham200137/cyclistic-case-study

This repository contains a case study for Google's Data Analytics Professional Certificate, focusing on Cyclistic, a fictional bike sharing company in Chicago. The case study aims to drive growth by converting casual riders into members through a marketing strategy.

data-analysis data-visualization numpy-python pandas-python presentation-slides sql tableau

Last synced: 11 Jun 2026

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance. A clustering algorithm then groups the loans together based on those similarities.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 27 Jul 2025

https://github.com/zborovskaanna/dou-salary-analysis

Python data analysis project focused on improving data manipulation skills using Pandas

data-analysis pandas python

Last synced: 26 Feb 2025

https://github.com/grindelfp/two-data-manipulative-tasks

Two simple tasks on data analysis and processing.

data-analysis ipynb mlda

Last synced: 17 Feb 2026

https://github.com/weybsonalves/prevendo-o-atrito-de-clientes

A project in which I walk through the stages of the data science life cycle in order to predict customer churn for a bank's credit card service.

data-analysis data-science data-visualization machine-learning python

Last synced: 06 May 2026

https://github.com/elakkiya-u/digital-marketing-campaign

A machine learning project to predict whether a customer will convert based on digital marketing campaign data.

campaigns data-analysis deployment digital-marketing machine-learning predictive-modeling python

Last synced: 30 Jun 2025

https://github.com/apsinghanalytics/hranalytics_myersbriggspersonalityinsights

An Excel analytics study exploring the correlation between personality traits and key HR-relevant parameters, including tenure and performance.

data-analysis data-visualization excel pivot-tables

Last synced: 30 Jan 2026

https://github.com/jayita11/healthcare-management-optimization-analysis-and-visualization

This project analyzes healthcare data from 2019 to May 2024, optimizing patient care, resource allocation, and financial management. Insights include billing trends, blood bank management, doctor performance, and medication demand, supported by Excel, interactive Tableau dashboards, and SQL analysis.

data-analysis excel healthcare interactive-dashboards mysql sql tableau-dashboards

Last synced: 23 Mar 2025

https://github.com/parthshah02/customer_churn_dashboard

This repository features a comprehensive project showcasing data analysis and an interactive dashboard built with Python.

data-analysis matplotlib numpy pandas python

Last synced: 13 Apr 2026

https://github.com/vedantshi/tableau-bike-data-dashboard

London Bike Rides Analysis explores bike usage patterns using data visualization and machine learning. It identifies trends through a dynamic moving average, analyzes weather impact with heatmaps, and provides actionable insights via an interactive Tableau dashboard. Tools: Python, Tableau.

data-analysis data-visualization python tableau weather-data

Last synced: 16 May 2026

https://github.com/tolumie/loan-approval-prediction

Loan Approval Prediction using Machine Learning | EDA + Decision Tree, Random Forest & Logistic Regression | Automating loan eligibility for Dream Housing Finance by analyzing customer data and predicting loan approvals.

classification credit-risk-analysis data-analysis decision-tree-classifier finance-analytics loan-approval logistic-regression-algorithm machine-learning predictive-modeling-techniques random-forest

Last synced: 30 Jun 2025

https://github.com/maxbiostat/diehl_ebola_cell_2016

Supplementary code and data for Diehl et al., 2016 (Cell)

data-analysis data-visualization disease-spread ebola mutation

Last synced: 11 Jul 2025

https://github.com/ttwag/p9_pandas

Problems that Introduce the DataFrame Object in Python's Pandas Library

data-analysis pandas-dataframe python

Last synced: 10 Jun 2025

https://github.com/tharun2806/end-to-end-internship-data-analysis

Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard

bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public

Last synced: 01 Apr 2025

https://github.com/samruddhi3012/rfm-sales-analysis

Hi there! In this project I have performed Sales Analysis (RFM Analysis) using SQL and Tableau.

data-analysis data-visualization mssqlserver rfm-analysis segmentation tableau

Last synced: 12 Mar 2025

https://github.com/ved-coder-king/wheat_ai_project

This project, Smart Wheat Farming AI System, was developed as part of the coursework for the Artificial Intelligence program at Esprit School of Engineering.

agriculture data-analysis data-visualization deep-learning image-classification machine-learning object-detection python wheat

Last synced: 15 Apr 2025

https://github.com/sakan811/gachascope

Evaluate the cost-effectiveness of various in-app purchase bundles available in gacha games.

data data-analysis data-visualization game honkai honkai-star-rail honkai-starrail hoyoverse javascript nextjs tableau tableau-public typescript wutheringwaves

Last synced: 04 May 2026

https://github.com/oubiche-ishak19/stock_evaluation_python

A Python script to classify companies based on financial metrics like Piotroski F-Score and Stock Valuation, using CSV financial data for analysis and output.

backtesting-frameworks classification csv-processing data-analysis expert-system finance financial-analysis-tools python rule-based-classifier stock stock-market streamlit tkinter-gui yahoo-finance

Last synced: 15 May 2026

https://github.com/mainak-97/weather-data-analysis-using-python

A comprehensive analysis of time-series weather data using Python and Pandas, focusing on data exploration, cleaning, and uncovering insights.

data-analysis jupyter-notebook pandas pandas-dataframe python python3 time-series-analysis

Last synced: 08 May 2026