Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-06-30 00:07:38 UTC
https://github.com/siddhantprateek/machine-learning-resources
Machine Learning Resources
best-practices clustering-algorithm data-analysis deep-learning in-progress journey linear-regression machine-learning machine-learning-algorithms neural-language-modelling neural-language-processing neural-network numpy python3 read reinforcement-learning-algorithms tensorflow visualisation
Last synced: 07 May 2026
https://github.com/aicorsair/python-case-study-ab-testing-for-lunartech-homepage-cta-button
This repository contains a detailed case study on an A/B test of LunarTech's homepage CTA button, using proxy data structured similarly to the company's real data.
ab-testing click-through-rate confidence-intervals data-analysis data-analytics data-exploration data-science data-visualization hypothesis-testing matplotlib normal-distribution numpy pandas practical-significance python statistical-analysis statistical-significance z-critical z-statistic z-test
Last synced: 07 May 2026
https://github.com/jjkay03/discord-call-extractor
Collects HTML data from Discord group chats and DMs to build a database of calls
data-analysis database discord discord-tool
Last synced: 07 May 2026
https://github.com/mahmoudnamnam/fc-barcelona-reports
FC Barcelona Reports: An interactive web application to analyze and visualize FC Barcelona's match data. Built with Streamlit, it scrapes match data from WhoScored, stores it in MongoDB, and presents insights through interactive visualizations like pass networks, shot maps, and player statistics.
data-analysis data-visualization football-analytics mplsoccer pandas streamlit web-scraping
Last synced: 07 May 2026
https://github.com/blladerunner/customer-churn-dashboard
Customer Churn Dashboard — SQL + Python analytics project exploring customer retention patterns, churn rate by demographics and services, and key insights for telecom business strategy.
business-intelligence churn-analysis customer-retention dashboard data-analysis data-analytics data-science pandas powerbi python sql sqlite telecom
Last synced: 08 May 2026
https://github.com/danmadeira/algoritmos-estatistica-python
Demonstration of statistics algorithms in Python
algorithms data-analysis data-science python statistics
Last synced: 08 May 2026
https://github.com/0290192029/apartment-price-predictor
A Python project for predicting apartment rental prices using linear regression. Practical assignment on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".
apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn
Last synced: 08 May 2026
https://github.com/satvikpraveen/numpymasterpro
A hands-on, production-ready toolkit to master NumPy — from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.
broadcasting data-analysis data-science data-source data-visualization jupyter-notebook kmeans-clustering linear-algebra machine-learning matrix-algebra numerical-computation numpy numpy-broadcasting numpy-examples numpy-tutorial open-source python scientific-computing standardization vectorization
Last synced: 08 May 2026
https://github.com/ryan-wong1/analyzing-arrest-patterns-in-chicago-data-analysis
Chicago Police Department (CPD) arrest data on offenses, locations, and demographics
data-analysis data-cleaning data-visualization exploratory-data-analysis matplotlib pandas python seaborn
Last synced: 08 May 2026
https://github.com/alejandrolara11/data-preprocessing
Data preprocessing through the use of the libraries NumPy and pandas.
data-analysis data-cleaning data-preprocessing numpy pandas python
Last synced: 09 May 2026
https://github.com/mrunmayee3108/financial-chatbot
A Python chatbot for analyzing financial data of companies with revenue, income, assets, cash flow, and debt ratio queries
chatbot data-analysis jupyter-notebook pandas python python3
Last synced: 09 May 2026
https://github.com/drod75/burger_king_analysis
A simple analysis of a Burger King dataset.
data-analysis data-visualization jupyter-notebook pandas python seaborn
Last synced: 09 May 2026
https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project
A mini-project repository from my IBM certification, involving stock analysis and plotting for Tesla and GameStop
analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping
Last synced: 09 May 2026
https://github.com/marvinmarnold/oipm_stop_search
OIPM's analysis on Stop & Search (frisk) activity by the New Orleans Police Department.
data-analysis frisk new-orleans oipm police search stop
Last synced: 22 Jul 2025
https://github.com/salma-mamdoh/exploring-the-evolution-of-linux-project
My Project to learn the Basics of Analysis on DataCamp
data-analysis datacamp pandas python time-series-analysis
Last synced: 09 May 2026
https://github.com/rohithsaji97/face-recognition
Face Recognition using deep learning
data-analysis deep-learning face-recognition keras machine-learning neural-network opencv python training
Last synced: 09 May 2026
https://github.com/vasishta03/econovisionai
A simple Python desktop app to search and explore OECD economic data (CSV) and report summaries (TXT/JSON) using a modern CustomTkinter GUI—no SQL or web frameworks needed.
csv customtkinter data-analysis desktop-app economic-data gui json local-app oecd pandas python search tkinter
Last synced: 10 May 2026
https://github.com/macdon112/credit-card-fraud-detection
Comparing ML models (Random Forest, KNN, Decision Tree) for credit card fraud detection using SMOTE and stratified cross-validation.
classification data-analysis fraud-detection imbalanced-data machine-learning python scikit-learn
Last synced: 10 May 2026
https://github.com/greenpau/esqrunner
Run Elasticsearch queries and create metrics based on the results of those queries in an Elasticsearch database.
data-analysis elasticsearch query-builder querydsl
Last synced: 10 May 2026
https://github.com/andersoncrs/aprendizaje_no_supervisado_kmeans_customers
This repository contains an analysis of shopping mall customer data using unsupervised learning techniques, specifically K-Means and hierarchical clustering. The goal of the project is to segment customers into homogeneous groups to better understand their behaviors and characteristics.
data-analysis kmeans-clustering matplotlib numpy seaborn visualization
Last synced: 10 May 2026
https://github.com/crazy-dot/covid-19-analysis
This project performs an in-depth analysis and visualization of COVID-19 data, focusing on India and its states/union territories.
covid-19-india data-analysis jupyter-notebook matplotlib pandas python3 seaborn
Last synced: 10 May 2026
https://github.com/melissaantunes/ibm-data-analyst-professional
IBM Data Analyst Professional Certificate
analyze-data data-analysis data-analyst data-manipulation data-science data-visualization ibm-data-analyst-professional pandas python
Last synced: 11 May 2026
https://github.com/leticia-ducatti/sales-dashboard-project
Interactive sales dashboard built with Python and Streamlit — shows KPIs, allows filtering, and visualizes sales data.
data-analysis pandas plotly python streamlit
Last synced: 12 May 2026
https://github.com/krypten/playingcardsstatisticalanalysis
Statistical Analysis of Playing Cards (Descriptive Statistics: Final Project)
data-analysis machine-learning machinelearning python statistics udacity
Last synced: 12 May 2026
https://github.com/sebastian-diaz-berdecia/analisis-popularidad-de-series-y-generos-de-series
SQL queries for analyzing the popularity of series and series genres in the NetflixDB database.
business-analytics bussiness-intelligence data data-analysis database mysql mysql-database sql
Last synced: 12 May 2026
https://github.com/priyanshu7639/data_visualization_dashboard
An Interactive data visualization tool that combines traditional plotting capabilities with modern AI assistance. It allows users to create and modify visualizations through natural language commands, making data exploration accessible to users of all skill levels.
business-analytics data-analysis data-engineering data-exploration data-science data-visualization datapreprocessing datascience interactive-visualizations matplotlib plotly plotting python research-tool streamlit
Last synced: 12 May 2026
https://github.com/abeltavares/online_retail_pyspark_analysis
PySpark data analysis of the Online Retail Data Set
business-intelligence churn-analysis customer-segmentation data-analysis data-visualization jupyter-notebook machine-learning market-basket-analysis online-retail product-affinity-analysis pyspark
Last synced: 12 May 2026
https://github.com/mituskillologies/dkte-da-mar25
Programs from the Python Data Analytics training conducted at DKTE's Engineering Institute, Ichalkaranji, in March 2025.
data-analysis matplotlib numpy pandas python-programming tkinter-python
Last synced: 13 May 2026
https://github.com/eslamdyab21/weratedogs-twitter-data-analysis
In this challenging project, I carry out the data wrangling process
csv data-analysis data-wrangling data-wrangling-twitter json-data pandas python twitter udacity-data-analyst-nanodegree
Last synced: 14 May 2026
https://github.com/prakhargpt/sql-data-warehouse-project
A data warehouse project built with SQL Server, including ETL processes, data modelling, and analytics.
analytics data data-analysis data-cleaning data-engineering data-engineering-pipeline data-lakehouse data-science data-warehouse etl etl-job etl-pipeline medallion-architecture sql sql-server
Last synced: 12 Jun 2026
https://github.com/saksham-jain177/cryptodataanalysis
A Python-powered project that fetches live cryptocurrency data from the CoinMarketCap API, analyzes it, and updates a live Excel sheet every 5 minutes.
api-integration coinmarketcap cryptocurrency data-analysis excel live-data python
Last synced: 12 Jun 2026
https://github.com/cannt39t/wylsacom-analysis-reflinks-datamining
data data-analysis data-mining python3 sql
Last synced: 13 Jun 2026
https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020
Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).
bigquery data data-analysis data-visualization python sql tableau
Last synced: 15 Jun 2026
https://github.com/victoryfanfare/car-price-prediction
An ML model for estimating the market value of used cars. The project includes data analysis, feature engineering, and a comparison of various machine learning algorithms.
catboost data-analysis jupyter-notebook lightgbm machine-learning pandas python regression
Last synced: 15 Jun 2026
https://github.com/dcs-training/data-wrangling-and-vis-pandas
Introduction to analyzing structured data with the Python libraries pandas, for CSV and TSV data, and ElementTree, for XML data.
data-analysis data-visualisation data-wrangling python
Last synced: 16 Jun 2026
https://github.com/fahadnasir13/financial_data-analyzer_tool
A Python-based framework for analyzing, cleaning, and reconciling financial data stored in Excel workbooks.
data-analysis excel financial python store
Last synced: 17 Jun 2026
https://github.com/ibttf/bayborhood
Interactive map to find the ideal neighborhood in San Francisco based on data.
data data-analysis data-visualization gis mapbox react
Last synced: 18 Jun 2026
https://github.com/sakan811/stress-pattern-occurrence-in-english-words
This project is intended to give English learners data that lets them make a data-driven guess when they encounter words whose stress placement they are unsure of
data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals
Last synced: 20 Jun 2026
https://github.com/katiebuntic/research_methods
Data Science Research Methods
analysis data-analysis data-science python research-project
Last synced: 23 Jun 2026
https://github.com/engusseus/warframe-market-set-profit-analyzer
Python tool that analyzes Warframe Market data to find profitable item sets to trade
api data-analysis python trading waframe
Last synced: 23 Jun 2026
https://github.com/ladaegorova18/data_analysis
Learning the basics of data analysis in Python
analytics data-analysis data-visualization steam-games
Last synced: 24 Jun 2026
https://github.com/lu-m-dev/biostatistics-eda
Exploratory data analysis and visualization system for biostatistical research
biostatistics data-analysis data-visualization eda
Last synced: 25 Jun 2026
https://github.com/souza-vitor/stock-market
codecademy data data-analysis data-mining data-science sql sqlite
Last synced: 26 Jun 2026
https://github.com/chdre/data-analyzer
A small package to analyze and preprocess data.
Last synced: 28 Jun 2026
https://github.com/syarwinaaa09/analyzing-students-mental-health
Data-driven exploration of student mental health trends using survey data
csv-dataset data-analysis education jupyter-notebook mental-health-awareness pandas psychology student-mental-health visualization
Last synced: 29 Jun 2026
https://github.com/manganite/vibespin
VibeSpin is a Python framework for simulating and analyzing 2D lattice spin systems (Ising, XY, and q-state Clock models) with Numba-accelerated Monte Carlo dynamics, correlation/structure diagnostics, and reproducible benchmarking workflows.
clock-model critical-phenomena data-analysis ising-model lattice-models monte-carlo-simulation phase-transitions physics-simulation python scientific-computing spin-models spin-systems statistical-mechanics xy-model
Last synced: 29 Jun 2026
https://github.com/rubyyy1118/share-price-analysis
The assignment in my MSc Business Analytics course
data-analysis data-preprocessing data-science data-visualization matplotlib numpy pandas python seaborn
Last synced: 10 Apr 2026
https://github.com/celineboutinon/product-classification
CentraleSupélec/OpenClassrooms Data Scientist 2024-2025 - Project 6
api classification-models data-analysis data-science data-visualization e-commerce image-classification marketing marketing-analytics product-classification rgpd scraping-python text-classification
Last synced: 29 Jun 2026
https://github.com/kaoutarmi/analyse-des-ventes-pour-optimiser-la-performance
Analysis of sales data to identify opportunities for improving commercial performance. Uses pandas for data processing and Matplotlib/Seaborn for visualizing trends and results.
business-intelligence data-analysis data-visualization jupyter-notebook matplotlib pandas sales-optimization seaborn
Last synced: 20 Aug 2025
https://github.com/souravsuvarna/whatsapp-chat-analyzer-and-visualizer-web-application
The WhatsApp chat analyzer and visualizer uses NLP algorithms to analyze chat data, tracking usage patterns and presenting insights through visually appealing charts and graphs. It helps users understand communication patterns and behaviors on WhatsApp.
data-analysis data-science data-visualization python python3 streamlit
Last synced: 18 Apr 2026
https://github.com/svetlanam/pt-data-analyse
Data analysis of Czech parcel tracking providers
data-analysis matplotlib pandas parcel-tracking python3 visualisation
Last synced: 21 Aug 2025
https://github.com/arraypd/data-analysis-with-python-and-sql
data-analysis grafana matplotlib pandas polars postgresql pyspark python seaborn sql
Last synced: 09 Apr 2026
https://github.com/marknature/machine-learning-intern
Machine Learning tasks involving the Titanic Dataset and Breast Cancer Wisconsin (Diagnostic) dataset
data-analysis github jupiter-notebook machine-learning matplotlib numpy pandas python scikit-learn sklearn
Last synced: 10 Apr 2026
https://github.com/prince-pastakiya/human-resources-tableau-project
👥 Interactive Tableau dashboard for HR analytics — includes workforce overview, demographics, income analysis, and detailed employee records with full filtering.
chatgpt data-analysis data-visualization human-resources numpy python python-faker tableau-dashboards tableau-public
Last synced: 18 Apr 2026
https://github.com/vaishnavipaithane/cyclistic-bike-share-analysis-case-study
This capstone project was done as part of the Google Data Analytics Professional Certificate course.
data-analysis r-programming-language rstudio
Last synced: 24 Aug 2025
https://github.com/shridhar1504/sql-projects
This repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and deriving insights through queries.
data-analysis data-mining data-science data-transformation eda etl-framework microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries
Last synced: 09 Mar 2026
https://github.com/nickenshidqia/sql-for-financial-data-analysis
Design SQL queries to generate accurate and timely financial reports including Profit and Loss statements, Balance Sheets, and Cash Flow statements
azure-data-studio data-analysis finance microsoft-sql-server sql
Last synced: 09 Mar 2026
https://github.com/ssiarhei115/shop-customers-segmentation
Shop customers segmentation
data data-analysis data-science data-visualization
Last synced: 24 Aug 2025
https://github.com/pedramjlo/uae_cars_analysis
Analysis of the UAE second-hand car data
data-analysis jupyter-notebook pandas python sql sqlite3
Last synced: 11 May 2026
https://github.com/0xnu/data-analyst-training
The repository contains training materials for data analysts.
data data-analysis data-analyst
Last synced: 25 Aug 2025
https://github.com/debjyotisaha/tableau-projects-phase-2
Published interactive dashboards on Tableau Public, highlighting expertise in data visualization and storytelling through analyses of transportation patterns, sales trends, and demographic studies. These projects showcase the ability to transform complex datasets into actionable, intuitive visuals for decision-making.
dashboards data data-analysis data-visualisation tableau
Last synced: 26 Aug 2025
https://github.com/putuwaw/dashboard-ecommerce
Dashboard for E-Commerce Public Dataset using Streamlit and Plotly
dashboard data-analysis dicoding plotly streamlit
Last synced: 20 Feb 2026
https://github.com/sarathchandranpm/walmart-sales-analysis
Analysis of Walmart Myanmar's Q1 2019 sales data covering customer behavior, product performance, general operations, and sales patterns.
Last synced: 29 Aug 2025
https://github.com/lauratrigo/fft_matlab
📡 Fourier Analysis for Ionospheric Data is a MATLAB script that applies the FFT to generate one-sided and two-sided spectra of ionospheric parameters (hF, f0F2, hmF2), identifying periodicities and comparing spectral signatures at 15-minute resolution; useful for studies of ionospheric variations and disturbances.
data-analysis fast-fourier-transform fft fourier ionosphere matlab scientific scientific-initiation
Last synced: 29 Aug 2025
https://github.com/roggersanguzu/weather-medical-expense-prediction-ml-models
This repo contains one model for determining rainfall patterns and another for predicting medical expenses
data data-analysis data-science datasets joblib machine-learning machine-learning-algorithms scikitlearn-machine-learning
Last synced: 30 Aug 2025
https://github.com/karlyndiary/adidas-sales-analysis
Analyzed Adidas' product sales performance, top retailers, monthly trends, yearly growth, regional distribution, and pricing insights. Performed ETL from Python (Pandas) to SQL Server, extracted data with SQL, and visualized key insights in Excel.
adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python
Last synced: 10 Feb 2026
https://github.com/hess125/data-visualizations
A repository of data visualization projects
data data-analysis data-science data-visualization powerbi projects sql sqlite tableau
Last synced: 31 Aug 2025
https://github.com/agdturner/ccg-data
A modularised Java library for processing data sets with classes for: data records; collections of data records; and identifiers.
Last synced: 12 Jan 2026
https://github.com/ragedunicorn/mantisx-notebook
A repository for Jupyter notebooks analysing mantisx data
data-analysis data-visualization mantis mantisx shooting training
Last synced: 24 Jul 2025
https://github.com/poglolopez/prueba_tecnica_inlaze
This repository showcases my data analysis skills through a technical test for Inlaze. It includes workflows with Python, SQLite, and Power BI to analyze player behavior, deposits, and traffic source performance, highlighting operational efficiency and strategic insights.
data-analysis data-v etl jupyter powerbi python sqlite
Last synced: 26 Feb 2025
https://github.com/devanshsahu47/talentscape-glassdoor-analysis
TalentScape is an end-to-end Python project that cleans and analyzes a comprehensive Glassdoor Jobs dataset. It features robust data wrangling and 20 insightful visualizations to uncover trends in job titles, salary ranges, company ratings, and more—providing actionable recommendations to optimize recruitment and compensation strategies.
business-intelligence data-analysis data-vizualisation jupyter-notebook python3
Last synced: 15 May 2026
https://github.com/soypete/example-go-dataframes-parser
example of https://godoc.org/github.com/kniren/gota/dataframe
data-analysis data-science datastructures golang-examples ml
Last synced: 12 Sep 2025
https://github.com/singhs05/global-youtube-trends
Understand the impact of likes, comments, and dislikes on video consumption for trending videos.
data-analysis mssqlserver query sql
Last synced: 18 Mar 2026
https://github.com/luminati-io/target-dataset-samples
A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.
api data-analysis data-mining datasets target web-scraper web-scraping
Last synced: 04 Jan 2026
https://github.com/mysftz/numerical-methods-in-matlab
MATLAB scripts from multiple data analysis assignments.
data-analysis data-science matlab university university-assignment
Last synced: 14 May 2025
https://github.com/leandrocollares/home-team-advantage-in-epl
Home team advantage in the English Premier League: an exploratory data analysis
data-analysis matplotlib pandas plotly
Last synced: 11 Jun 2026
https://github.com/mysftz/statistical-analysis
An in-depth review of statistical analysis of datasets in Python.
data-analysis python python3 statistics university university-project
Last synced: 14 May 2025
https://github.com/nmelgar/birthday_sports_dataviz
We analyze how the Matthew Effect has influenced professional sports players.
analysis csv data data-analysis data-science data-visualization datavisualization dataviz probability research tableau
Last synced: 08 Jan 2026
https://github.com/scailfin/benchmark-templates
Workflow Templates are parameterized workflow specifications for the Reproducible Open Benchmarks for Data Analysis Platform (ROB)
benchmarks data-analysis reproducibility
Last synced: 16 Jan 2026
https://github.com/iness000/online-retail-customer-segmentation
This project performs comprehensive customer segmentation analysis on an online retail dataset using machine learning clustering techniques and RFM (Recency, Frequency, Monetary) analysis. The goal is to identify distinct customer segments to drive better customer relationship management strategies and business insights.
customer-segmentation data-analysis k-means
Last synced: 31 Aug 2025
https://github.com/rdrahul123/ecommerce-sales-dashboard
This project focuses on analyzing e-commerce sales data to uncover actionable insights and improve business decision-making. Using interactive dashboards and data analysis techniques, the project evaluates key performance metrics, customer behavior, sales trends, and payment modes across different categories and regions.
data-analysis data-science excel powerbi
Last synced: 22 Mar 2025
https://github.com/jayqi/data-analysis-tools
Presentation on Data Analysis Tools
data-analysis presentation-slides
Last synced: 06 Jan 2026
https://github.com/pranjalya/hand-washing-data-visualisation
A small data visualization project in which we analyze the effect of hand washing after Dr. Semmelweis introduced it to the nurses and midwives assisting with births.
data-analysis data-visualization jupyter-notebook pandas python3
Last synced: 06 May 2026
https://github.com/zimmi48/nixpkgs-issues
Analysis of nixpkgs issue lifetimes.
data-analysis github-api nixpkgs
Last synced: 10 May 2026
https://github.com/amoneva/cacc
An R Package to compute Conjunctive Analysis of Case Configurations (CACC), Situational Clustering Tests, and Main Effects
criminology data-analysis r social-science
Last synced: 15 May 2025
https://github.com/virajbhutada/diamond-price-estimator
This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.
cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface
Last synced: 14 Apr 2026
https://github.com/lanzafame/polycarp
[WIP] Subset operations on latlon data read from CSVs
Last synced: 12 Jan 2026
https://github.com/fortunewalla/flight-delays
Data Expo 2009: Airline on time data
airlines data-analysis data-science data36 database dataexpo dataset flightdelays flights ontimedata pgsql postgres postgresql sql tomimester
Last synced: 02 Mar 2026
https://github.com/satvikpraveen/pandasplayground
📊 A comprehensive pandas mastery project with 10 modular Jupyter notebooks covering data loading, cleaning, grouping, merging, time series, visualization, and performance profiling. Includes real-world workflows, Docker, Streamlit, and reusable utils. Ideal for data scientists and analysts to learn, practice, and refer. Practice-ready and modular.
analytics cheatsheet data-analysis data-cleaning data-pipeline data-science data-visualization docker etl exploratory-data-analysis jupyter-notebook jupyterlab learning-resource memory-profiling open-source pandas performance-tuning python streamlit time-series
Last synced: 10 Apr 2026
https://github.com/moenessgannouni/englandweather
A mini-project that analyzes weather data in England using Linear Regression and Multiple Linear Regression. Ideal for learning and applying statistical analysis and predictive modeling.
data-analysis data-visualization linear-regression multiple-linear-regression rprogramming
Last synced: 22 Mar 2025
https://github.com/ronylpatil/whatsapp-group-chat-analysis
A data analysis project that extracts useful information from our college's official WhatsApp group chat. Extracted features include the most active members of the group, the most active day of the week, the top 10 media contributors in the group, and many more.
data-analysis data-preprocessing data-wrangling feature-engineering
Last synced: 14 Jun 2025
https://github.com/farhad-here/median-performance-comparison
Benchmarking the performance of median calculation using vanilla Python vs NumPy.
data-analysis matplotlib numpy python
Last synced: 18 Apr 2026
https://github.com/mnkanout/patients_medication_prediction
The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.
data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3
Last synced: 29 Jun 2025
https://github.com/karlyndiary/spotify-excel-dashboard
Data Analysis on the Spotify Dataset using Microsoft Excel and VBA.
charts data-analysis data-cleaning data-visualization excel excel-export excel-vba pivot-tables
Last synced: 04 Jan 2026
https://github.com/aneeshmurali-n/project-ml-data-preprocessing
The main objective of this project is to design and implement a robust data preprocessing system that addresses common challenges such as missing values, outliers, inconsistent formatting, and noise. By performing effective data preprocessing, the project aims to enhance the quality, reliability, and usefulness of the data for machine learning.
data-analysis data-cleaning data-encoding data-exploration feature-scaling label-encoding matplotlib minmaxscaler numpy one-hot-encoding outlier-detection pandas standardscaler
Last synced: 02 May 2026
https://github.com/skysign/dat
A study group for learning data analysis together.
data data-analysis data-science
Last synced: 02 Jan 2026
https://github.com/ronitjariwala/prodigy_ds_02
Prodigy InfoTech Data Science Internship Task-2
Last synced: 28 Apr 2026
https://github.com/andrii04/ga4-gcs-to-bigquery-etl
Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.
automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql
Last synced: 18 May 2026
https://github.com/farzeen-2001/superstore_analysis_sql
Analysed the Superstore data using SQL
Last synced: 15 Apr 2025