Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-06-23 00:07:29 UTC
https://github.com/amiraflak/data-mining
Data Mining Course - Spring 2024
classification clustering data-analysis data-mining decision-tree-classifier eda pca
Last synced: 10 Aug 2025
https://github.com/dcostachar/telco-customer-churn-dashboard
An interactive Tableau dashboard using the Telco Customer Churn dataset to analyze key drivers of customer churn and develop data-driven retention strategies for the telecommunications industry.
business-intelligence customer-churn-analysis data-analysis data-visualization marketing-analytics tableau
Last synced: 09 Mar 2026
https://github.com/nafisrayan/decentai
A comprehensive platform built using ReactJS and Flask, combining blockchain technology with AI to create a secure and intelligent space for community engagement and policy discussions. Leverages NLP and LLM for meaningful interactions and sentiment analysis while ensuring data security and user privacy.
chatbot data-analysis data-visualization flask gemini gemini-ai gemini-ai-chatbot gemini-api government government-tech llm mongodb nlp polls python react tailwind voting-systems winknlp
Last synced: 12 Apr 2026
https://github.com/ifigeneiatsiflidou/applied-statistics-project
Project for an Applied Statistics course, involving exploratory data analysis and predictive modeling of movie revenue using engineered features and multiple linear regression.
correlation-analysis data-analysis linear-regression python scikit-learn visualization
Last synced: 29 Apr 2026
https://github.com/gutow/langmuir_trough
Code to run homebuilt Langmuir Trough using Jupyter and Python. Link below for API docs:
data-acquisition data-analysis jupyter langmuir-trough plotting
Last synced: 11 Aug 2025
https://github.com/erayagdogan/simplecharts
Simple Charts is a chart-maker Compose app with Material 3 design. Charts are created using the lets-plot-compose library.
android android-app charts data-analysis data-visualization jetpack-compose lets-plot-kotlin material-3 viewmodel
Last synced: 11 Aug 2025
https://github.com/ct83/become-a-data-analyst-udacity
This repository contains all of the code, projects, and reports that I wrote while pursuing my Udacity Data Analyst Nanodegree.
data-analysis data-analysis-python data-analyst data-visualisation data-visualization-project datascience python udacity udacity-data-analyst-nanodegree
Last synced: 12 Aug 2025
https://github.com/nabilalibou/uber_fare_prediction_explained
This repository documents a complete ML workflow to model Uber fares in Paris, from granular EDA and feature engineering to building and fine-tuning a stacking regressor on 10k real-world rides.
data-analysis data-science eda feature-engineering machine-learning predictive-analytics pricing-model python regression-model stacking-ensemble uber
Last synced: 12 Aug 2025
https://github.com/abhirajp595/python2
Capstone project using Python (real estate)
data-analysis data-science data-visualization jupyter-notebook machine-learning numpy pandas python statistics
Last synced: 09 Apr 2026
https://github.com/arun-data-analyst/finance-reporting-sql
End-to-end SQL project for project/portfolio finance: schema, seed data, validation, data-quality checks, business queries, and KPI views (Power BI–ready).
data-analysis data-modeling data-quality database finance kpi portfolio-management powerbi sql sql-server ssms
Last synced: 18 May 2026
https://github.com/misszeferino/us-traffic-accidents-analysis
Exploratory Data Analysis using Python
data-analysis matplotlib numpy pandas python seaborn
Last synced: 09 Apr 2026
https://github.com/Solrikk/PicTrace-Web
PicTraceV2 is a highly efficient image matching platform that leverages computer vision using OpenCV, deep learning with TensorFlow and the ResNet50 model, asynchronous processing with aiohttp, and Selenium for browser automation. PicTraceV2 allows users to upload images directly or provide URLs, quickly scanning a vast database to find image
automation computer-vision data-analysis data-extraction deep-learning image-processing image-search machine-learning natural-language-processing opencv openpyxl pandas python selenium tensorflow web-scraping yandex yandex-api
Last synced: 15 Aug 2025
https://github.com/sebastiansauer/hans-hackathon2025
Materials for a course on the evaluation of the AI student learning tool "HaNS"
Last synced: 04 Oct 2025
https://github.com/i-e-b/dynamictimewarp
A quick C# implementation of https://jeremykun.com/2012/07/25/dynamic-time-warping/
data-analysis pattern-matching working
Last synced: 17 Aug 2025
https://github.com/chiamakaukwuoma/portfolio
This repository contains various projects I've been privileged to work on outside of work.
aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau
Last synced: 10 Apr 2026
https://github.com/jcm-ai/Personal-Data-Science-Projects
This page contains all of my personal data science projects. 📊📈📉👨💻
data-analysis data-visualization exploratory-data-analysis jupyter-notebooks machine-learning-algorithms matplotlib-pyplot numpy-library pandas-python personal-project predictive-modeling programming python3 scikit-learn scipy seaborn statistical-analysis
Last synced: 19 Aug 2025
https://github.com/jcm-ai/Quantium-Data-Analytics-Virtual-Experience-Program
This repository contains my proposed solutions to the assignments that I was required to complete as part of the Quantium Data Analytics Virtual Experience Program. 📊📈📉👨💻
commercial-thinking communication-skills data-analysis data-validation data-visualisation data-wrangling jupyter-notebook matplotlib-pyplot numpy-library pandas-python presentation-skills programming python3 scipy-stats seaborn statistical-testing
Last synced: 19 Aug 2025
https://github.com/rugwiroparfait/alx_sql
This repo is where I save my queries and learning materials from the ALX Data Science program
anaconda data data-analysis jupyter-notebook sql
Last synced: 19 Aug 2025
https://github.com/cyberoctane29/epa-air-quality-aqi-analysis
This project involved analyzing air quality data from the EPA, focusing on the Air Quality Index (AQI). I used Python data structures like dictionaries and sets to manage and process the data, simulating real-world data analysis to assess pollution levels and their health implications.
data-analysis numpy pandas python statistics
Last synced: 10 Apr 2026
https://github.com/jedrzej-wydra/competition-cooperation
Competition, cooperation, and parental effects in larval aggregations formed on carrion by communally breeding beetles Necrodes littoralis (Staphylinidae: Silphinae)
data-analysis non-linear-regression r
Last synced: 20 Aug 2025
https://github.com/shriansh8619/sql_eda
Explored relational databases using SQL to perform comprehensive Exploratory Data Analysis (EDA), covering database exploration, segmentation, trend analysis, and performance ranking. Developed reusable SQL scripts to analyze dimensions, measures, and time-based metrics, helping uncover key business insights.
data-analysis exploratory-data-analysis mysql
Last synced: 20 Aug 2025
https://github.com/svetlanam/pt-data-analyse
Data analysis of Czech parcel-tracking providers
data-analysis matplotlib pandas parcel-tracking python3 visualisation
Last synced: 21 Aug 2025
https://github.com/justinhennis1/hackathon24
Hofstra's Hacknology Competition 2024 - Team Null Pointers
data data-analysis data-science data-visualization data-visualization-python dataanalysis dataanalytics traveling web webapplication
Last synced: 21 Aug 2025
https://github.com/arraypd/data-analysis-with-python-and-sql
data-analysis grafana matplotlib pandas polars postgresql pyspark python seaborn sql
Last synced: 09 Apr 2026
https://github.com/aidan-zamfir/advt-analysis
Web scraping project. Will eventually use character/episode data for NLP and networking/data analysis.
data-analysis nlp python selen webscraping
Last synced: 23 Aug 2025
https://github.com/vaishnavipaithane/cyclistic-bike-share-analysis-case-study
This capstone project was completed as part of the Google Data Analytics Professional Certificate course.
data-analysis r-programming-language rstudio
Last synced: 24 Aug 2025
https://github.com/nickenshidqia/sql-for-financial-data-analysis
Design SQL queries to generate accurate and timely financial reports including Profit and Loss statements, Balance Sheets, and Cash Flow statements
azure-data-studio data-analysis finance microsoft-sql-server sql
Last synced: 09 Mar 2026
https://github.com/0xnu/data-analyst-training
The repository contains training materials for data analysts.
data data-analysis data-analyst
Last synced: 25 Aug 2025
https://github.com/putuwaw/dashboard-ecommerce
Dashboard for E-Commerce Public Dataset using Streamlit and Plotly
dashboard data-analysis dicoding plotly streamlit
Last synced: 20 Feb 2026
https://github.com/gustavo-zamai/analysis_online_shopping_data
Online Shopping Analysis
csv-files data-analysis pandas plotly-express python3
Last synced: 17 Apr 2026
https://github.com/lauratrigo/dias_geomagneticamente_calmos
📡 MATLAB script that analyzes ionospheric parameters (hF, f0F2, hmF2) via FFT, generating one-sided/two-sided spectra to identify temporal patterns in resolution, crucial for studies of ionospheric variations.
data-analysis geophysics matlab scientific
Last synced: 29 Aug 2025
https://github.com/jaseel342/hr_analytics_dashboard
This project showcases the use of Power Query and DAX Query to analyze employee details, add new measures and columns, and create a dashboard using Power BI.
data-analysis dax-query power-query powerbi
Last synced: 03 Jan 2026
https://github.com/karlyndiary/adidas-sales-analysis
Analyzed Adidas' product sales performance, top retailers, monthly trends, yearly growth, regional distribution, and pricing insights. Performed ETL from Python (Pandas) to SQL Server, extracted data with SQL, and visualized key insights in Excel.
adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python
Last synced: 10 Feb 2026
https://github.com/hess125/data-visualizations
A repository of data visualization projects
data data-analysis data-science data-visualization powerbi projects sql sqlite tableau
Last synced: 31 Aug 2025
https://github.com/agdturner/ccg-data
A modularised Java library for processing data sets with classes for: data records; collections of data records; and identifiers.
Last synced: 12 Jan 2026
https://github.com/luminati-io/walmart-dataset-samples
A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.
api data-analysis dataset walmart walmart-scraper web-scraping
Last synced: 04 Jan 2026
https://github.com/shubhammittal-data/hr_dashboard_tableau
An interactive HR Analytics Dashboard built using Tableau. Provides insights into workforce demographics, hiring trends, salary analysis, and employee records for data-driven decision-making.
chatgpt4 data data-analysis data-visualization drawio-tools faker-generator hr-analytics hr-analytics-dashboard human-resources numpy python tableau tableau-public
Last synced: 17 May 2026
https://github.com/soypete/example-go-dataframes-parser
example of https://godoc.org/github.com/kniren/gota/dataframe
data-analysis data-science datastructures golang-examples ml
Last synced: 12 Sep 2025
https://github.com/carlosvinimsouza/full-tutorial-python
My completed Python tutorial
data-analysis data-science data-structures django django-framework fastapi fastapi-framework flask flask-web frameworks learn-to-code learning python python3 roadmap tutorial tutorial-code
Last synced: 10 Apr 2026
https://github.com/luminati-io/target-dataset-samples
A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.
api data-analysis data-mining datasets target web-scraper web-scraping
Last synced: 04 Jan 2026
https://github.com/mysftz/numerical-methods-in-matlab
MATLAB scripts from multiple data analysis assignments.
data-analysis data-science matlab university university-assignment
Last synced: 14 May 2025
https://github.com/mehrab-kalantari/olympics-data-analysis
A Streamlit application for analyzing the Olympics dataset from several views
data-analysis streamlit-dashboard streamlit-webapp
Last synced: 20 Apr 2026
https://github.com/leandrocollares/home-team-advantage-in-epl
Home team advantage in the English Premier League: an exploratory data analysis
data-analysis matplotlib pandas plotly
Last synced: 11 Jun 2026
https://github.com/celineboutinon/la-faim-dans-le-monde
OpenClassrooms Data Analyst 2022-2023 - Project 4
data-analysis data-analytics data-visualisation dataframes matplotlib-pyplot numpy pandas python seaborn
Last synced: 20 Jul 2025
https://github.com/mysftz/statistical-analysis
An in-depth review of statistical analysis of datasets in Python.
data-analysis python python3 statistics university university-project
Last synced: 14 May 2025
https://github.com/kalyan4636/chocos-sales-analysis-report-and-dashboard.-
📊 Built using Power BI, this dashboard delivers actionable insights to boost strategic decision-making.
bussiness-analyst data-analysis data-visualization dataanalyst microsoft-power-bi powerbi
Last synced: 26 Jan 2026
https://github.com/nmelgar/birthday_sports_dataviz
We will analyze how the Matthew effect has influenced professional sports players.
analysis csv data data-analysis data-science data-visualization datavisualization dataviz probability research tableau
Last synced: 08 Jan 2026
https://github.com/als8446/tripleten-data-science-projects
Projects overview: projects made in the Data Scientist course from TripleTen LatAm
data data-analysis hypothesis-tests machine matplotlib numpy pandas python scipy sklearn
Last synced: 10 Apr 2026
https://github.com/scailfin/benchmark-templates
Workflow Templates are parameterized workflow specifications for the Reproducible Open Benchmarks for Data Analysis Platform (ROB)
benchmarks data-analysis reproducibility
Last synced: 16 Jan 2026
https://github.com/iness000/online-retail-customer-segmentation
This project performs comprehensive customer segmentation analysis on an online retail dataset using machine learning clustering techniques and RFM (Recency, Frequency, Monetary) analysis. The goal is to identify distinct customer segments to drive better customer relationship management strategies and business insights.
customer-segmentation data-analysis k-means
Last synced: 31 Aug 2025
https://github.com/rdrahul123/ecommerce-sales-dashboard
This project focuses on analyzing e-commerce sales data to uncover actionable insights and improve business decision-making. Using interactive dashboards and data analysis techniques, the project evaluates key performance metrics, customer behavior, sales trends, and payment modes across different categories and regions.
data-analysis data-science excel powerbi
Last synced: 22 Mar 2025
https://github.com/akashprak/socialnetworkads
Predicting customer purchase behavior from the Social Network Ads dataset.
data-analysis machine-learning mlflow pandas python scikit-learn seaborn xgboost
Last synced: 30 Mar 2025
https://github.com/shellynagar27/merchandise-sales-analysis
Merchandise Sales Analysis explores the sales trends of influencer Lee Chatmen’s merchandise using Power BI and Power Query. The project uncovers key insights on revenue, product performance, location impact, shipping trends, and customer reviews.
critical-thinking data-analysis data-visualization figma powerbi powerquery problem-solving
Last synced: 07 Apr 2025
https://github.com/jayqi/data-analysis-tools
Presentation on Data Analysis Tools
data-analysis presentation-slides
Last synced: 06 Jan 2026
https://github.com/pranjalya/hand-washing-data-visualisation
A small data visualization project that analyzes the effect of hand washing after Dr. Semmelweis introduced it to the nurses and midwives assisting with childbirth.
data-analysis data-visualization jupyter-notebook pandas python3
Last synced: 06 May 2026
https://github.com/sarincr/training-on-artificial-intelligence
Entree Academy's 10-day free training on artificial intelligence. The course is conducted in a blended-learning format, with a daily one-hour online session and three hours of project-based training.
artificial-intelligence artificial-intelligence-algorithms data-analysis data-science data-visualization decision-trees deep-learning deeplearning logistic-regression machine-learning machine-learning-algorithms machinelearning num numpy pandas regression scikit-learn scipy sklearn
Last synced: 10 Apr 2026
https://github.com/zimmi48/nixpkgs-issues
Analysis of nixpkgs issue lifetimes.
data-analysis github-api nixpkgs
Last synced: 10 May 2026
https://github.com/amoneva/cacc
An R Package to compute Conjunctive Analysis of Case Configurations (CACC), Situational Clustering Tests, and Main Effects
criminology data-analysis r social-science
Last synced: 15 May 2025
https://github.com/evanwporter/sloth
Faster Pandas Dataframe
cython data-analysis dataframe pandas
Last synced: 14 Mar 2025
https://github.com/virajbhutada/hr-analytics-excel-sql-tableau-powerbi
Explore a comprehensive HR Analytics portfolio showcasing data analysis and visualization skills. Featuring dashboards in Power BI, Excel, and Tableau, along with SQL queries for deeper insights. A holistic view of expertise in HR analytics, data visualization, and database management. Let's dive into the game of data insights!
data-analysis data-management data-visualization excel hr-analytics interactive-dashboards portfolio-project postgresql powerbi powerbi-visuals sql sql-queries tableau tableau-public
Last synced: 02 Aug 2025
https://github.com/virajbhutada/diamond-price-estimator
This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.
cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface
Last synced: 14 Apr 2026
https://github.com/lanzafame/polycarp
[WIP] Subset operations on latlon data read from CSVs
Last synced: 12 Jan 2026
https://github.com/fortunewalla/flight-delays
Data Expo 2009: Airline on time data
airlines data-analysis data-science data36 database dataexpo dataset flightdelays flights ontimedata pgsql postgres postgresql sql tomimester
Last synced: 02 Mar 2026
https://github.com/satvikpraveen/pandasplayground
📊 A comprehensive pandas mastery project with 10 modular Jupyter notebooks covering data loading, cleaning, grouping, merging, time series, visualization, and performance profiling. Includes real-world workflows, Docker, Streamlit, and reusable utils. Ideal for data scientists and analysts to learn, practice, and refer. Practice-ready and modular.
analytics cheatsheet data-analysis data-cleaning data-pipeline data-science data-visualization docker etl exploratory-data-analysis jupyter-notebook jupyterlab learning-resource memory-profiling open-source pandas performance-tuning python streamlit time-series
Last synced: 10 Apr 2026
https://github.com/anuppm9917/super-store-sales-analysis-power-bi-project
Driven to learn which products, regions, categories, and customer segments a company should target or avoid, I searched for and selected a Kaggle dataset that matches a standard superstore's requirements.
data data-analysis data-visualization datacleansing excel exploratory-data-analysis jupyter-notebook numpy pandas plotly powerbi python3
Last synced: 10 Apr 2026
https://github.com/hi-jin2/data-analysis-basics
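A collection of source code written during the Data Analysis Basics (R) class. I studied the R language using the textbook 『모두를 위한 R 데이터 분석 입문』 (Introduction to R Data Analysis for Everyone).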
Last synced: 19 Jul 2025
https://github.com/suwa-sh/tbm-template
A template for starting small with TBM (Technology Business Management)
cost-control data-analysis data-visualization dbt dlt example grafana postgresql sample tbm
Last synced: 19 Jun 2025
https://github.com/moenessgannouni/englandweather
A mini-project that analyzes weather data in England using linear regression and multiple linear regression. Ideal for learning and applying statistical analysis and predictive modeling.
data-analysis data-visualization linear-regression multiple-linear-regression rprogramming
Last synced: 22 Mar 2025
https://github.com/barraharrison/spotify-listening-trends
Using EDA to look at song longevity, regional preferences, and streaming behavior in the charts and on Spotify.
data-analysis data-visualization jupyter-notebook kaggle-dataset
Last synced: 03 Feb 2026
https://github.com/idb-devs/dataanalyticsairbnb
Build a price prediction model that lets an ordinary property owner know how much to charge per night for their property.
data-analysis data-science jupyter python
Last synced: 18 Apr 2026
https://github.com/akmj1011/hill-and-valley-prediction-using-logistic-regression
Created a prediction system using logistic regression to identify hills and valleys in the given datasets
cloud-computing data-analysis data-manipulation data-preprocessing data-transformation data-visualization google-colab
Last synced: 13 May 2026
https://github.com/haideratgh/sql-data-analytics-project
This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis
analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-query sql-server window-functions-in-sql
Last synced: 29 Jun 2025
https://github.com/ronylpatil/whatsapp-group-chat-analysis
This project is a data analysis of our college's official WhatsApp group chat, extracting useful information from it. Extracted features include the most active members of the group, the most active day of the week, the top 10 media contributors in the group, and many more.
data-analysis data-preprocessing data-wrangling feature-engineering
Last synced: 14 Jun 2025
https://github.com/sisolieri/prova_ds_saloocupacio2024
Admission challenge for the Hackató Saló Ocupació by Barcelona Activa
arima barcelona catboost data-analysis data-visualizations forecasting machine-learning pandas public-funding python scikit-learn time-series xgboost
Last synced: 10 Apr 2026
https://github.com/farhad-here/median-performance-comparison
Benchmarking the performance of median calculation using vanilla Python vs NumPy.
data-analysis matplotlib numpy python
Last synced: 18 Apr 2026
https://github.com/mnkanout/patients_medication_prediction
The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.
data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3
Last synced: 29 Jun 2025
https://github.com/mstovarh/analisis-de-bebidas-de-starbucks
This repository contains charts based on various characteristics of Starbucks drinks; I used technologies such as ChatGPT's Data Analysis tool, Excel, and PowerQuery.
chatgpt data-analysis excel powerquery
Last synced: 15 Apr 2025
https://github.com/karlyndiary/spotify-excel-dashboard
Data Analysis on the Spotify Dataset using Microsoft Excel and VBA.
charts data-analysis data-cleaning data-visualization excel excel-export excel-vba pivot-tables
Last synced: 04 Jan 2026
https://github.com/satyam4229/omnify-dataanalysis
Our assessment of Omnify focused on data-driven strategies to maximize profitability. We identified "Product X" as the most profitable product and recommended leveraging the "Wellness Solutions" keyword category for optimal keyword strategy.
data-analysis data-science data-visualization excel omnify
Last synced: 04 Jan 2026
https://github.com/aneeshmurali-n/project-ml-data-preprocessing
The main objective of this project is to design and implement a robust data preprocessing system that addresses common challenges such as missing values, outliers, inconsistent formatting, and noise. By performing effective data preprocessing, the project aims to enhance the quality, reliability, and usefulness of the data for machine learning.
data-analysis data-cleaning data-encoding data-exploration feature-scaling label-encoding matplotlib minmaxscaler numpy one-hot-encoding outlier-detection pandas standardscaler
Last synced: 02 May 2026
https://github.com/serlo/data-pipeline-interactive-exercises
processing pipeline for exercise dashboards
Last synced: 26 Feb 2025
https://github.com/skysign/dat
A study group for learning data analysis together.
data data-analysis data-science
Last synced: 02 Jan 2026
https://github.com/phanchenh/youtube_analysis_rlanguage
Insights into YouTube Channel Performance - A Data-Driven Approach
business-analytics data-analysis data-driven data-visualization etl-pipeline preprocessing r-language r-programming-language
Last synced: 10 Mar 2026
https://github.com/ayberkyavuz/body_type_estimator
This repository is a tutorial for learners of all levels who want to learn how to develop an end-to-end machine learning system.
backend classification css data-analysis dataset end-to-end flask flask-application frontend html javascript machine-learning machine-learning-application material-design materializecss pandas python tutorial webapp xgboost
Last synced: 10 Apr 2026
https://github.com/ronitjariwala/prodigy_ds_02
Prodigy InfoTech Data Science Internship Task-2
Last synced: 28 Apr 2026
https://github.com/andrii04/ga4-gcs-to-bigquery-etl
Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.
automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql
Last synced: 18 May 2026
https://github.com/rahul-jha98/restauranttrends.stats-backend
Application that scrapes the Zomato Dataset and enables the user to visualise the results.
data-analysis data-extraction firebase-storage web-scraping zomato-api
Last synced: 16 Mar 2026
https://github.com/okwilkins/retailanalysis
A comprehensive exploratory analysis and implementation of k-means/hierarchical clustering on online retail data.
data-analysis data-science machine-learning statistics
Last synced: 18 Oct 2025
https://github.com/gattupalli-saketh/sentiment-analysis-on-products-
Product reviews sentiment analysis.
data-analysis machine-learning nlp review-analysis sentiment-analysis sentiment-classification
Last synced: 18 Apr 2026
https://github.com/badranalyst/titanic-survival-prediction-full-data-science-project-classification
This project predicts Titanic survivors using classification models. It includes data cleaning, pre-processing, exploratory data analysis (EDA), categorical feature conversion, model building, and evaluation. Python libraries like Pandas, NumPy, Matplotlib, and Seaborn are used to analyze and predict survival outcomes.
classification data-analysis data-science eda exploratory-data-analysis machine-learning matplo matplotlib-pyplot ml model numpy pandas predictive-modeling python seaborn
Last synced: 06 May 2026
https://github.com/farzeen-2001/superstore_analysis_sql
Analysed the Superstore data using SQL
Last synced: 15 Apr 2025
https://github.com/quocduyenanhnguyen/human-trafficking-analysis
I analyzed human trafficking data
data-analysis data-analytics data-visualization human-trafficking mysql mysql-database mysql-workbench query sql tableau tableau-dashboards tableau-public
Last synced: 02 May 2026
https://github.com/azaz9026/data_cleaning
Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.
data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn
Last synced: 13 Apr 2026
https://github.com/sanchittechnogeek/overscripted-analysis
Geolocation and user language extraction analysis from Mozilla Overscripted dataset
analysis data data-analysis mozilla
Last synced: 23 Mar 2025
https://github.com/bibymaths/python_snippets
A collection of Python scripts for bioinformatics data analysis, including tools for transcription counts, nucleotide composition, and protein sequence evaluation.
amino-acid-scoring bioinformatics data-analysis fasta-generation mathematical-evaluation nucleotide-analysis protein-sequence-analysis transcription-counts
Last synced: 29 Jul 2025