Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-06-23 00:07:29 UTC
https://github.com/monish-nallagondalla/diamondpriceprediction
Diamond Price Prediction is an end-to-end machine learning project that predicts diamond prices based on attributes like carat, cut, color, clarity, and dimensions. It features a Flask web application for real-time predictions and utilizes models such as Linear Regression, Lasso, and Ridge.
data-analysis data-science flask jupyter-notebooks machine-learning predictive-modeling python
Last synced: 06 May 2026
https://github.com/freebirdscrew/covid-19-data-analysis
Coronavirus data analysis using live data streamed from the website, finished with a Dash web app.
coronavirus coronavirus-real-time coronavirus-tracking countryinfo covid-19 covid-19-india covid19 covid19-data dash dash-button dashboard-application data data-analysis data-cleaning data-science data-visualization github jupyter pycountry python
Last synced: 07 May 2026
https://github.com/muneeb1030/eda-of-physionets-ecg
EDA of the PhysioNet dataset "A Large Scale 12 Lead Electrocardiogram Database for Arrhythmia Study 1.0.0". This project focuses on preprocessing electrocardiogram (ECG) signals and uses Principal Component Analysis (PCA) for dimensionality reduction.
12-lead-ecg data-analysis ecg-signal eda pca python3 wfdb
Last synced: 25 Jul 2025
https://github.com/hamzacham/data_set_projet-5
data-analysis data-science database dataset jupyter jupyter-notebook paython training
Last synced: 07 May 2026
https://github.com/sarathchandranpm/restaurant_order_analysis
This project entails an in-depth analysis of a restaurant's order and menu data. The focus is on exploring customer ordering behaviors, menu item attributes, and order specifics. By investigating the connections between order details, menu items, and order dates, the project seeks to generate valuable insights into the restaurant's operations.
Last synced: 10 Apr 2025
https://github.com/backdoorali/insider-threat-detection-project
Personal data analysis project combining insider threat detection, cybersecurity, and exploratory data analytics. Built for portfolio showcase and practical skills demonstration.
cybersecurity data-analysis data-analysis-excel data-analysis-project data-analyst data-analytics data-visualization eda excel insider-threat jupyter-lab jupyter-notebook matplotlib numbers pandas portfolio-project python python3 threat-detection threat-intelligence
Last synced: 07 May 2026
https://github.com/cworld1/novel-analysis
A simple project for analyzing Chinese novels
Last synced: 17 Mar 2025
https://github.com/robinmillford/analytics_for_fashion_supply_management
This Streamlit dashboard provides a comprehensive analysis of supply chain data, focusing on key metrics such as production volumes, stock levels, order quantities, revenue, manufacturing costs, lead times, shipping costs, transportation routes, risk factors, and sustainability factors
dashboard data-analysis data-visualization streamlit supply-chain-management
Last synced: 07 Sep 2025
https://github.com/prime-infinity/type-one
Software to visualize and analyze GitHub repos based on certain statistics such as stars, forks and issues
data-analysis data-visualization
Last synced: 03 Feb 2026
https://github.com/jubinjacob03/heartdiseaseclassify-ml
Heart disease dataset analysis and classification using ML models such as linear regression, support vector machines, k-means, k-nearest neighbors, and logistic regression.
data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine
Last synced: 18 Jan 2026
https://github.com/guglielmo/datalab-notebooks
Data analysis at openpolis
data-analysis data-science jupyter-notebooks pandas python3
Last synced: 08 May 2026
https://github.com/nickchristopherson/duluth-tourism-analysis
End-to-End Data Pipeline for Tourism Industry Analysis
data-analysis data-visualization duluth economic-analysis jupyter pandas pdf-extraction python tourism
Last synced: 08 May 2026
https://github.com/jethronap/jstat-gui
Web-based GUI application for data analysis
data-analysis data-visualization java jstat mongodb
Last synced: 08 May 2026
https://github.com/heiderjeffer/enhancing-digital-maturity-and-analytical-capabilities-of-smes
Research Proposals RP
analytics data-analysis data-driven digital framework jupyter modeling-and-simulation pyrhon quantative smes statistical-analysis stochastic-processes
Last synced: 02 Apr 2025
https://github.com/cagandemirmr/google-play-yorum-analizi
I gathered insights by analyzing variables such as review sentiment, review like counts, the company's replies, and user nicknames for My Supermarket Simulator 3D, the most liked game in Turkey in 2024.
bert data-analysis data-science nlp
Last synced: 10 Jun 2026
https://github.com/aekanshd/crazytics-suicidesindia
Basic interpretation of the Suicides in India data-set using R.
data-analysis data-science graph india r suicides
Last synced: 10 Jun 2026
https://github.com/sermonzagoto/data_manipulation_with_pandas
Data Manipulation with Pandas - Part 1
data-analysis data-science jupyter-notebook pandas-python python
Last synced: 09 May 2026
https://github.com/chigwell/partnershipparser
partnershipparser extracts and structures key info from tech partnership news for easy analysis of companies, focus areas, and impacts
business-analysts collaborating-companies competitive-advantages consistency data-analysis data-integration focus-area goals investors market-opportunities potential-impact researchers standardized-output strategic-alliances strategic-partnerships structured-data tech-industry technology-sector text-extraction trends
Last synced: 14 Jan 2026
https://github.com/jayita11/atliqo-bank-credit-card-launch-eda
This project involves exploratory data analysis and statistical testing for AtliQo Bank's new credit card launch. Key insights include targeting high-income occupations and the 18-25 age group. Recommendations focus on tailored marketing campaigns, education, and incentives to enhance credit card adoption and usage among young adults.
data-analysis hypothesis-testing matplotlib p-value pandas python seaborn statistics z-test
Last synced: 09 Apr 2026
https://github.com/devexpress-examples/aspnet-pivot-grid-custom-aggregates
This example shows how to aggregate data by the field's first value.
asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms
Last synced: 06 Jul 2025
https://github.com/ahmednasef3/udemy-courses-full-eda
Exploratory data analysis of the factors that affect promotions and earnings for Udemy courses, and of how to make a course that sells well on Udemy.
data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib pandas seaborn udemy-course-project
Last synced: 01 May 2026
https://github.com/mariam-badr-mb/gtc-ml-project2-diabetes-prediction
This project is part of the GTC Machine Learning Program. It demonstrates the end-to-end ML workflow by building a predictive model for diabetes detection
classification-algorithm data-analysis data-visualization diabetes-prediction gridsearchcv hyperparameter-tuning machine-learning python
Last synced: 09 May 2026
https://github.com/asifdotexe/air-quality-analysis-aqa
AQA is a data-driven project focused on analyzing air quality data sourced from data.gov.in. The project encompasses data preprocessing, analysis, and visualization to gain insights into air pollution levels across various locations in India. By examining six key pollutants, the project aims to raise awareness about the environmental issues
aqi-analysis data-analysis data-preprocessing data-science data-visualization presentation
Last synced: 07 Jun 2026
https://github.com/happybono/sonatasmooth
Provides three noise reduction algorithms for smoothing data: Rectangular Averaging, Binomial Median Filtering, and Binomial Averaging. It processes data from one list and displays the results in another list.
algorithms average binomial binomial-coefficient binomial-theorem calibration csharp data-analysis data-calibration dynamic-noise-reduction median noise-algorithms noise-reduction noise-reduction-kernel outliers rectangular-averaging windows-desktop windows-desktop-application windows-forms winforms
Last synced: 30 Oct 2025
https://github.com/nafisalawalidris/northwind-traders-sales-analysis
The Northwind Traders Sales Analysis project analyses sales data for a fictitious company. It utilises the Northwind Database and includes SQL queries to provide insights on employees, products, suppliers and revenue. The project aims to help the company gain valuable information for business decision-making.
business-insights data-analysis database northwind-traders sales sql
Last synced: 07 Aug 2025
https://github.com/anushadatta/airbnb-in-seattle
🏨 Understanding the Airbnb rental landscape in Seattle using data science.
airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis
Last synced: 13 Jun 2025
https://github.com/gauranshgoel123/predictive-demand-analysis
Demand Forecasting Project: a web application for predicting future demand for part numbers based on historical data. Built with React for the frontend and FastAPI with Python for the backend, this application visualizes demand trends and allows users to input additional data for improved accuracy. On Render, "analyzer" is the frontend and "analysis" is the backend.
chartjs data-analysis data-science data-visualization dataset deployment full-stack machine-learning numpy pandas predictive-analysis prophet-model python reactjs render
Last synced: 13 Apr 2026
https://github.com/muneeb1030/dataannotation
This project streamlines the process of annotating data for machine learning tasks, making it easier and more efficient for teams to create labeled datasets by leveraging Label Studio and Bulk.
bulk data-analysis data-annotation label-studio python
Last synced: 10 May 2026
https://github.com/zpreisler/modules
Python libraries and modules for processing simulation outputs
data-analysis python scripts tensorflow
Last synced: 13 May 2026
https://github.com/whis99/userfunnelanalysis
An ecommerce user funnel conversion data analysis with matplotlib & python.
data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python
Last synced: 13 Apr 2026
https://github.com/pferreirafabricio/data-immersion
🏊🏻‍♂️ Activities and exercises from the 'Imersão Dados' event
data data-analysis data-science dataset jupiter-notebook python
Last synced: 14 May 2026
https://github.com/incubrain/awesome-maharashtra-data
A collection of datasets specific to Maharashtra, India. WIP
ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi
Last synced: 23 May 2026
https://github.com/jatin-mehra119/bike-rentals-dataset
This repository focuses on optimizing bike rental availability during peak hours and days using machine learning techniques. Leveraging publicly available data from the UCI Machine Learning Repository, it includes scripts for data preprocessing, model training, and visualization, along with detailed observations and results.
data-analysis data-science ensemble-model pandas scikitlearn-machine-learning
Last synced: 15 Apr 2026
https://github.com/0xpr03/clantool
CF Management & Data Analysis Tool, crawler backend in rust
backend-server crawler data-analysis rust
Last synced: 05 Feb 2026
https://github.com/abhi18av/innovation-competition
Submission for a programming challenge
clojure clojurescript data-analysis
Last synced: 13 Jun 2026
https://github.com/gustavo-zamai/product_return_data_analysis
Analysis of product returns across different stores
data-analysis excel pandas plotly-express python3 pywin32
Last synced: 13 May 2026
https://github.com/ryannapp12/quant_trading_engine
A modular and scalable quantitative trading engine built in Python. This project demonstrates efficient data caching with SQLite, concurrent backtesting, and advanced risk analytics, showcasing best practices in clean code architecture and performance optimization.
algorithmic-trading backtesting dash data-analysis data-visualization fintech lstm machine-learning numpy pandas plotly python quantitative-finance real-time risk-management sqlite technical-analysis tensorflow time-series-analysis trading-strategies
Last synced: 11 Apr 2026
https://github.com/kaz-yos/distributed
Comparison of Privacy-Protecting Analytic and Data-sharing Methods: a Simulation Study (Pharmacoepidemiol Drug Saf 2018)
data-analysis epidemiology statistics
Last synced: 15 Jun 2026
https://github.com/luochang212/weibo-analysis
Data analysis based on Sina Weibo.
Last synced: 03 Apr 2026
https://github.com/parmeetbhamrah/air-quality-india-analysis
Exploratory data analysis of real-time air quality data from Indian cities using Python, Pandas, Matplotlib, and Seaborn.
air-quality data-analysis eda exploratory-data-analysis government-data india matplotlib numpy pandas python seaborn
Last synced: 05 May 2026
https://github.com/mindgamesnl/yanderestats
https://mindgamesnl.github.io/YandereStats/
data-analysis reporting-pipeline yandere yandere-sim
Last synced: 18 Jun 2026
https://github.com/rayyan9477/diamond-price-forecasting
This is a comprehensive machine learning project focused on predicting diamond prices. Using a dataset of diamond attributes, the project implements various machine learning models to forecast prices. Key features include data preprocessing, exploratory data analysis (EDA), and model training with algorithms such as Linear Regression, Decision Tree
data-analysis data-science decision-trees eda linear-regression machine-learning
Last synced: 26 Jul 2025
https://github.com/rafiulgits/data-analysis
Data analysis with the Python programming language
classification data-analysis data-mining data-visualization machine-learning mglearn regression regression-models sklearn
Last synced: 08 May 2026
https://github.com/derrickbaruga7/mapping-median-age-europe
An R project that creates an interactive map of the median age across European regions using Eurostat data and spatial visualization packages.
data-analysis data-science data-visualization datascience european-union mapping r
Last synced: 25 Mar 2025
https://github.com/datadotworld/dw-jupyter-contents
Jupyter ContentsManager implementation for data.world
data data-analysis data-science dwstruct-t50-public-projects jupyter jupyter-notebook jupyterlab reference-implementation
Last synced: 22 Jun 2026
https://github.com/rogernet/desafio-profissional-produto-data-driven
Helps train product analysts, PMs, and business managers capable of making strategic, data-driven decisions.
data-analysis data-science data-visualization product
Last synced: 23 Jun 2026
https://github.com/ayaanjawaid/google_playstore_data_analysis
This project provides an in-depth analysis of Google Play Store apps and user reviews, focusing on understanding app performance, user sentiment, and key trends in app categories. Using Python, I performed data cleaning, feature engineering, and exploratory data analysis (EDA) on app data and reviews.
data-analysis eda html numpy pandas-dataframe plotly python vizualisation
Last synced: 24 Feb 2026
https://github.com/fatihilhan42/tourist_analysis_in_turkey_with_python
In this project, the number of tourists visiting Turkey between 2008 and 2021 was analyzed. The data set, which you can find in the repository, was first cleaned using data cleaning algorithms. The cleaned data were then plotted using data visualization algorithms.
data-analysis data-cleaning data-science data-visualization jupyter-notebook python
Last synced: 03 May 2026
https://github.com/zients/tw-lottery-recommandation
Taiwan lottery draw analyzer & number recommender with Transformer ML model. Supports 539, 649, 638, 3D, and 4D lotteries.
cli data-analysis lottery machine-learning python pytorch taiwan transformer
Last synced: 03 May 2026
https://github.com/cassiofb-dev/projetos-intensivao-python
Projects from the Intensivão de Python event by Hashtag Treinamentos.
automation data-analysis data-science data-visualization jupyter-notebook machine-learning python webscraping
Last synced: 03 May 2026
https://github.com/maddieemihle/python-challenge
Creating a Python script that analyzes financial records and election results
Last synced: 09 Jun 2026
https://github.com/ababic/dumpling
Fast, flexible, powerful static data anonymisation for SQL dumps
anonymisation cli data-analysis data-science pii pii-redaction postgres privacy rust rust-lang scrubber scrubbing security tooling
Last synced: 03 May 2026
https://github.com/syed-m-nofel/python-data-science-fundamentals
Python notebooks for data manipulation (Pandas/NumPy) and API workflows – from basics to practical examples.
api beginner-friendly data-analysis data-science http-requests jupyter-notebook numpy pandas pandas-dataframe python tutorial
Last synced: 03 May 2026
https://github.com/joelfaldin/data-analysis
A collection of data-analysis projects I've built over time! ✨⛏️
Last synced: 03 May 2026
https://github.com/devesh8423/machine_learning
Machine Learning practice projects, Jupyter notebooks, and datasets for learning regression, classification, and data analysis.
classification data-analysis data-science data-visualization jupyter-notebook machine-learning matplotlib ml-project numpy-library pandas python regression sckit-learn seaborn
Last synced: 03 May 2026
https://github.com/donmaruko/flask-data-analysis
Flask API for statistical calculations. Data analysis, cleansing, visualization, and manipulation. Documented by Swagger.
api api-rest data-analysis data-science data-visualization datascience flasgger matplotlib pandas seaborn sqlite wordcloud
Last synced: 03 May 2026
https://github.com/bpkaur/whats-in-a-name
Exploring a dataset of first names of babies born in the US to uncover interesting stories
data-analysis datacamp numpy pandas python3
Last synced: 04 May 2026
https://github.com/mindlessmuse666/titanic-data-visualization
A data visualization project on Titanic passengers using the Python libraries Matplotlib, Seaborn, and Plotly.
data-analysis data-visualization matplotlib pandas plotly python seaborn titanic
Last synced: 04 May 2026
https://github.com/nickenshidqia/uber-new-york-data-analysis
Analyze Uber pickups in New York to gain insights from the data
data-analysis data-analyst exploratory-data-analysis python
Last synced: 04 May 2026
https://github.com/fatihilhan42/the-office-eda
Data analysis study of my favorite sitcom, The Office (US).
data-analysis data-science data-visualization fatihilhan office python sitcom
Last synced: 04 May 2026
https://github.com/damisparks/become_data_analyst
Are you new to data analysis? Here you will find simple notebooks that will help you on your journey. These are personal projects that I am still working on.
data data-analysis data-visualization matplotlib numpy pandas-tutorial
Last synced: 04 May 2026
https://github.com/aaaa-source/us-stock-market-analysis-and-prediction
US Stock Market Analysis and Prediction
artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks classification clustering data-analysis finance financial-analysis python
Last synced: 09 Jun 2026
https://github.com/mr-chang95/sf_data_visualization
In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.
business data-analysis data-visualization jupyter-notebook pandas python san-francisco
Last synced: 04 May 2026
https://github.com/marionchaff/real-estate-price-prediction-france
Real estate price prediction using French public database DVF
data-analysis dvf-data machine-learning price-prediction python real-estate scikit-learn
Last synced: 04 May 2026
https://github.com/analitico-771/etf_analyzer
This is an application that pulls and analyzes ETF data from a database
conda-environment data-analysis data-structures data-visualization database etf-investments fintech hvplot pandas-dataframe python quantitative-finance sqlalchemy
Last synced: 04 May 2026
https://github.com/hyperplasma/olympic-visualization-analysis
Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.
data-analysis data-visualization matplotlib numpy pandas python wordcloud
Last synced: 04 May 2026
https://github.com/josewebdev2000/us-violent-crime-data-analysis
Analyzing Violent Crime in the United States of America from 1960 to 2019
data-analysis data-science data-visualization interactive-visualizations jupyter-notebook pandas plotly python
Last synced: 04 May 2026
https://github.com/jendives2000/regressions
A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.
data-analysis linear-regression pearson-correlation-coefficient regression
Last synced: 04 May 2026
https://github.com/ayushsiloiya619/brain-stroke-analysis
Data Analytics with Python
data-analysis matplotlib-pyplot python3 seaborn seaborn-python
Last synced: 05 May 2026
https://github.com/dhruvsrikanth/basic-data-science
A short Data Science Project I took up for fun! This is a data analysis based on a dataset I created to predict the distribution of wealth within an economy as well as several characteristics of each class within society!
analysis data-analysis data-pipeline data-science data-visualization machine-learning matplotlib pandas python seaborn sklearn
Last synced: 05 May 2026
https://github.com/celineboutinon/bookworms
OpenClassrooms Data Analyst 2022-2023 - Project 6
apriori-algorithm data-analysis data-analytics data-visualisation dataframes matplotlib-pyplot mlxtend numpy pandas python scikit-learn scikit-posthocs scikitlearn seaborn statsmodels
Last synced: 05 May 2026
https://github.com/zafir100100/cancer-stage-prediction
This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.
cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn
Last synced: 05 May 2026
https://github.com/cicku/en.650.672
HW of EN.650.672
analytics data-analysis numpy pandas
Last synced: 05 May 2026
https://github.com/monish-nallagondalla/universal-bank
Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.
classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn
Last synced: 05 May 2026
https://github.com/aryar-06/linear-regression
A Python project demonstrating basic linear regression with gradient descent and matrix operations, alongside scikit-learn comparison.
data-analysis data-preprocessing educational-project gradient-descent linear-regression machine-learning python regression-algorithms scikit-learn
Last synced: 05 May 2026
https://github.com/zyna-b/insurance-cost-analysis-and-prediction
Medical insurance EDA and prediction: feature engineering, correlation analysis & Chi-square tests
adjusted-r-squared chisquare-test data-analysis data-science data-visualization eda exploratory-data-analysis linear-regression pandas r2-score sklearn statistical-analysis
Last synced: 05 May 2026
https://github.com/caesaredia/ymusic-project
Exploratory data analysis (EDA) of music streaming behavior in two fictional cities using Python, Pandas, and Jupyter Notebook. It explores user behavior, genre preferences, and listening patterns throughout the week.
data-analysis eda pandas python
Last synced: 05 May 2026
https://github.com/donmaruko/python-eda-toolkit
CLI-run EDA with 30 commands covering text-related functions, statistical calculations, data visualization, and data manipulation.
data data-analysis data-science data-visualization matplotlib pandas scipy seaborn statistical-analysis statistics wordcloud
Last synced: 06 May 2026
https://github.com/ryuzen6/bangalore-real-estate-price-prediction
This is a data science project that predicts the cost of real estate in Bangalore. Requirements: Jupyter Notebook (for data cleaning and creating the linear regression using various Python libraries), PyCharm (Python IDE for creating the Python Flask server), Visual Studio Code (to create the UI with HTML, CSS and JavaScript).
css3 data-analysis data-science html5 javascript jupyter-notebook machine-learning python3
Last synced: 06 May 2026
https://github.com/yashpaneliya/bank-loan-default-analysis
Analyze and understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default.
data-analysis loan-default-analysis matplotlib numpy pandas python
Last synced: 06 May 2026
https://github.com/ankitwalimbe/sentiment-analysis
Sentiment analysis of Amazon Fashion reviews using VADER and a baseline ML model (TF-IDF + SGDClassifier). Includes visualizations, reproducible notebook, and recruiter-ready documentation.
data-analysis machine-learning matplotlib nlp pandas python seaborn sentiment-analysis sklearn
Last synced: 06 May 2026
https://github.com/chaitanyac22/investment-analysis-for-an-asset-management-company
Data analysis to identify the best sectors, countries, and a suitable investment type for making investments.
business-analytics business-intelligence data-analysis data-cleaning data-insights data-manipulation data-preparation data-visualization decision-making finance python3 risk-management statistics
Last synced: 06 May 2026
https://github.com/harryrlk/data_analysis_showcase
This repository showcases my data analysis and visualization projects using Excel, Python, R, and Tableau. Some projects are under NDA, so key figures and specific numbers are not included, but brief overviews and methodologies are provided. Feel free to explore and contact me for further details.
data-analysis data-science data-visualization excel portfolio python r tableau
Last synced: 06 May 2026
https://github.com/abdelmajidlh/cours
Data engineering and data analysis course.
apache-spark big-data data-analysis data-engineering docker jupyter-notebook pyspark
Last synced: 06 May 2026
https://github.com/josepablodmg/python--linear-regression-advertising
A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.
advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization
Last synced: 06 May 2026
https://github.com/fbarffmann/home_sales
Analyzed 25,000+ home sales using PySpark and SparkSQL. Identified pricing trends by year built, home features, and view rating. Optimized query run-time by 70% using caching.
aws big-data data-analysis home-sales parquet pyspark python spark spark-sql sql
Last synced: 06 May 2026
https://github.com/suhas-005/jovian-data-analysis-course-assignment
These are my assignments for the Data Analysis: Zero to Pandas course by Jovian.ai
data-analysis data-analytics numpy pandas python
Last synced: 07 May 2026
https://github.com/whis99/data_analysis_journey
A repository of my data analysis projects.
data data-analysis data-analysis-python data-visualization dataset jupyter-notebook matplotlib python visualization
Last synced: 07 May 2026
https://github.com/devexpress-examples/winforms-pivot-change-the-field-value-header-appearance-backcolor
This example handles the CustomDrawFieldValue event to fill the header's color.
data-analysis dotnet pivot-grid-for-winforms winforms xtrapivotgrid-suite
Last synced: 07 May 2026
https://github.com/syarwinaaa09/modeling-car-insurance-claim-outcomes
A data analysis project on car insurance trends using Python and Jupyter Notebook
car-insurance classic-cars data-analysis data-science jupyter-notebook machine-learning matplotlib pandas python seaborn visualization
Last synced: 07 May 2026
https://github.com/jpgiant/gujaratrainfallanalysis_2021
Analysis about the rainfall that occurred in the districts of Gujarat state in 2021
data-analysis exploratory-data-analysis exploratory-data-visualizations matplotlib numpy pandas-python python
Last synced: 07 May 2026
https://github.com/pedrosfaria2/fugascomhelicoptero
My first use of Jupyter Notebook in a project
analise-de-dados data-analysis jupyter-notebook matplotlib pandas python
Last synced: 07 May 2026
https://github.com/vyjayanthipolapragada/genai_smart_retail_recommendation
GenAI Smart Retail is a recommendation system designed for retail environments. It provides personalized product recommendations to users based on product descriptions using a content-based filtering approach. The system leverages FastAPI for backend integration, allowing users to interact with the recommendation engine via an API. This project aim
content-based-recommendation data-analysis data-science data-visualization fastapi gen-ai instacart-data jupyter-notebook open-ai python3 retail scikitlearn-machine-learning stream
Last synced: 07 May 2026
https://github.com/blladerunner/customer-churn-dashboard
Customer Churn Dashboard — SQL + Python analytics project exploring customer retention patterns, churn rate by demographics and services, and key insights for telecom business strategy.
business-intelligence churn-analysis customer-retention dashboard data-analysis data-analytics data-science pandas powerbi python sql sqlite telecom
Last synced: 08 May 2026
https://github.com/danmadeira/algoritmos-estatistica-python
Demonstration of statistics algorithms in Python
algorithms data-analysis data-science python statistics
Last synced: 08 May 2026
https://github.com/phanchenh/adventureworkdataset-rfm-analysis-sqlproject
RFM Analysis Using SQL on the AdventureWorks Dataset (2011-2014)
business-analytics business-intelligence data-analysis mssql rfm-analysis sql
Last synced: 10 Jun 2026
https://github.com/0290192029/apartment-price-predictor
A Python project for predicting apartment rental prices using linear regression. Practical work on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".
apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn
Last synced: 08 May 2026
https://github.com/ayushsiloiya619/online-food-orders-analysis
Data Analytics with Python
data-analysis data-visualization matplotlib pandas-dataframe python3 seaborn-python
Last synced: 08 May 2026