Data analysis
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
- GitHub: https://github.com/topics/data-analysis
- Wikipedia: https://en.wikipedia.org/wiki/Data_analysis
- Last updated: 2026-07-01 00:07:23 UTC
https://github.com/mahmoudnamnam/superstore-analysis
This project explores the SuperStore dataset to uncover insights into sales, profit, and customer behavior. It identifies key trends, regional variations, and product performance, using data analysis and machine learning techniques to guide business strategy and optimize performance.
clustering data-analysis data-science data-visualization geopandas jupyter-notebook machine-learning numpy pandas plotly regression seaborn sklearn
Last synced: 12 Apr 2026
https://github.com/mostafa-ghorab/global-happiness-analysis
An analysis of global happiness rankings based on various factors like GDP, family support, health, and freedom from the World Happiness Report (2015-2017). This project provides data visualizations and statistical insights into how these factors influence happiness scores in different regions.
business-analysis data-analysis data-visualization matplotlib numpy pandas python seaborn
Last synced: 12 Apr 2026
https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges
Covid-19 analysis and impact on United States Colleges and States using SQL and Tableau.
covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau
Last synced: 04 Sep 2025
https://github.com/stoll-jonathan/sorting_algorithm_analyzer
C++ program which analyses the performance of different sorting algorithms on a dataset of random numbers
bubble-sort data-analysis insertion-sort merge-sort sorting-algorithms
Last synced: 01 Apr 2025
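As a rough illustration of the comparison this entry describes, the sketch below times the three tagged algorithms on a list of random numbers. It is written in Python rather than the repository's C++, and the function names, input size, and seed are assumptions chosen for the example, not taken from the project.

```python
import random
import time


def bubble_sort(values):
    """Swap adjacent out-of-order pairs until no swaps remain (O(n^2))."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:  # no swaps in this pass: already sorted
            break
    return items


def insertion_sort(values):
    """Grow a sorted prefix by inserting each element in place (O(n^2))."""
    items = list(values)
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


def merge_sort(values):
    """Split in half, sort each half recursively, then merge (O(n log n))."""
    items = list(values)
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def time_sorts(data):
    """Return {algorithm name: (elapsed seconds, sorted list)}."""
    results = {}
    for sorter in (bubble_sort, insertion_sort, merge_sort):
        start = time.perf_counter()
        output = sorter(data)
        results[sorter.__name__] = (time.perf_counter() - start, output)
    return results


rng = random.Random(0)  # fixed seed so runs use the same input
sample = [rng.randint(0, 10_000) for _ in range(500)]
for name, (seconds, _) in time_sorts(sample).items():
    print(f"{name}: {seconds:.4f} s")
```

Timings vary by machine, so no expected output is given. On larger inputs the gap between the quadratic sorts and merge sort should widen.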
https://github.com/yash-3-bit/online-sales-analysis
Project merging the datasets for different months and performing data cleaning, analysis, and visualization.
data-analysis data-visualization pandas-library
Last synced: 27 Mar 2025
https://github.com/robson-python/academic-performance
Project to evaluate students' academic performance.
csv-import data data-analysis data-science jupyter-notebook machine-learning matplotlib pandas python scikit-learn seaborn vscode
Last synced: 12 Apr 2026
https://github.com/ernanej/data-science-dca0131
Files developed throughout the 2024.1 semester of the Data Science course taught at the Federal University of Rio Grande do Norte by the Department of Computer Engineering and Automation (DCA). 📚
big-data data-analysis data-science ia
Last synced: 30 Mar 2025
https://github.com/apoorvalal/misc_stata_ados
Misc Utility programs in Stata.
data-analysis stata stata-command
Last synced: 04 Feb 2026
https://github.com/hemangsharma/breast-cancer-patient-dashboard
This interactive Streamlit dashboard visualizes insights from the SEER Breast Cancer Dataset (2006-2010)
data-analysis streamlit streamlit-dashboard streamlit-webapp
Last synced: 05 May 2026
https://github.com/tanaybhadula/twitter-trends-dashboard
An interactive dashboard that visualizes data on current Twitter trends by country and globally. It collects data for over 60 countries using the Python Tweepy library, processes it, and visualizes it as bar and pie charts using the Plotly Dash framework.
dash dashboard data-analysis data-visualization plotly python trends twitter
Last synced: 31 May 2026
https://github.com/theveryhim/dimensionality-reduction-and-clustering
Simple ML-like data analysis and processing.
autoencoder clustering data-analysis dimensionality-reduction pca
Last synced: 10 Sep 2025
https://github.com/mpoojithavigneswari/bangalore-house-price-prediction
This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.
data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn
Last synced: 12 Apr 2026
https://github.com/jacksonrakena/lcg-toolkit
Simulator and plotter for linear congruential generator (LCG) functions in Python
borland congruence congruent congruential data-analysis data-science generation generator lcg linear linear-congruential-generator numerical-recipes randu randu-function randu-generator randu-rng rng
Last synced: 31 Aug 2025
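For readers unfamiliar with the technique named in this entry, here is a minimal sketch of a linear congruential generator, using the classic RANDU constants that the tags mention. The function names are illustrative assumptions and are not taken from the repository.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield the sequence x_{n+1} = (a * x_n + c) mod m indefinitely."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def randu(seed=1):
    """RANDU: multiplier 65539, increment 0, modulus 2**31 (use an odd seed)."""
    return lcg(seed, a=65539, c=0, m=2**31)


first_values = list(islice(randu(1), 5))
print(first_values[:3])  # [65539, 393225, 1769499]

# RANDU's known weakness: each value is a fixed linear combination of the
# previous two, x_{k+2} = (6 * x_{k+1} - 9 * x_k) mod 2**31.
for x0, x1, x2 in zip(first_values, first_values[1:], first_values[2:]):
    assert x2 == (6 * x1 - 9 * x0) % 2**31
```

That linear relation is why consecutive RANDU triples fall on a small number of planes when plotted in three dimensions.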
https://github.com/skivhisink/econometricnsu
One-semester graduate course in econometrics for first-year master's students at the Faculty of Economics, Novosibirsk State University (NSU)
data-analysis econometrics economics education nsu r
Last synced: 09 Apr 2025
https://github.com/quantumudit/groceries-basket-analysis
This project performs market basket analysis using Power BI and Python to reveal associations between grocery items. It involves transforming raw transaction data into a processed dataset, creating interactive Power BI reports, and generating key insights through Python, enabling data-driven decision-making.
data-analysis data-visualization pandas powerbi python
Last synced: 12 Apr 2026
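Market basket analysis, as mentioned in this entry, usually rests on three measures for an item pair: support, confidence, and lift. The sketch below computes them with the standard library only. The transactions are invented for illustration and `pair_rules` is a hypothetical helper, not code from the repository.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions):
    """Compute support, confidence, and lift for every ordered item pair."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # vs. P(rhs) alone
            rules[(lhs, rhs)] = {
                "support": support,
                "confidence": confidence,
                "lift": lift,
            }
    return rules


# Hypothetical grocery baskets, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
rules = pair_rules(transactions)
print(rules[("bread", "milk")])
```

A lift above 1 suggests the two items appear together more often than independence would predict, and a lift below 1 suggests the opposite.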
https://github.com/bpkaur/exploring-the-evolution-of-linux
This project explores the evolution of the Linux kernel by finding the top 10 contributors and visualizing commits over the years.
data-analysis data-science datacamp ipynb-jupyter-notebook python3
Last synced: 21 Feb 2026
https://github.com/abdelrahmanbayoumi/titanic-machine-learning-from-disasters
Given a training set listing passengers who did or did not survive the Titanic disaster, can our model determine, from a test dataset that omits the survival information, whether those passengers survived?
data-analysis data-science data-visualization machine-learning pandas
Last synced: 09 Apr 2025
https://github.com/borjamome/soho_cholera
Cholera deaths in the Soho District (London)
data-analysis data-visualization london r
Last synced: 04 Sep 2025
https://github.com/grooviter/tablesaw
Java dataframe and visualization library
data-analysis dataframe java visualization
Last synced: 28 Mar 2025
https://github.com/malk97sc/data_science
Data Science Projects
data-analysis data-science data-visualization
Last synced: 20 Jun 2026
https://github.com/sreekar0101/electric-vehicle-market-growth-and-incentive-impact-analysis-dashboard
This project involves the development of a comprehensive Tableau dashboard to analyze the growth and market dynamics of electric vehicles (EVs). The dashboard reveals key insights, including a 20% increase in EV adoption over five years, the dominance of Battery Electric Vehicles (BEVs) which make up 60% of the market
data-analysis data-visualization tableau-desktop
Last synced: 07 Jan 2026
https://github.com/kernix13/github-readme-seo-analysis
A Jupyter Notebook analysis of GitHub README and repo SEO to determine what makes a repo rank in the SERPs
accessibility data-analysis readme seo seo-analysis
Last synced: 29 May 2026
https://github.com/yanny-alt/competitor-sales-analysis-in-power-bi
This project aims to analyze competitor sales for a fictional manufacturing company, Sintec, using Power BI. The focus is on integrating, cleaning, and modeling data from multiple sources to generate insightful reports on company and competitor performance.
data-analysis powerbi sales-analysis
Last synced: 07 Jan 2026
https://github.com/edprice25/us-states-analysis
Presents a series of visualizations for folks looking to relocate to more affordable areas in the US. Click on my link below to see a full analysis.
data-analysis jupyter-notebook matplotlib pandas python us-states
Last synced: 04 Jul 2025
https://github.com/kseniatyschuk/excel-data-matcher
Compare and match Excel files via a simple Python GUI
automation data-analysis etl excel gui pandas python3 tkinter
Last synced: 23 Apr 2025
https://github.com/cassandrajm/reddit-dashboard
INTERACTIVE DASHBOARD: Analyzing Political Discourse on Reddit: A Multi-Faceted NLP Approach to Toxicity, Bias, and Political Stance
capstone data data-analysis data-science politics python reddit
Last synced: 09 Apr 2025
https://github.com/ehsan-behzadi/online-retail-data-analysis-and-preprocessing
This project analyzes and preprocesses the Online Retail dataset to uncover insights into customer purchasing behaviors, sales trends, and product performance. It includes data cleaning, exploration, and visualization, with the goal of enhancing understanding of online retail dynamics.
cohort-analysis data-analysis data-cleaning data-exploration duplicate-detection exploratory-data-analysis-eda feature-encoding feature-engineering handling-missing-values online-retail outlier-detection preprocessing trends-visualization visualization z-score-method
Last synced: 16 Apr 2026
https://github.com/nilayhangarge/data-analysis-with-python
This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.
data-acquisition data-analysis data-analytics data-binning data-cleaning data-engineering data-fundamentals data-insights data-integration data-preprocessing data-science data-wrangling numpy pandas python
Last synced: 12 Apr 2026
https://github.com/noorulhudaajmal/customer-segmentation-analysis
Customer segmentation and analysis of purchasing behaviour
cluster-analysis customer-segmentation data-analysis
Last synced: 07 Oct 2025
https://github.com/giatraskon/clustering_algorithms_analytical_and_computational
Analytical and computational exploration of clustering algorithms, focusing on k-means and k-medians, with MATLAB implementations and synthetic dataset analyses.
clustering computational-mathematics data-analysis data-science data-visualization k-means k-means-clustering k-median-clustering k-medians k-medoids k-medoids-clustering machine-learning matlab noise-robustness numerical-methods outlier-detection possibilistic-clustering-algorithms statistical-analysis synthetic-data unsupervised-learning
Last synced: 21 Mar 2025
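The practical difference between k-means and k-medians lies in how each cluster center is updated: by the mean or by the median of its assigned points. The one-dimensional sketch below makes that visible on a tiny dataset with a single outlier. It is in Python rather than the repository's MATLAB, and the data, starting centers, and `cluster_1d` helper are assumptions made for the example.

```python
from statistics import mean, median


def cluster_1d(points, centers, update, iterations=10):
    """Lloyd-style iteration on 1-D data with a pluggable center update."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Recompute each center; keep the old one if its group is empty.
        centers = [update(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Two tight groups plus one outlier (100.0); values are invented.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 100.0]
start = [0.0, 15.0]

kmeans_centers = cluster_1d(data, start, mean)
kmedians_centers = cluster_1d(data, start, median)
print(kmeans_centers)    # [6.5, 100.0]
print(kmedians_centers)  # [2.0, 11.5]
```

With the mean update, the outlier drags the second center away until it claims a cluster of its own and the two real groups merge. With the median update, both groups are recovered. This is only a toy case, not a general statement about either algorithm.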
https://github.com/chinmayee4/vrinda_store_data_analysis
Analyzed data by creating an interactive dashboard using MS Excel
data-analysis data-cleaning data-visualization excel-dashboard pivot-tables power-query
Last synced: 07 Jan 2026
https://github.com/jbalooshie/election_analysis
A Python script built to analyze a specific election's results and to be repurposed for analyzing the results of other elections. The script provides different breakdowns of the vote based on candidate and county,
data-analysis data-science elections python
Last synced: 09 Apr 2025
https://github.com/krypten/nycsubwayturnstileweatheranalysis
Analyzing the NYC Subway Dataset
data-analysis machine-learning machinelearning python
Last synced: 01 Sep 2025
https://github.com/wisdom-osborn/data-analytics-course-online-
🔍 Data Analytics with Python: Hands-on Course Materials. Jupyter notebooks, projects, and datasets based on the freeCodeCamp Data Analysis with Python certification. Learn NumPy, Pandas, data cleaning, and visualization through real-world examples
data data-analysis data-science data-visualization freecodecamp numpy pandas pandas-dataframe project python
Last synced: 19 Apr 2026
https://github.com/rosanafss/predictive-analytics-for-business-nanodegree
Predictive Analytics for Business Nanodegree
alteryx alteryx-server alteryx-workflow clustering data-analysis data-visualization datacleaning experiment filtering join modeling prediction preparation randomized regression segmentation summarization tableau testing workflow
Last synced: 04 Feb 2026
https://github.com/bkataru/physics-e.e
Project repository for IB physics extended essay. Topic: Predictive data modeling of a variable binary star’s brightness over a period of time using astrostatistics.
astrometry astronomical-algorithms astronomical-images astronomy astrophotography astrostatistics data-analysis data-science data-visualization modeling physics polynomial-regression regression-analysis
Last synced: 09 Apr 2025
https://github.com/saroshfarhan/dublin_pedestrian_data_analysis
Pedestrian footfall data analysis for the city of Dublin
data-analysis data-visualization r-programming
Last synced: 07 Jan 2026
https://github.com/rohitblaze10/netflix_analysis_using_tableau
The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.
data data-analysis data-science data-visualization netflix tableau
Last synced: 04 Feb 2026
https://github.com/farukalamai/rainfall-prediction-and-forecasting
rainfall-prediction-and-forecasting
arima-forecasting data-analysis data-visualization deep-learning fb-prophet lstm-neural-network machine-learning rainfall-prediction stationary-data statistical-modeling time-series-analysis time-series-forecasting xgboost
Last synced: 12 Jun 2025
https://github.com/saigeethika05/global-connect
International Student Engagement Platform
data-analysis figma prototyping ui-design ux-design wireframes
Last synced: 04 Jul 2025
https://github.com/astropenguin/optimap
Optimized integrated intensity map method for spectral cubes
astronomy data-analysis data-science python python3 radio-astronomy spectral-cubes
Last synced: 09 Apr 2025
https://github.com/bhaskarbharati/ibm-datascience-hands-on-lab
A basic hands-on exercise using Jupyter Notebook, completed as part of the IBM course Tools for Data Science.
data-analysis data-science data-visualization datawrangling eda machine-learning
Last synced: 23 Apr 2025
https://github.com/giog97/find_similar_tables_on_pubtables-1m
Find similar tables on the PubTables-1M dataset
data-analysis data-visualization datamining dm tables
Last synced: 09 Apr 2025
https://github.com/victorlcastro-dsa/pbl-datacamp
This repository features projects from DataCamp's Project-Based Learning (PBL) courses, showcasing practical applications of data analysis, machine learning, and visualization. Explore real-world datasets and interactive results that highlight the skills gained through hands-on learning.
data-analysis data-science data-visualization datacamp-projects hypothesis-testing machine-learning project-based-learning
Last synced: 30 Jun 2026
https://github.com/ankitmishralive/machinelearning
Continuously deepening my understanding of and expertise in Machine Learning through ongoing education and hands-on practical experience.
artificial-intelligence data-analysis data-cleaning data-gathering machine-learning machinel-learning-algorithms matplotlib numpy pandas python seaborn
Last synced: 22 Mar 2025
https://github.com/rachit1084/sql-practice-ankit-bansal
Personal SQL problem-solving practice based on Ankit Bansal's YouTube series, with logic-driven solutions for analyst prep.
analytics data-analysis data-analyst interview-preparation logical-reasoning postgresql sql sql-practice
Last synced: 04 Jul 2025
https://github.com/lukeskywalkerii/website
data-analysis data-visualization powerbi python r sql statistics
Last synced: 12 Apr 2026
https://github.com/francois-lenne/eletric_vehicle_usa
This project is purely educational; the main goal is to use Fabric.
data-analysis data-engineering delta-lake fabric jupyter-notebook pyspark python spark
Last synced: 12 Apr 2026
https://github.com/faysalalmahmud/bd-med-professional-analysis
Analysis of healthcare professionals in Bangladesh through web scraping, data processing, and interactive visualization.
data-analysis data-visualization jupyter-notebook python scraper selenium selenium-webdriver tableau
Last synced: 04 Sep 2025
https://github.com/soajala/shopify-sales-analysis-powerbi
End-to-end Power BI dashboard project analyzing Shopify sales data with real-time metrics, DAX, and business insights.
business-intelligence data-analysis data-visualization dax interactive-dashboard powerbi sales-analysis shopify
Last synced: 05 Sep 2025
https://github.com/muhammed-fazal/student-success-and-early-intervention-analytics-system
Consolidates scattered student performance records into a unified data warehouse in SQL Server, builds interactive Power BI dashboards that visualize academic trends and student performance, and implements predictive analytics.
analysis analytics dashboard data data-analysis data-engineering data-science data-visualization database etl etl-pipeline power-bi powerbi python sql sql-server
Last synced: 29 May 2026
https://github.com/shridhar1504/loan-classification-datascience-project
This project uses machine learning algorithms to predict the classification of loan status. The dataset is loaded and transformed using SQL to produce a clean dataset with valid information.
classification data-analysis data-cleaning data-science data-visualization eda loan-prediction loan-status machine-learning predictive-modeling sql supervised-learning
Last synced: 09 Apr 2025
https://github.com/noeldevelops/stem-degrees-analysis-cpp
C++ Data Analysis, I/O - takes an external data file for processing, performs some statistical analysis, and displays the results in the console
Last synced: 29 May 2026
https://github.com/wo0fle/sfrcp
The program used for a research study I conducted: "Comparison of Star Formation Rate in Spiral versus Elliptical Galaxies."
astronomy astropy data-analysis galaxy jupyter-notebook python research research-project
Last synced: 03 Apr 2025
https://github.com/ymorsi7/caliwageanalysis
California employment and wage analysis on data from the past decade.
data-analysis data-science ipynb jupyter-notebook
Last synced: 21 Jan 2026
https://github.com/trivediayush/analysis-work
anayltics business-analytics data-analysis excel excel-dashboard powerbi powerbidashboard
Last synced: 04 Feb 2026
https://github.com/LipunKumarDalai/Youtube-Analysis
A simple data analysis project on YouTube data.
apache-superset beautifulsoup bootstrap5 data-analysis data-visualization django html jupyter-notebook postgresql-database python scraping selenium-webdriver sqlite-database youtube-api
Last synced: 30 Dec 2025
https://github.com/angchekar28/valorant-gameplay-analysis
This project analyzes Valorant gameplay data to understand key factors affecting match outcomes. It compares various machine learning models to predict player performance, rank classification, and match success.
data-analysis data-science data-visualization exploratory-data-analysis jupyter-notebook machine-learning python
Last synced: 12 Apr 2026
https://github.com/wtmcgrew/sql-credit-risk-analysis
Credit Risk Analysis using SQL & Excel – Approval trends by FICO, DTI, PTI, LTV, and delinquencies.
case-study credit-risk data-analysis financial-analysis loan-applications portfolio-project sql sqlite underwriting
Last synced: 04 Jul 2025
https://github.com/sarthakagg29/sql-share-trading-analysis
Analysis of share trading transactions using SQL. Includes table setup, sample data, and a variety of queries to answer typical business questions about stocks and trading.
data-analysis dbeaver portfolio postgresql share-market sql
Last synced: 04 Jul 2025
https://github.com/camara94/data_analyse_series_temporelles
In this tutorial, we answer the following questions: 1. Read the Microsoft data using the **Pandas Data reader** package 2. Get the **maximum price** of the stock from **2017 to 2022** 3. What is the **date of the highest price** of the stock? 4. What is the **date of the lowest price** of the stock?
data-analysis data-analysis-python data-science data-structures-and-algorithms data-visualization serie series-forecasting
Last synced: 09 Apr 2025
https://github.com/smoeding/jmeterplugin-datasketches
A JMeter listener using DataSketches to estimate response time quantiles and histograms
data-analysis jmeter jmeter-listeners jmeter-plugin
Last synced: 06 Mar 2025
https://github.com/shrinidhi857/simpledataanalysisonstartups
The Indian startup ecosystem has experienced remarkable growth over the past decade, becoming a hotbed of innovation and entrepreneurship. In this data analysis, we segregate fields and look for new insights.
data-analysis data-science data-visualization indian-startups
Last synced: 17 Sep 2025
https://github.com/doughtnerd/pod-old
Read and write Excel data
data data-analysis excel poi-library workbook
Last synced: 21 Jan 2026
https://github.com/anudeepkaddala/bankds
This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks from two datasets based on their names and associate each bank with its respective asset size. The final output is a cleaned dataset with asset sizes in Indian-style currency format.
data-analysis data-science fuzzy-matching pandas python
Last synced: 12 Apr 2026
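Two steps in this entry lend themselves to a short sketch: fuzzy name matching and Indian-style digit grouping (the last three digits, then groups of two). The version below uses only the standard library. The bank names, the 0.8 cutoff, and both helper functions are assumptions for illustration and may differ from what the repository actually does.

```python
import difflib


def format_indian(amount):
    """Group digits Indian-style, e.g. 123456789 -> '12,34,56,789'."""
    sign = "-" if amount < 0 else ""
    digits = str(abs(int(amount)))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:  # peel off pairs from the right
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    return sign + ",".join(groups + [tail])


def match_bank(name, candidates, cutoff=0.8):
    """Return the candidate closest to `name` (case-insensitive), or None."""
    lookup = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


# Hypothetical names from two datasets, invented for this example.
other_dataset = ["State Bank Of India Ltd", "HDFC Bank"]
print(match_bank("State Bank of India", other_dataset))  # State Bank Of India Ltd
print(format_indian(123456789))                          # 12,34,56,789
```

A higher cutoff reduces false matches at the cost of missing names that differ by more than a suffix or a typo, so in practice the threshold would need tuning against real data.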
https://github.com/nsandoya/python_scrp_project
This tool was built specifically for the Dipaso e-commerce website. You can extract data from it, analyze it, and see the frequency of keywords, brands, and categories, price distributions, and other market trends, all in a set of friendly statistical tables and charts (exported from a Jupyter notebook) :)
beautifulsoup4 data data-analysis jupyter-notebook pandas python3
Last synced: 28 Apr 2026
https://github.com/fbarffmann/python-challenge
Automated financial and election data analysis using Python. Cleaned and transformed large CSV datasets, calculated key business metrics, and generated automated reports for stakeholders.
automation csv data-analysis data-cleaning election-analysis financial-analysis python reporting
Last synced: 24 Apr 2025
https://github.com/sumit0ubey/internship
This repository showcases the tasks and projects I completed during various internships. It includes work across diverse domains such as: Data Analysis: Exploratory data analysis, data visualization, and insights generation using Python and libraries like Pandas, Matplotlib, and Seaborn. Backend Development: Designing and implementing RESTful API
backend-development data-analysis python-developer
Last synced: 05 Sep 2025
https://github.com/itskshitija/lego-set-explorer
As a part of the Maven Analytics Lego challenge, I developed an interactive Power BI dashboard exploring the evolution of LEGO sets from 1970 to 2022.
data-analysis data-science data-visualization dataanalysis dataset powerbi powerbi-desktop powerbi-report
Last synced: 12 Jun 2025
https://github.com/fbarffmann/nosql-challenge
Analyzed 28,000+ UK restaurant records using MongoDB and PyMongo. Queried hygiene scores, location data, and customer ratings.
data-analysis data-cleaning database-analysis json mongodb nosql pymongo python restaurant-data
Last synced: 13 Apr 2026
https://github.com/avratanubiswas/fluorpenplugin
A MATLAB user interface for analysing OJIP curve datasets from the FluorPen instrument, serving as an additional plug-in for "quick categorical analysis".
data-analysis fluorpen ojip-curve
Last synced: 18 Mar 2026
https://github.com/fbarffmann/sqlalchemy-challenge
Built a Flask API with SQLAlchemy to analyze and visualize Hawaii climate data. Automated data extraction and developed database queries for temperature and precipitation insights.
api climate-data data-analysis data-visualization flask orm python sql sqlalchemy sqlite
Last synced: 13 Apr 2026
https://github.com/alinenog/desenvolve_gb_2022
Grupo Boticário's Desenvolve 2022 training program in the data field
data-analysis data-science googlesheet machine-learning numpy pandas python
Last synced: 13 Apr 2026
https://github.com/hazim-hf/data-science
This course covers basic data science principles, Python programming, and the concept of big data and its types. It explores algorithms, methods, and analyses in data science with practical Python examples. Additionally, it highlights current data technologies for storing and archiving.
data-analysis data-wrangling time-series
Last synced: 04 Jul 2025
https://github.com/nullthefirst/py-notebooks
Jupyter Notebooks holding Data Science projects
data-analysis data-science data-visualization datasets jupyter-notebooks python
Last synced: 26 Apr 2026
https://github.com/ironlegion88/media_bias
An end-to-end NLP pipeline to analyze ideological bias in online news media during elections. Uses sentiment analysis, topic modeling (LDA/NMF), and NER to quantify media framing.
data-analysis machine-learning media-bias nlp nltk political-science python scikit-learn sentiment-analysis spacy topic-modeling
Last synced: 13 Apr 2026
https://github.com/zkan/python-for-data-scientists
Python for Data Scientists
data-analysis data-science data-scientists machine-learning pandas python
Last synced: 13 Apr 2026
https://github.com/lexiortiz/advanced-data-analytics
Structured learning notes, code snippets, and key takeaways from the Google Advanced Data Analytics Professional Certificate. Serves as a personal reference for reinforcing concepts and as a resource for others on a similar learning journey.
data data-analysis data-engineering google python-3 sql
Last synced: 29 May 2026
https://github.com/darrenjolson/pba-analysis-app
Data analysis and visualization tool for professional bowling tournaments, predicting performance across different oil patterns and venues.
bowling data-analysis data-visualization flask pba predictive-analytics python reactjs sports-analytics
Last synced: 13 Apr 2026
https://github.com/nurulashraf/polynomial-regression-manufacturing
A Python project implementing polynomial regression to analyse and predict manufacturing-related data. Features include data preprocessing, model training, and visualisation of results. Ideal for exploring machine learning applications in manufacturing process optimisation.
data-analysis data-visualization machine-learning manufacturing polynomial-regression predictive-modeling process-optimization python regression-models scikit-learn
Last synced: 16 Apr 2026
https://github.com/analysisbyvivek/Road-Accident
Analyzes road accident patterns, exploring factors like lighting, weather, speed limits, time of day, and road conditions to uncover trends in severity and frequency.
data-analysis data-visualization eda jupyter-notebook kaggle tableau-public
Last synced: 29 Jan 2026
https://github.com/analysisbyvivek/Crime-data
Analyzes crime patterns across different areas, exploring factors such as crime type, weapon usage, demographic influences, and geographic distribution to uncover trends in frequency, correlations, and hotspots.
apache-superset data-analysis eda jupyter-notebook python
Last synced: 29 Jan 2026
https://github.com/ashleydavis/brisjs-data-analysis-talk
Code for my talk to BrisJS on data analysis in JavaScript
charting data-analysis data-visualization data-viz javascript node node-js nodejs visualization
Last synced: 25 Mar 2025
https://github.com/borjamome/accidentes_madrid
Analysis of accidents in Madrid using SQL (2023)
accidentes-coche data-analysis madrid sql
Last synced: 17 Jan 2026
https://github.com/parthds02/e-commerce-data-analysis-with-python
This project focuses on analyzing an e-commerce dataset using Python. The goal is to derive meaningful insights through exploratory data analysis (EDA) and uncover trends and patterns that can drive business decisions.
data-analysis ecommerce exploratory-data-analysis jupyter-notebook pytho sales-analysis visualization
Last synced: 13 Jun 2025
https://github.com/amoghkori/working-with-apache-spark-mllib
Used Apache Spark MLlib to analyze a large car dataset, predict car selling prices, and gain insights into the car market.
amazon-web-services data-analysis data-visualization exploratory-data-analysis linear-regression machine-learning model-selection pyspark python random-forest sagemaker spark
Last synced: 13 Apr 2026
https://github.com/nature40/casestudies
Case studies for testing the functionality of database systems, sensors, etc
casestudies data-analysis data-visualization database
Last synced: 02 May 2026
https://github.com/ireneflorez/nypd-mvc
Analysis of NYPD Motor Vehicle Collisions
basemap data-analysis folium jupyter-notebook matplot pandas python
Last synced: 08 May 2026
https://github.com/hassanislam463/data-cleaning-and-modelling-top-5-categories-analysis-forage
This project involves cleaning, merging, and analyzing datasets to identify the top 5 performing categories based on aggregate popularity scores. It includes cleaned datasets, a final merged dataset, visualizations, and a presentation summarizing the tasks and results. Tools used: Microsoft Excel, Python, and PowerPoint.
data-analysis data-visualization microsoft-excel
Last synced: 07 Jan 2026
https://github.com/sco1/xbmini-py
Python Toolkit for the GCDC HAM
data-analysis data-visualization python python3
Last synced: 07 May 2025
https://github.com/ljadhav25/data-engineering-poc
This repository contains a beginner-level Data Engineering Proof of Concept (POC) project designed for practice. The objective is to provide hands-on experience with data engineering concepts, including data extraction, transformation, loading (ETL), and basic data analysis. This project is ideal for those looking to build foundational skills in da
data-analysis etl matplotlib numpy pandas python
Last synced: 13 Apr 2026
https://github.com/cezlul/analyse-ventes-immobilier
ML solution for Parisian real-estate analysis: automatic classification of apartments vs. commercial premises (K-means, 91%) and price prediction (regression, R²=0.98) on 26K transactions. Values a €169M portfolio with data-driven strategic recommendations.
data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python sklearn
Last synced: 13 Apr 2026
https://github.com/damiieibikun/web-scrapping-and-python-data-visualization-on-top-500-movies-imdb
Web scraping and Python data visualization on the IMDb Top 500 movies
beautifulsoup4 data-analysis data-visualization matplotlib-pyplot numpy pandas plotly-express python requests seaborn web-scraping
Last synced: 13 Apr 2026
https://github.com/extwiii/datascience-jhu
Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera
biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization
Last synced: 05 Jul 2025