https://github.com/sankaran-s2001/layoffs-sql-analysis
A complete SQL data cleaning and analysis project using MySQL to analyze global company layoffs from 2020 to 2023.
- Host: GitHub
- URL: https://github.com/sankaran-s2001/layoffs-sql-analysis
- Owner: sankaran-s2001
- License: mit
- Created: 2025-09-04T15:39:41.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-10-08T04:43:42.000Z (9 months ago)
- Last Synced: 2025-10-08T06:24:11.408Z (9 months ago)
- Topics: data-science, datacleaning, eda, kaggle, layoffdata, layoffs, mysql, sql
- Size: 413 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
README
# SQL Layoffs Data Analysis

A complete SQL data cleaning and analysis project using MySQL to analyze global company layoffs from 2020 to 2023.
## What This Project Does
This project cleans a messy, real-world layoffs dataset with SQL, then analyzes it to show which companies, industries, and locations were most affected.
## About the Data
- **Source**: [Layoffs Dataset from Kaggle](https://www.kaggle.com/datasets/swaptr/layoffs-2022)
- **Size**: 2,300+ layoff records
- **Time period**: 2020 to 2023
- **Coverage**: Companies worldwide
- **Industries**: Tech, Finance, Retail, Healthcare, and more
## What I Did - Data Cleaning
### Step 1: Remove Duplicates
- Found and removed duplicate records
- Used SQL window functions to identify copies
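A minimal sketch of how duplicates can be flagged with `ROW_NUMBER()`, assuming MySQL 8.0 or later. The column names below are assumptions about the Kaggle CSV, and `layoffs_staging` is a hypothetical name for a working copy; this is not necessarily the exact query in `data_cleaning.sql`.

```sql
-- Assumed columns: company, location, industry, total_laid_off,
-- percentage_laid_off, date, stage, country, funds_raised_millions.
-- Adjust the list to match your imported table.

-- Work on a copy so the raw import stays untouched.
-- Rows that are identical in every column share a partition,
-- so the second and later copies get row_num 2, 3, ...
CREATE TABLE layoffs_staging AS
SELECT *,
       ROW_NUMBER() OVER (
           PARTITION BY company, location, industry, total_laid_off,
                        percentage_laid_off, `date`, stage, country,
                        funds_raised_millions
       ) AS row_num
FROM layoffs;

-- Preview what would be removed before deleting anything.
SELECT * FROM layoffs_staging WHERE row_num > 1;

-- Keep the first copy of each record and drop the rest.
DELETE FROM layoffs_staging WHERE row_num > 1;

-- The helper column is no longer needed.
ALTER TABLE layoffs_staging DROP COLUMN row_num;
```

If MySQL Workbench rejects the `DELETE` because safe update mode is on, running `SET SQL_SAFE_UPDATES = 0;` for the session is one way around it.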
### Step 2: Fix Data Problems
- Cleaned company names (trimmed leading and trailing spaces)
- Standardized industry names (merged variants of "Crypto" into a single label)
- Fixed country names (removed the trailing period from "United States.")
- Converted the date column from text to a proper `DATE` type
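The fixes above could look roughly like this in MySQL. It is a sketch under assumptions: `layoffs_staging` is a hypothetical working copy of the `layoffs` table, the column names are assumed, and the `'%m/%d/%Y'` pattern assumes the dates are stored as month/day/year text.

```sql
-- Remove leading and trailing spaces from company names.
UPDATE layoffs_staging
SET company = TRIM(company);

-- Collapse every label that starts with "Crypto" into one value.
UPDATE layoffs_staging
SET industry = 'Crypto'
WHERE industry LIKE 'Crypto%';

-- Strip a trailing period, e.g. "United States." becomes "United States".
UPDATE layoffs_staging
SET country = TRIM(TRAILING '.' FROM country)
WHERE country LIKE 'United States%';

-- Rewrite the text dates in ISO form (assumed input pattern: MM/DD/YYYY),
-- then change the column type so date functions work on it.
UPDATE layoffs_staging
SET `date` = STR_TO_DATE(`date`, '%m/%d/%Y');

ALTER TABLE layoffs_staging
MODIFY COLUMN `date` DATE;
```

Running `SELECT DISTINCT industry FROM layoffs_staging ORDER BY 1;` before and after is a quick way to check that the labels were merged as intended.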
### Step 3: Handle Missing Data
- Filled in missing industry info when possible
- Removed records that had no useful layoff numbers
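One way to do both steps is sketched below. It assumes a hypothetical working copy named `layoffs_staging`, assumed column names, and reads "no useful layoff numbers" as rows where both the head count and the percentage are missing.

```sql
-- Treat empty strings as missing so one NULL check covers both cases.
UPDATE layoffs_staging
SET industry = NULL
WHERE industry = '';

-- Self-join: copy the industry from another row of the same company
-- when this row has none.
UPDATE layoffs_staging AS t1
JOIN layoffs_staging AS t2
  ON t1.company = t2.company
SET t1.industry = t2.industry
WHERE t1.industry IS NULL
  AND t2.industry IS NOT NULL;

-- Drop rows that carry no layoff figures at all (interpretation, see above).
DELETE FROM layoffs_staging
WHERE total_laid_off IS NULL
  AND percentage_laid_off IS NULL;
```

Companies that appear only once with a missing industry stay `NULL`, since there is no other row to copy from.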
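## Key Findings
### Top 5 Companies with Most Layoffs
*Results screenshot: `images/top_companies.png`*
### Top 5 Locations with Most Layoffs
*Results screenshot: `images/top_locations.png`*
### Top 5 Industries with Most Layoffs
*Results screenshot: `images/top_industries.png`*
### Biggest Single Layoff Event
*Results screenshot: `images/max_layoffs.png`*
### Companies That Shut Down Completely (100% Layoffs)
*Results screenshot: `images/complete_shutdowns.png`*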
## SQL Skills Used
- **Data Cleaning**: Removing duplicates, fixing messy data
- **Window Functions**: ROW_NUMBER(), RANK(), SUM() OVER()
- **Joins**: Connecting tables to fill missing data
- **Date Functions**: Converting text to dates
- **Aggregation**: GROUP BY, SUM(), COUNT(), MAX()
- **CTEs**: Common Table Expressions for complex queries
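To show how a few of these fit together, here is a short sketch combining aggregation, CTEs, and window functions. It assumes MySQL 8.0 or later, a cleaned table with the hypothetical name `layoffs_staging`, assumed column names, and a `date` column already converted to `DATE`. These are illustrations, not the queries from `eda.sql`.

```sql
-- Aggregation: the five companies with the most layoffs overall.
SELECT company, SUM(total_laid_off) AS total_off
FROM layoffs_staging
GROUP BY company
ORDER BY total_off DESC
LIMIT 5;

-- CTE + SUM() OVER(): monthly layoffs with a running total.
WITH monthly AS (
    SELECT DATE_FORMAT(`date`, '%Y-%m') AS ym,
           SUM(total_laid_off)          AS laid_off
    FROM layoffs_staging
    WHERE `date` IS NOT NULL
    GROUP BY DATE_FORMAT(`date`, '%Y-%m')
)
SELECT ym,
       laid_off,
       SUM(laid_off) OVER (ORDER BY ym) AS rolling_total
FROM monthly
ORDER BY ym;

-- CTEs + RANK(): the top five companies within each year.
WITH company_year AS (
    SELECT company,
           YEAR(`date`)        AS yr,
           SUM(total_laid_off) AS laid_off
    FROM layoffs_staging
    WHERE `date` IS NOT NULL
    GROUP BY company, YEAR(`date`)
),
ranked AS (
    SELECT company, yr, laid_off,
           RANK() OVER (PARTITION BY yr ORDER BY laid_off DESC) AS rnk
    FROM company_year
)
SELECT company, yr, laid_off, rnk
FROM ranked
WHERE rnk <= 5
ORDER BY yr, rnk;
```

`RANK()` gives tied companies the same position and skips the next one; `DENSE_RANK()` is the alternative if gaps in the ranking are unwanted.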
## Project Files
```
layoffs-sql-analysis
├── README.md (This file)
├── data/
│   └── layoffs.csv (Original dataset)
├── sql/
│   ├── data_cleaning.sql (Cleaning queries)
│   └── eda.sql (Analysis queries)
└── images/
    ├── top_companies.png (Results screenshots)
    ├── top_locations.png
    ├── top_industries.png
    ├── max_layoffs.png
    └── complete_shutdowns.png
```
## How to Run This Project
### What You Need
- MySQL installed on your computer
- MySQL Workbench (makes it easier)
### Steps
1. **Download** the files from this repository
2. **Open** MySQL Workbench
3. **Create** a new database called `world_layoff`
4. **Import** the `layoffs.csv` file as a table called `layoffs`
5. **Run** the `data_cleaning.sql` file first
6. **Run** the `eda.sql` file second to see the analysis
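For anyone who prefers typing the setup rather than clicking through it, steps 3 to 6 can be sketched as the statements below. The database and table names come from the steps above; the sanity checks and the `SOURCE` lines are optional additions, not part of the repository's scripts.

```sql
-- Step 3: create the database and switch to it.
CREATE DATABASE IF NOT EXISTS world_layoff;
USE world_layoff;

-- Step 4 is done with the Workbench import wizard (layoffs.csv -> layoffs).
-- Afterwards, check that the import looks right before cleaning anything.
SELECT COUNT(*) AS imported_rows FROM layoffs;
SELECT * FROM layoffs LIMIT 10;

-- Steps 5 and 6, if you use the mysql command-line client instead of
-- Workbench (SOURCE is a client command, so it will not run in Workbench;
-- the paths assume you start the client from the repository root):
-- SOURCE sql/data_cleaning.sql;
-- SOURCE sql/eda.sql;
```

The row count should be in the range of the 2,300+ records mentioned in the data description; a much smaller number usually means the import skipped rows.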
## What I Learned
- How to clean messy real-world data
- Advanced SQL techniques for data analysis
- Finding business insights from raw data
- Documenting and presenting data projects
## Why This Project Matters
This project shows I can:
- Take messy data and make it clean and usable
- Write complex SQL queries to find insights
- Present findings in a clear, understandable way
- Work with real business data to solve problems
## Contact
**Sankaran S**
[GitHub](https://github.com/sankaran-s2001) | [LinkedIn](https://www.linkedin.com/in/sankaran-s21/) | [Email](mailto:sankaran121101@gmail.com)
***
*This project is part of my data science portfolio, showing my SQL skills and ability to work with real-world data.*