https://github.com/dipeshgoyal013/salary-data-analysis
Salary analysis by department and agency.
analysis matplotlib numpy pandas salary sklearn-library
- Host: GitHub
- URL: https://github.com/dipeshgoyal013/salary-data-analysis
- Owner: dipeshgoyal013
- Created: 2024-08-03T10:56:16.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-08-03T11:23:27.000Z (3 months ago)
- Last Synced: 2024-08-03T12:32:42.953Z (3 months ago)
- Topics: analysis, matplotlib, numpy, pandas, salary, sklearn-library
- Language: Jupyter Notebook
- Homepage:
- Size: 765 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Salary-Data-Analysis
### Tools Used
Python Version: 3.7
Packages: pandas, numpy, sklearn, matplotlib, seaborn

### Feature Engineering
I created a new column from the date column to support better analysis.
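
As a rough illustration of this step, the sketch below derives year and month columns from a date column with pandas. The column names (`hire_date`, `hire_year`, `hire_month`) and the sample rows are hypothetical; the notebook's actual schema and the exact date parts it extracts are not stated here.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions, not the notebook's schema.
df = pd.DataFrame({
    "hire_date": ["2015-03-14", "2018-11-02", "2020-07-30"],
    "salary": [52000.0, 61000.0, 58000.0],
})

# Parse the date strings into datetime values so the .dt accessor is available.
df["hire_date"] = pd.to_datetime(df["hire_date"])

# Derive new columns from the date column.
df["hire_year"] = df["hire_date"].dt.year
df["hire_month"] = df["hire_date"].dt.month

print(df)
```

Splitting a date into numeric parts like this makes it easier to group or plot salaries by period.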
![image](https://github.com/user-attachments/assets/e2c9a198-1a46-4f2b-8370-ecc1c7f326b6)

### EDA
I looked at the distributions of the data and the value counts for the various categorical variables. Below are a few highlights from the tables.
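
A minimal sketch of this kind of exploration is shown here, assuming a small made-up DataFrame with `department`, `agency`, and `salary` columns. These names and values are hypothetical, and the grouped mean is only one possible way to compare salaries across departments.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "department": ["Police", "Fire", "Police", "Health", "Police"],
    "agency": ["A", "B", "A", "C", "A"],
    "salary": [52000.0, 61000.0, 58000.0, 47000.0, 65000.0],
})

# Summary statistics describe the distribution of a numeric column.
salary_summary = df["salary"].describe()

# Value counts show how often each category appears.
department_counts = df["department"].value_counts()

# Mean salary per department (an illustrative aggregation).
mean_salary_by_department = df.groupby("department")["salary"].mean()

print(salary_summary)
print(department_counts)
print(mean_salary_by_department)
```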
![image](https://github.com/user-attachments/assets/3a7e3003-b21b-4f47-bf90-ef04aa610659)
![image](https://github.com/user-attachments/assets/b1167697-fc25-4c4f-b391-8fc9906405f5)

### Model Building
First, I transformed the categorical variables into dummy variables using an encoding technique. I also split the data into train and test sets with a test size of 25%. I tried Linear Regression models and evaluated them using Mean Squared Error.
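
The workflow above could look roughly like the following sketch. It runs on synthetic data, and the column names, the use of `pd.get_dummies` with `drop_first=True`, and the `random_state` value are all assumptions rather than the notebook's actual choices; only the 25% test size, Linear Regression, and Mean Squared Error come from the description.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only; not the repository's dataset.
rng = np.random.default_rng(0)
n = 40
df = pd.DataFrame({
    "department": rng.choice(["Police", "Fire", "Health"], size=n),
    "hire_year": rng.integers(2000, 2021, size=n),
})
base = df["department"].map({"Police": 55000, "Fire": 60000, "Health": 50000})
df["salary"] = base + 500 * (2020 - df["hire_year"]) + rng.normal(0, 1000, size=n)

# Encode the categorical column as dummy variables (assumed one-hot encoding).
X = pd.get_dummies(
    df[["department", "hire_year"]],
    columns=["department"],
    drop_first=True,
    dtype=float,
)
y = df["salary"]

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit a linear regression and score it with Mean Squared Error on the test set.
model = LinearRegression()
model.fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))

print(f"Test MSE: {mse:.2f}")
```

Because MSE is in squared salary units, taking its square root gives an error on the same scale as salary, which is often easier to interpret.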