https://github.com/usk2003/income-testing-hypothesis

This repository analyzes data scientists' income using t-tests, providing insights into salary distributions and company ratings. Key features include data cleaning, statistical analysis, visualizations, and company suggestions for freshers. Practical advice helps guide career decisions effectively!
hypothesis-testing matplotlib pandas python statistical-analysis t-test


# 🚀 Data Scientist Income T-Test Hypothesis Analysis

This project tests hypotheses about data scientists' income using t-tests. The dataset contains salary information from various companies, and the goal is to give freshers actionable insights when they look for job opportunities. 💼

## 📊 Project Overview

1. **Data Cleaning** 🧹
   - Cleaning salary columns (`Average`, `Lowest`, `Highest`).
   - Filtering data based on frequency `/yr`.
   - Removing outliers using the IQR method.

2. **Statistical Analysis** 🧮
   - Calculation of population and sample statistics (mean, standard deviation).
   - Hypothesis testing:
     - Two-tailed t-test.
     - One-tailed t-tests (greater/less).

3. **Visualizations** 📈
   - Normal distribution plots for population and sample salaries.
   - Scatter plots: Rating vs. Average Salary for population and sample.

4. **Company Suggestions** 🏢
   - Suggesting companies based on user-specified expected salary.

5. **Conclusions and Practical Advice** 📝
   - Insights derived from hypothesis testing.
   - Tips for freshers choosing companies based on salary and ratings.
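
The data cleaning step above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes each salary cell is a string such as `$95,000/yr` with the pay frequency as a suffix, and the helper names `clean_salaries` and `remove_iqr_outliers` are hypothetical. Only the column names `Average`, `Lowest`, and `Highest` come from this README.

```python
import pandas as pd

SALARY_COLUMNS = ["Average", "Lowest", "Highest"]


def clean_salaries(df: pd.DataFrame) -> pd.DataFrame:
    """Keep yearly rows and convert the salary columns to numbers."""
    # Assumption: salary cells look like "$95,000/yr" or "$8,000/mo".
    yearly = df[df["Average"].str.endswith("/yr", na=False)].copy()
    for col in SALARY_COLUMNS:
        # Strip everything except digits and the decimal point, then parse.
        digits = yearly[col].str.replace(r"[^\d.]", "", regex=True)
        yearly[col] = pd.to_numeric(digits, errors="coerce")
    # Rows that could not be parsed are dropped.
    return yearly.dropna(subset=SALARY_COLUMNS)


def remove_iqr_outliers(df: pd.DataFrame, column: str = "Average") -> pd.DataFrame:
    """Drop rows outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR] for one column."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    return df[df[column].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]
```

Calling `remove_iqr_outliers(clean_salaries(raw_df))` would then give a yearly, numeric, outlier-trimmed table for the later steps. The 1.5 multiplier is the conventional IQR fence; the project may use a different threshold.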

## 💻 Prerequisites

- Python 3.12
- Libraries: pandas, numpy, matplotlib, seaborn, scipy

## 🌟 Key Features

- Data cleaning and preprocessing.
- Statistical analysis using t-tests.
- Visualizations for clear data interpretation.
- Company recommendations based on salary expectations.

## 📊 Visualizations

### 1. Normal Distribution Plot
A comparison of the population and sample average salary distributions.
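
One way such a comparison could be drawn is to fit a normal curve to each group from its mean and standard deviation. The sketch below assumes the salaries are already numeric; the function name `plot_salary_distributions` is hypothetical, and the project's own plot (which may use seaborn) can differ.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats


def plot_salary_distributions(population, sample):
    """Overlay fitted normal curves for population and sample salaries."""
    fig, ax = plt.subplots(figsize=(8, 4))
    for label, values in (("Population", population), ("Sample", sample)):
        values = np.asarray(values, dtype=float)
        mean = values.mean()
        # Population standard deviation (ddof=0) is used for both curves
        # to keep the sketch simple; ddof=1 is the usual sample estimate.
        std = values.std()
        x = np.linspace(mean - 4 * std, mean + 4 * std, 200)
        ax.plot(x, stats.norm.pdf(x, loc=mean, scale=std),
                label=f"{label} (mean={mean:,.0f})")
        # Dashed vertical line marking each group's mean.
        ax.axvline(mean, linestyle="--", alpha=0.4)
    ax.set_xlabel("Average salary")
    ax.set_ylabel("Density")
    ax.legend()
    return fig
```

Call `plt.show()` or `fig.savefig(...)` on the returned figure to view or save it.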

### 2. Scatter Plot: Rating vs. Average Salary
Insights into how company ratings relate to salaries.
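
A possible sketch of this plot, with the population and sample side by side and the Pearson correlation shown in each title. It assumes numeric `Rating` and `Average` columns; the function name `plot_rating_vs_salary` is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_rating_vs_salary(population: pd.DataFrame, sample: pd.DataFrame):
    """Scatter Rating against Average salary for population and sample."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
    groups = (("Population", population), ("Sample", sample))
    for ax, (title, data) in zip(axes, groups):
        ax.scatter(data["Rating"], data["Average"], alpha=0.6)
        # Pearson correlation between rating and salary for this group.
        r = data["Rating"].corr(data["Average"])
        ax.set_title(f"{title} (r = {r:.2f})")
        ax.set_xlabel("Rating")
    axes[0].set_ylabel("Average salary")
    return fig
```

A correlation near zero would suggest that a high rating alone says little about pay, which is worth keeping in mind when reading the plot.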

## 🏆 Results and Conclusions

- Statistical tests reveal whether sample salaries significantly differ from the population mean.
- Practical advice is provided for freshers based on salary expectations and company ratings.
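
The tests behind these results could look roughly like the sketch below. It assumes a one-sample t-test of the sample's average salaries against the population mean, which matches the description above but is not confirmed by this README. The function name `run_t_tests` and the default significance level `alpha=0.05` are illustrative choices.

```python
import numpy as np
from scipy import stats


def run_t_tests(sample, population_mean, alpha=0.05):
    """Run two-tailed and one-tailed one-sample t-tests against a mean."""
    sample = np.asarray(sample, dtype=float)
    results = {}
    # "two-sided": sample mean differs from the population mean.
    # "greater" / "less": sample mean is above / below the population mean.
    for alternative in ("two-sided", "greater", "less"):
        t_stat, p_value = stats.ttest_1samp(
            sample, popmean=population_mean, alternative=alternative
        )
        results[alternative] = {
            "t": float(t_stat),
            "p": float(p_value),
            "reject_h0": bool(p_value < alpha),
        }
    return results
```

Each entry reports the t statistic, the p-value, and whether the null hypothesis (sample mean equals population mean) is rejected at the chosen `alpha`.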

## 💡 Suggested Companies for Freshers

Enter your expected salary when the script runs to get a list of recommended companies that meet your salary criteria. 💰
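
The recommendation step might be as simple as a filter and a sort. The sketch below is an assumption about how it could work, not the script's actual logic: it keeps companies whose `Average` salary meets the expectation and ranks them by `Rating`. The function name `suggest_companies`, the `top_n` parameter, and the `Company` column name are hypothetical.

```python
import pandas as pd


def suggest_companies(df: pd.DataFrame, expected_salary: float,
                      top_n: int = 10) -> pd.DataFrame:
    """Return companies whose average salary meets the expectation."""
    # Keep rows at or above the expected salary.
    matches = df[df["Average"] >= expected_salary]
    # Best-rated first; ties are broken by the higher average salary.
    ranked = matches.sort_values(["Rating", "Average"], ascending=False)
    return ranked.head(top_n)
```

In an interactive run this could be called as `suggest_companies(df, float(input("Expected salary: ")))`.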

## 📬 Contact

For any questions or issues, feel free to reach out via email: [[email protected]] ✉️