https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.
data-analysis data-science predictive-analytics presentation-slides


# **Data Science Virtual Internship By ANZ**

Repository for all the code and reports for Data Analytics Virtual Internship Program at ANZ.

# Project Details

## Project:

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

## Dataset Description:

The dataset provided is a synthesised transaction dataset containing 3 months’ worth of transactions for 100 hypothetical customers. It contains purchases, recurring transactions, and salary transactions.

The dataset is designed to simulate realistic transaction behaviours observed in ANZ’s real transaction data, so many of the insights gathered from it should be genuine.
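Because the data spans only 3 months, an annual salary figure has to be derived from the salary transactions. The snippet below is a minimal sketch of one way to do that, not necessarily the method used in this repository. The column names (`customer_id`, `txn_description`, `amount`) and the `"PAY/SALARY"` label are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical miniature of the transaction table; the column names and the
# "PAY/SALARY" label are assumptions, not taken from the ANZ file.
transactions = pd.DataFrame({
    "customer_id": ["C1", "C1", "C1", "C2", "C2", "C2", "C2"],
    "txn_description": ["PAY/SALARY", "POS", "PAY/SALARY",
                        "PAY/SALARY", "PAY/SALARY", "PAY/SALARY", "POS"],
    "amount": [3000.0, 25.5, 3000.0, 1500.0, 1500.0, 1500.0, 12.0],
})

# Keep only salary payments and total them per customer.
salary = transactions[transactions["txn_description"] == "PAY/SALARY"]
salary_3_months = salary.groupby("customer_id")["amount"].sum()

# The data covers 3 months, so multiply by 4 to approximate a full year.
annual_salary = salary_3_months * 4
print(annual_salary)
```

Scaling by 4 assumes salary payments are spread evenly over the year, which may not hold for customers who changed jobs during the period.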

## Tools used:

**For data wrangling and visualization:** NumPy, Pandas, Matplotlib, Seaborn

**For predictive analytics:** scikit-learn

**For reporting:** Google Slides

## Tasks:

**Task 1:** Segmenting the dataset and drawing unique insights, including visualisation of the transaction volume and assessing the effect of any outliers.
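As a rough illustration of Task 1, the sketch below counts transactions per day and flags outlying amounts with the 1.5 × IQR rule. The data is made up, `date` and `amount` are assumed column names, and the IQR rule is one common choice rather than the approach confirmed for this project.

```python
import pandas as pd

# Hypothetical sample; "date" and "amount" are assumed column names.
transactions = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-08-01", "2018-08-01", "2018-08-02",
        "2018-08-02", "2018-08-02", "2018-08-03",
    ]),
    "amount": [20.0, 35.0, 15.0, 40.0, 25.0, 5000.0],
})

# Transaction volume: number of transactions per day.
daily_volume = transactions.groupby("date").size()

# Flag outliers with the 1.5 * IQR rule on the amount column.
q1 = transactions["amount"].quantile(0.25)
q3 = transactions["amount"].quantile(0.75)
iqr = q3 - q1
is_outlier = (
    (transactions["amount"] < q1 - 1.5 * iqr)
    | (transactions["amount"] > q3 + 1.5 * iqr)
)

# Compare the mean with and without outliers to gauge their effect.
mean_all = transactions["amount"].mean()
mean_trimmed = transactions.loc[~is_outlier, "amount"].mean()
print(daily_volume)
print(f"mean with outliers: {mean_all:.2f}, without: {mean_trimmed:.2f}")
```

In a notebook, `daily_volume.plot()` would draw the volume over time with Matplotlib.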

**Task 2:** Exploring correlations between customer attributes, building a regression and a decision-tree prediction model based on your findings.
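Task 2 can be sketched with the scikit-learn classes listed in the Installation section. The example below runs on synthetic data. The feature names (`age`, `avg_balance`, `avg_purchase`) and the salary formula are invented for illustration and do not reflect the ANZ dataset or the results in this repository.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic customer table; feature names and the salary formula are made up.
rng = np.random.default_rng(0)
n = 100
customers = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "avg_balance": rng.normal(10_000, 3_000, size=n),
    "avg_purchase": rng.normal(50, 15, size=n),
})
customers["annual_salary"] = (
    30_000
    + 500 * customers["age"]
    + 2 * customers["avg_balance"]
    + rng.normal(0, 2_000, size=n)
)

# Correlation of each attribute with the target.
correlations = customers.corr()["annual_salary"].drop("annual_salary")
print(correlations)

# Hold out 20% of customers for evaluation.
X = customers.drop(columns="annual_salary")
y = customers["annual_salary"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit both model types and score them on the held-out set.
models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=4, random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, predictions))
    scores[name] = {"rmse": rmse, "r2": r2_score(y_test, predictions)}
print(scores)
```

With only 100 customers, a single train/test split gives noisy scores, so cross-validation may be a better basis for comparing the two models.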

## Authors

- [@Akash Kumar Jha](https://github.com/Akash1070)

## Deployment

1. Importing Necessary Libraries
2. Load All Datasets
3. Data Cleaning
4. Data Analysis
5. Predictive Analysis
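Steps 2 and 3 above might look like the following sketch. It reads an in-memory CSV as a stand-in for the real file, and the column names and cleaning rules are assumptions rather than the ones used in this repository.

```python
import io

import pandas as pd

# Stand-in for the real file; the column names here are hypothetical.
raw_csv = io.StringIO(
    "date,customer_id,amount,merchant_code\n"
    "2018-08-01,C1,20.0,\n"
    "2018-08-01,C1,20.0,\n"
    "2018-08-02,C2,,\n"
    "2018-08-03,C2,35.5,\n"
)

# Step 2: load the data.
df = pd.read_csv(raw_csv)

# Step 3: clean it.
df["date"] = pd.to_datetime(df["date"])   # parse date strings
df = df.loc[:, df.isna().mean() < 0.5]    # drop columns that are mostly empty
df = df.dropna(subset=["amount"])         # drop rows missing the amount
df = df.drop_duplicates().reset_index(drop=True)  # remove exact duplicates
print(df)
```

The 50% threshold for dropping sparse columns is arbitrary and would need tuning against the real data.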

## Installation

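To install the libraries used in this project, run `pip install numpy pandas matplotlib seaborn scikit-learn`. Then import them at the top of the Jupyter notebook:

```python
# Data wrangling
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
# IPython magic: renders plots inline; valid only in Jupyter/IPython
%matplotlib inline
plt.style.use('ggplot')

# Silence library warnings to keep notebook output readable
import warnings
warnings.filterwarnings('ignore')

# Predictive analytics
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
```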

## Running Flask API

To start the Flask API, run the following command:

```bash
python app.py
```

## 🚀 About Me

Data Science Enthusiast | Petroleum Engineering Graduate | Solving Problems Using Data

# Hi, I'm Akash! 👋

## 🔗 Links
[![github](https://img.shields.io/badge/github-000?style=for-the-badge&logo=ko-fi&logoColor=white)](https://github.com/Akash1070)
[![linkedin](https://img.shields.io/badge/linkedin-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/akashkumar107/)

## Tech Stack

![Logo](https://businesstoys.in/assets/programs/full-stack-data-science-professional-program/tools.png)
## Other Me
👩‍💻 I’m interested in Petroleum Engineering

🧠 I’m currently learning Data Science | Data Analytics | Business Analytics

👯‍♀️ I’m looking to collaborate on Ideas & Data

## 🛠 Skills
1. Data Scientist
2. Data Analyst
3. Business Analyst
4. Machine Learning

## Future Plans

⚡️ Looking forward to helping drive innovation in your company as a Data Scientist

⚡️ Looking forward to offering more than I take and leaving the place better than I found it