https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.
data-analysis data-science predictive-analytics presentation-slides

# **Data Science Virtual Internship By ANZ**

Repository for all the code and reports for Data Analytics Virtual Internship Program at ANZ.

# Project Details

## Project:

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

## Dataset Description:

The dataset provided is a synthesised transaction dataset containing 3 months’ worth of transactions for 100 hypothetical customers. It contains purchases, recurring transactions, and salary transactions.

The dataset is designed to simulate realistic transaction behaviours that are observed in ANZ’s real transaction data, so many of the insights gathered from it should be genuine.
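Because the data covers only 3 months, an annual salary figure has to be extrapolated from the salary transactions. The sketch below shows one simple way to do that: sum each customer's salary payments and scale the 3-month window to 12 months. The column names (`customer_id`, `txn_type`, `amount`), the `SALARY` label, and the toy values are illustrative assumptions, not the actual ANZ schema, and the helper `estimate_annual_salary` is hypothetical.

```python
import pandas as pd

# Hypothetical toy data: column names and values are assumptions for illustration,
# not the real ANZ dataset.
transactions = pd.DataFrame(
    {
        "customer_id": ["C1", "C1", "C1", "C2", "C2", "C2", "C1", "C2"],
        "txn_type": ["SALARY"] * 6 + ["PURCHASE"] * 2,
        "amount": [3000.0, 3000.0, 3000.0, 4500.0, 4500.0, 4500.0, 42.5, 19.9],
    }
)


def estimate_annual_salary(df: pd.DataFrame, months_observed: int = 3) -> pd.Series:
    """Sum salary payments per customer and scale the observed window to 12 months."""
    salary = df[df["txn_type"] == "SALARY"]
    total_observed = salary.groupby("customer_id")["amount"].sum()
    return (total_observed * (12 / months_observed)).rename("annual_salary")


annual_salary = estimate_annual_salary(transactions)
print(annual_salary)
```

This simple scaling assumes salary is paid at a steady rate across the observed window; customers with irregular pay would need a more careful treatment.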

## Tools used:

**For data wrangling and visualization:** NumPy, Pandas, Matplotlib, Seaborn

**For predictive analytics:** scikit-learn

**For reporting:** Google Slides

## Tasks:

**Task 1:** Segmenting the dataset and drawing unique insights, including visualisation of the transaction volume and assessing the effect of any outliers.
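As a rough illustration of Task 1, the sketch below computes daily transaction volume and flags outliers with the common 1.5 × IQR rule, then compares the mean amount with and without the flagged rows. The data is randomly generated, the column names (`date`, `amount`) are assumptions, and the IQR rule is only one possible choice; the source does not say which outlier method the project uses.

```python
import numpy as np
import pandas as pd

# Hypothetical synthetic data standing in for the real transactions.
rng = np.random.default_rng(0)
n_rows = 500
day_offsets = pd.to_timedelta(rng.integers(0, 30, size=n_rows), unit="D")
transactions = pd.DataFrame(
    {
        "date": pd.Timestamp("2024-01-01") + day_offsets,
        "amount": rng.lognormal(mean=3.0, sigma=0.5, size=n_rows),
    }
)
# Inject a few extreme values so there is something to detect.
transactions.loc[[0, 1, 2], "amount"] = [5000.0, 7500.0, 10000.0]

# Transaction volume per day: number of transactions and total amount.
daily_volume = transactions.groupby("date")["amount"].agg(["count", "sum"])


def flag_iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


is_outlier = flag_iqr_outliers(transactions["amount"])
mean_all = transactions["amount"].mean()
mean_trimmed = transactions.loc[~is_outlier, "amount"].mean()
print(daily_volume.head())
print(f"outliers flagged: {int(is_outlier.sum())}")
print(f"mean amount with outliers: {mean_all:.2f}, without: {mean_trimmed:.2f}")
```

Comparing the two means is a quick way to gauge how strongly a handful of large transactions distorts summary statistics.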

**Task 2:** Exploring correlations between customer attributes and building a regression model and a decision-tree prediction model based on the findings.
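A minimal sketch of Task 2, using the scikit-learn classes this README imports. It checks the correlation of each feature with the target, then fits a linear regression and a decision tree and compares them on a held-out split. The customer attributes (`age`, `avg_balance`, `avg_purchase`), the synthetic relationship to `annual_salary`, and the hyperparameters are all assumptions made for illustration; the project's real features may differ.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical per-customer features; the real attributes may differ.
rng = np.random.default_rng(42)
n_customers = 100
customers = pd.DataFrame(
    {
        "age": rng.integers(18, 65, size=n_customers),
        "avg_balance": rng.normal(5000, 1500, size=n_customers),
        "avg_purchase": rng.normal(40, 10, size=n_customers),
    }
)
# Synthetic target built from the features plus noise (an assumption, not real data).
customers["annual_salary"] = (
    20000
    + 6 * customers["avg_balance"]
    + 300 * customers["avg_purchase"]
    + rng.normal(0, 3000, size=n_customers)
)

# Correlation of each attribute with the target.
correlations = customers.corr()["annual_salary"].drop("annual_salary")
print(correlations)

X = customers.drop(columns="annual_salary")
y = customers["annual_salary"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=4, random_state=42),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
    results[name] = {"rmse": rmse, "r2": float(r2_score(y_test, predictions))}
    print(f"{name}: RMSE={rmse:.0f}, R2={results[name]['r2']:.3f}")
```

With only 100 customers, the held-out set is small, so scores from a single split are noisy; cross-validation would give a steadier comparison.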

## Authors

- [@Akash Kumar Jha](https://github.com/Akash1070)

## Deployment

1. Import the necessary libraries
2. Load all datasets
3. Clean the data
4. Analyse the data
5. Build the predictive models

## Installation

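To install the libraries used in this project, run `pip install numpy pandas matplotlib seaborn scikit-learn`, then import them at the top of the notebook:

```python
# Data wrangling
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter magic: render plots inline (works only inside a notebook, not in a plain .py script)
%matplotlib inline
plt.style.use('ggplot')

# Silence library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

# Predictive analytics
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
```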

## Running Flask API

To start the Flask app, run the following command:

```bash
python app.py
```

## 🚀 About Me

Data Science Enthusiast | Petroleum Engineering Graduate | Solving Problems Using Data

# Hi, I'm Akash! 👋

## 🔗 Links
[![github](https://img.shields.io/badge/github-000?style=for-the-badge&logo=ko-fi&logoColor=white)](https://github.com/Akash1070)
[![linkedin](https://img.shields.io/badge/linkedin-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/akashkumar107/)

## Tech Stack

![Logo](https://businesstoys.in/assets/programs/full-stack-data-science-professional-program/tools.png)

## Other Me
👩‍💻 I’m interested in Petroleum Engineering

🧠 I’m currently learning Data Science | Data Analytics | Business Analytics

👯‍♀️ I’m looking to collaborate on Ideas & Data

## 🛠 Skills
1. Data Scientist
2. Data Analyst
3. Business Analyst
4. Machine Learning

## Future Plans

⚡️ Looking forward to helping drive innovation at your company as a Data Scientist

⚡️ Looking forward to offering more than I take and leaving the place better than I found it