https://github.com/hugo-hattori/customer_profile_analysis
Data Analysis Project.
Last synced: 13 days ago
- Host: GitHub
- URL: https://github.com/hugo-hattori/customer_profile_analysis
- Owner: Hugo-Hattori
- License: mit
- Created: 2023-06-21T20:41:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-06T22:00:08.000Z (about 1 year ago)
- Last Synced: 2024-11-07T14:32:03.947Z (2 months ago)
- Topics: data-analysis, data-analysis-python, data-analytics, jupyter, jupyter-notebook, pandas, pandas-dataframe, pandas-python, plotly, plotly-express, plotly-io, python
- Language: Jupyter Notebook
- Homepage:
- Size: 37.1 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Customer Profile Analysis
This project's goal is to increase a company's revenue by identifying the
Ideal Customer Profile (ICP), also known as the most valuable customer for the company.

To this end, each client in the database was given a score from
1 to 100, with 100 being the most valuable client and 1 the least valuable.

### Packages used:
+ pandas
+ plotly.express
+ plotly.io

## Importing the Database
First we need to import the database, then visualize and process the data using
the pandas package. In this scenario the .csv file contains special
characters and is separated by semicolons instead of commas, so the keyword
arguments "encoding" and "sep" must be set to non-default values. Also, the dataframe
contains a column with empty values, so we need to drop it.

https://github.com/Hugo-Hattori/Customer_Profile_Analysis/blob/c4ff758575da7633a1ee8035707a4df652a561c3/Customer_Profile_Analysis.py#L3-L7
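
The linked lines are not reproduced here, so the following is only a minimal sketch of the import step described above. It assumes a Latin-1 encoding and uses a small in-memory stand-in for "Clientes.csv". The "load_clients" helper, the sample rows and the "Nota (1-100)" score column name are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for "Clientes.csv": semicolon-separated, Latin-1
# encoded, with a trailing ";" on every row that produces an empty column.
csv_bytes = (
    "Idade;Profissão;Salário Anual (R$);Nota (1-100);\n"
    "19;Artista;15000;39;\n"
    "21;Engenheiro;35000;81;\n"
).encode("latin1")


def load_clients(source, encoding="latin1", sep=";"):
    """Read the client table and drop every column that is entirely empty."""
    df = pd.read_csv(source, encoding=encoding, sep=sep)
    return df.dropna(axis=1, how="all")


clientes = load_clients(io.BytesIO(csv_bytes))
print(clientes.columns.tolist())
```

With the real file, the path to "Clientes.csv" would be passed in place of the in-memory buffer, and the encoding would need to match how the file was saved.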
## Data Processing
Using the DataFrame.info() method we can observe two major problems:
- The column "Salário Anual (R$)" has dtype object instead of int64;
- There are 35 entries where the "Profissão" value is null, so these
rows are not very useful.

https://github.com/Hugo-Hattori/Customer_Profile_Analysis/blob/c4ff758575da7633a1ee8035707a4df652a561c3/Customer_Profile_Analysis.py#L12-L15
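
One possible way to address both problems is sketched below. This is an illustration under assumptions, not necessarily the code at the link above: the sample rows are made up, and pd.to_numeric is just one option for the dtype conversion.

```python
import pandas as pd

# Hypothetical sample mirroring the two problems: salaries stored as text
# and a missing "Profissão" value.
clientes = pd.DataFrame(
    {
        "Profissão": ["Artista", None, "Engenheiro"],
        "Salário Anual (R$)": ["15000", "35000", "86000"],
    }
)

# Convert the salary column to a numeric dtype; errors="coerce" turns any
# unparseable text into NaN instead of raising an exception.
clientes["Salário Anual (R$)"] = pd.to_numeric(
    clientes["Salário Anual (R$)"], errors="coerce"
)

# Drop the rows where the profession is missing.
clientes = clientes.dropna(subset=["Profissão"])

# Confirm the new dtypes and non-null counts.
clientes.info()
```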
## Data Analysis
By using the DataFrame.describe() method we can see that the average
score achieved is around 52, so this will be our main benchmark.

![image](https://github.com/Hugo-Hattori/Customer_Profile_Analysis/assets/136493140/6ba3b76d-cb23-4998-bbad-17d4cb81821a)
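
As a rough sketch of how the benchmark can be read from the summary table, the example below uses made-up scores and a hypothetical "Nota (1-100)" column name, so its mean differs from the value of about 52 reported for the real data.

```python
import pandas as pd

# Hypothetical scores; the real values come from "Clientes.csv".
clientes = pd.DataFrame({"Nota (1-100)": [39, 81, 6, 77, 40]})

# describe() reports count, mean, std, min, quartiles and max per numeric column.
summary = clientes.describe()

# The mean score serves as the benchmark for comparing groups of clients.
mean_score = summary.loc["mean", "Nota (1-100)"]

# Clients scoring above the benchmark.
above_benchmark = clientes[clientes["Nota (1-100)"] > mean_score]

print(summary)
print(above_benchmark)
```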
Using the histogram() function from the plotly.express package we can perform a
graphical analysis, comparing the Score with the other parameters such as
Age (Idade) or Yearly Income (Salário Anual).

![Captura de tela 2023-06-21 204934](https://github.com/Hugo-Hattori/Customer_Profile_Analysis/assets/136493140/f6ca3094-538c-4123-ae7c-bef6e65ef7d9)
![Captura de tela 2023-06-21 205007](https://github.com/Hugo-Hattori/Customer_Profile_Analysis/assets/136493140/00240d36-acf2-451a-b597-88dc1849d24e)
![Captura de tela 2023-06-21 205039](https://github.com/Hugo-Hattori/Customer_Profile_Analysis/assets/136493140/fbac0d3e-24d2-4f9b-bc0f-aa28d1766d2f)
![Captura de tela 2023-06-21 205057](https://github.com/Hugo-Hattori/Customer_Profile_Analysis/assets/136493140/6e3784a4-381e-4854-bb11-2e0c6735d51b)

## Conclusion
Analysing the Age X Score, Profession X Score, Work Experience X Score
and Family Size X Score charts, we can conclude that the ICP is above
15 years old, works in the Entertainment Industry or is an Artist, has between
10 and 15 years of work experience, and has a family size no larger than 7.

Note: this project was developed for academic purposes, so the
data contained in "Clientes.csv" is fictitious and is used only to learn how to apply the pandas and
Plotly packages.