https://github.com/arnabd64/amazon-bestselling-books

data-science eda jupyter-notebook matplotlib-pyplot python3


Host: GitHub
URL: https://github.com/arnabd64/amazon-bestselling-books
Owner: arnabd64
Archived: true
Created: 2021-06-12T07:52:24.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2021-06-14T04:05:13.000Z (over 4 years ago)
Last Synced: 2025-03-04T17:50:31.766Z (9 months ago)
Topics: data-science, eda, jupyter-notebook, matplotlib-pyplot, python3
Language: Jupyter Notebook
Homepage:
Size: 582 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Amazon Top 50 best selling books - Exploratory Data Analysis

The dataset was obtained from [Kaggle](https://kaggle.com). To import it into a Jupyter notebook or another environment, use the raw GitHub link: https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv

### For importing in Python

```python
import pandas as pd

# Read the CSV directly from the repository's raw URL
df = pd.read_csv('https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv')
```

### For importing in R

```r
df <- read.csv('https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv')
```

## About

This dataset contains the top 50 best-selling books on Amazon.com for each year from 2009 to 2019, which gives 11 × 50 = 550 entries. There are __no missing values__ and __no duplicate entries__ in the dataset. It has 7 variables in total:

| Variable    | Data Type | Description                                   |
|:----------- |:--------- | --------------------------------------------- |
| Name        | string    | Name of the Book                              |
| Author      | string    | Name of the Author                            |
| User Rating | float     | Average rating out of 5                       |
| Reviews     | integer   | Total number of reviews for the book          |
| Price       | integer   | Price of the book in USD                      |
| Year        | integer   | Year in which it was on Best Selling category |
| Genre       | string    | Genre of the Book (Fiction or Non-Fiction)    |
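The properties described above (550 rows, 7 variables, no missing values, no duplicates) can be checked after loading. The following is a minimal sketch, not part of the original notebook; the helper name `summarize_quality` is hypothetical, and the column names are assumed to match the table exactly.

```python
import pandas as pd

# Column names as listed in the variable table (assumed to match the CSV header).
EXPECTED_COLUMNS = ["Name", "Author", "User Rating", "Reviews", "Price", "Year", "Genre"]


def summarize_quality(df: pd.DataFrame) -> dict:
    """Return basic data-quality counts for the books DataFrame (hypothetical helper)."""
    return {
        # Total number of rows; the README states 550 for the full dataset.
        "rows": len(df),
        # Expected columns that are absent from the DataFrame.
        "missing_columns": [c for c in EXPECTED_COLUMNS if c not in df.columns],
        # Total count of NaN/None cells across all columns.
        "missing_values": int(df.isna().sum().sum()),
        # Rows that are exact copies of an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
    }
```

Calling `summarize_quality(df)` on the DataFrame loaded from `books.csv` should report an empty `missing_columns` list and zero missing values and duplicate rows if the description above holds.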

My objective is to perform __Exploratory Data Analysis__ using basic, widely available packages. In Python, I used the `pandas`, `matplotlib` and `seaborn` packages. All the work was done in a Jupyter Notebook with Python 3.9. The output is a `.ipynb` file, which can be viewed directly on GitHub.
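As an illustration of the kind of exploratory step this involves, here is a short sketch that counts best sellers per year and genre and draws a stacked bar chart. It is not taken from the notebook; the function names `genre_counts_by_year` and `plot_genre_counts` are hypothetical, and it assumes the `Year` and `Genre` columns described in the table.

```python
import matplotlib.pyplot as plt
import pandas as pd


def genre_counts_by_year(df: pd.DataFrame) -> pd.DataFrame:
    """Count books per Year and Genre (rows: Year, columns: Genre)."""
    return df.groupby(["Year", "Genre"]).size().unstack(fill_value=0)


def plot_genre_counts(df: pd.DataFrame):
    """Draw a stacked bar chart of genre counts per year and return the Axes."""
    counts = genre_counts_by_year(df)
    ax = counts.plot(kind="bar", stacked=True, figsize=(8, 4))
    ax.set_xlabel("Year")
    ax.set_ylabel("Number of books")
    ax.set_title("Best sellers per year by genre")
    return ax
```

In a notebook, `plot_genre_counts(df)` followed by `plt.show()` displays the chart. Since each year has 50 entries, every bar in the full dataset should reach 50.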
