https://github.com/arnabd64/amazon-bestselling-books
- Host: GitHub
- URL: https://github.com/arnabd64/amazon-bestselling-books
- Owner: arnabd64
- Archived: true
- Created: 2021-06-12T07:52:24.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-14T04:05:13.000Z (over 4 years ago)
- Last Synced: 2025-03-04T17:50:31.766Z (9 months ago)
- Topics: data-science, eda, jupyter-notebook, matplotlib-pyplot, python3
- Language: Jupyter Notebook
- Homepage:
- Size: 582 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Amazon Top 50 Best-Selling Books - Exploratory Data Analysis
The dataset was obtained from [Kaggle](https://kaggle.com). To import it into a Jupyter notebook or another environment, use the raw GitHub link: https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv
### For importing in Python
```python
import pandas
df = pandas.read_csv('https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv')
```
### For importing in R
```r
df <- read.csv('https://raw.githubusercontent.com/arnabd64/amazon-bestselling-books/main/books.csv')
```
## About
This dataset contains the top 50 best-selling books on Amazon.com for each year from 2009 to 2019, which gives 550 entries (50 books × 11 years). The dataset has __no missing data__ and __no duplicate entries__. There are 7 variables in total:
| Variable | Data Type | Description |
|:----------- |:--------- | --------------------------------------------- |
| Name | string | Name of the Book |
| Author | string | Name of the Author |
| User Rating | float | Average rating out of 5 |
| Reviews | integer | Total number of reviews for the book |
| Price | integer | Price of the book in USD |
| Year | integer | Year in which the book was a bestseller |
| Genre | string | Genre of the Book (Fiction or Non-Fiction) |
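The properties described above (row count, 7 columns, no missing values, no duplicate rows) can be checked with a few pandas calls. The following is a minimal sketch: the helper name `check_dataset` and the three-row `sample` frame are hypothetical and only stand in for the real `books.csv`, whose column names are assumed to match the table.

```python
import pandas as pd


def check_dataset(df: pd.DataFrame) -> dict:
    """Return basic integrity facts about a DataFrame."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        # Total number of missing cells across all columns
        "missing_values": int(df.isna().sum().sum()),
        # Number of rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Hypothetical sample rows, NOT taken from books.csv
sample = pd.DataFrame({
    "Name": ["Book A", "Book B", "Book C"],
    "Author": ["Author X", "Author Y", "Author X"],
    "User Rating": [4.7, 4.6, 4.8],
    "Reviews": [1200, 350, 9800],
    "Price": [8, 22, 15],
    "Year": [2009, 2009, 2010],
    "Genre": ["Fiction", "Non-Fiction", "Fiction"],
})

report = check_dataset(sample)
print(report)
```

Running `check_dataset(df)` on the real dataset should report 550 rows, 7 columns, and zero missing values and duplicate rows if the description above holds.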
My objective is to perform __Exploratory Data Analysis__ using basic, widely available packages. In Python, I used the `pandas`, `matplotlib` and `seaborn` packages. All the work was done in a Jupyter Notebook with Python 3.9. The output file is an `.ipynb` notebook, which can be viewed directly on GitHub.
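As an illustration of the kind of exploratory step this involves, here is a rough sketch that summarizes ratings and prices per genre and plots the number of bestsellers per year. It is not taken from the notebook: the four-row `sample` frame is hypothetical, the column names are assumed to match the table above, and it uses only `pandas` and `matplotlib` (leaving out `seaborn`) to stay short.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows, NOT taken from books.csv
sample = pd.DataFrame({
    "Name": ["Book A", "Book B", "Book C", "Book D"],
    "Author": ["Author X", "Author Y", "Author X", "Author Z"],
    "User Rating": [4.7, 4.6, 4.8, 4.4],
    "Reviews": [1200, 350, 9800, 640],
    "Price": [8, 22, 15, 12],
    "Year": [2009, 2009, 2010, 2010],
    "Genre": ["Fiction", "Non-Fiction", "Fiction", "Non-Fiction"],
})

# Mean rating, median price and number of books per genre
genre_summary = sample.groupby("Genre").agg(
    mean_rating=("User Rating", "mean"),
    median_price=("Price", "median"),
    books=("Name", "count"),
)
print(genre_summary)

# Number of bestsellers per year, with one column per genre
counts = sample.groupby(["Year", "Genre"]).size().unstack(fill_value=0)

# Stacked bar chart of the yearly counts
ax = counts.plot(kind="bar", stacked=True)
ax.set_xlabel("Year")
ax.set_ylabel("Number of books")
ax.set_title("Bestsellers per year by genre")
plt.tight_layout()
```

In a Jupyter Notebook the figure renders inline below the cell. Replacing `sample` with the DataFrame loaded from `books.csv` applies the same summary to the full dataset.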