https://github.com/metaphysics0/scrape-amazon-reviews
https://github.com/metaphysics0/scrape-amazon-reviews
Last synced: 10 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/metaphysics0/scrape-amazon-reviews
- Owner: Metaphysics0
- Created: 2025-05-28T18:27:07.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-28T18:27:12.000Z (about 1 year ago)
- Last Synced: 2025-09-01T13:49:21.582Z (10 months ago)
- Language: JavaScript
- Size: 3.91 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Amazon Review Extractor
A simple JavaScript tool to extract 4 and 5-star Amazon reviews from product pages and save them as structured JSON data.
## What it does
This script automatically extracts high-rating reviews (4-5 stars) from Amazon product review pages and organizes them into a clean JSON format with:
- Author name
- Review date
- Review title/header
- Review text
- Product details (size, color, etc.)
- Star rating
- Verification status
- Profile images
- And more metadata
## How to use
### For Non-Developers
1. **Go to an Amazon product's review page**
- Navigate to any Amazon product
- Click on the customer reviews section
- Make sure you can see the individual reviews on the page
2. **Open Browser Developer Tools**
- **Chrome/Edge**: Press `F12` or right-click → "Inspect"
- **Firefox**: Press `F12` or right-click → "Inspect Element"
- **Safari**: Enable Developer menu first, then press `Option + Cmd + I`
3. **Go to the Console tab**
- In the developer tools panel, click on the "Console" tab
4. **Copy and paste the script**
- Copy the entire JavaScript code from `amazon-review-extractor.js`
- Paste it into the console and press Enter
5. **Navigate through review pages**
- Go to the next page of reviews (click "Next" button)
- Run the script again by typing `extractHighRatingReviews()` and pressing Enter
- Repeat for as many pages as you want
6. **Download your data**
- Type `downloadReviewsAsJSON()` and press Enter
- A JSON file will automatically download with all your collected reviews
### For Developers
```javascript
// Initialize and extract reviews from current page
extractHighRatingReviews();
// Get all collected reviews
const allReviews = getAllReviews();
// Clear the collection
clearReviewCollection();
// Extract without adding to global collection
const currentPageOnly = extractHighRatingReviews(false);
// Download as file
downloadReviewsAsJSON('my-reviews.json');
```
## Output Format
The script generates a JSON array with this structure:
```json
[
{
"author": "John Doe",
"date": "Reviewed in the United States on January 15, 2025",
"review": "Great product! Highly recommend it.",
"header": "Excellent quality",
"metadata": {
"starRating": 5.0,
"productDetails": "Size: Large | Color: Blue",
"verifiedPurchase": "Verified Purchase",
"reviewId": "R1JWWMD5G5H2WV",
"avatarImage": "https://...",
"images": ["https://..."]
}
}
]
```
## Features
- ✅ **Automatic deduplication** - Won't add the same review twice
- ✅ **Multi-page support** - Collect reviews across multiple pages
- ✅ **Persistent storage** - Data persists as you navigate between pages
- ✅ **Error handling** - Gracefully handles missing or malformed data
- ✅ **Export functionality** - Download results as JSON file
- ✅ **Metadata rich** - Captures images, product details, and verification status
## Helpful Commands
While in the browser console:
```javascript
// Check how many reviews you've collected
console.log(`Total reviews: ${getAllReviews().length}`);
// View all collected reviews
console.log(getAllReviews());
// Start over with a fresh collection
clearReviewCollection();
// Download your data
downloadReviewsAsJSON();
```
## Use Cases
- **Market Research**: Analyze customer sentiment and feedback
- **Product Development**: Understand what customers love about products
- **Content Creation**: Gather authentic customer testimonials
- **Data Analysis**: Build datasets for sentiment analysis or ML projects
- **Academic Research**: Study consumer behavior and reviews
## Browser Compatibility
Works in all modern browsers:
- Chrome 60+
- Firefox 55+
- Safari 12+
- Edge 79+
## Important Notes
⚠️ **Legal and Ethical Use**: This tool is for personal research and analysis. Always respect Amazon's Terms of Service and robots.txt. Don't use this for commercial scraping or to violate any terms of service.
⚠️ **Rate Limiting**: Don't run this too aggressively. Navigate pages at a normal human pace to be respectful to Amazon's servers.
⚠️ **Data Privacy**: Be mindful of the personal information in reviews when sharing or publishing extracted data.
## Troubleshooting
**Script not finding reviews?**
- Make sure you're on an Amazon product review page
- Check that reviews are visible on the page
- Try refreshing the page and running the script again
**No data being collected?**
- Ensure you're looking at 4-5 star reviews (the script filters out lower ratings)
- Check the browser console for any error messages
**Downloaded file is empty?**
- Make sure you've run `extractHighRatingReviews()` at least once before downloading
- Check that `getAllReviews().length` returns a number greater than 0
## Contributing
Feel free to submit issues, fork the repository, and create pull requests for any improvements.
## License
MIT License - feel free to use this tool for your projects!