Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kanugurajesh/amazon-scraper
This is an Amazon scraper used to scrape Amazon products.
- Host: GitHub
- URL: https://github.com/kanugurajesh/amazon-scraper
- Owner: kanugurajesh
- Created: 2023-07-06T06:00:01.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-03T10:53:59.000Z (about 1 year ago)
- Last Synced: 2024-01-12T04:54:39.710Z (about 1 year ago)
- Topics: open-source, python, scraper, selenium
- Language: Python
- Homepage:
- Size: 62.8 MB
- Stars: 14
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Amazon Scraper
This is an Amazon scraper built using Selenium. It can scrape product data from the website and write it to a CSV file. I used Selenium because it behaves somewhat like a human user and is an advanced testing framework that offers many advantages for scraping. I also used Beautiful Soup to extract data from the scraped pages.
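The extraction half of that pipeline can be sketched with the standard library alone (the project itself uses Beautiful Soup; the tag and class names below are illustrative assumptions, not the real Amazon markup):

```python
import csv
from html.parser import HTMLParser

# Minimal sketch: parse a saved product page and write one row to output.csv.
# The class names "product-title" and "product-price" are illustrative only.

class ProductParser(HTMLParser):
    """Collects text from <span> elements with the illustrative class names."""
    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "") or ""
        if tag == "span" and "product-title" in classes:
            self._current = "title"
        elif tag == "span" and "product-price" in classes:
            self._current = "price"

    def handle_data(self, data):
        # Capture the text node immediately inside the span we matched.
        if self._current:
            self.fields[self._current] = data.strip()
            self._current = None

# Stand-in for one of the HTML files saved in the products directory.
html_doc = """<div><span class="product-title">Example Widget</span>
<span class="product-price">$19.99</span></div>"""

parser = ProductParser()
parser.feed(html_doc)

with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerow(parser.fields)
```

In the real project, Beautiful Soup's CSS selectors replace the hand-rolled parser, but the flow is the same: read saved HTML, pull out fields, append a CSV row.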
# Project Setup
1. The code is the same for both Windows and Linux, but you need to install a separate ChromeDriver for each environment. You can download ChromeDriver at
https://chromedriver.chromium.org/downloads
The chromedriver file should be placed in the root folder.
2. Set up a Python virtual environment using the command `python -m venv myenv`
3. Activate the virtual environment: on Windows, `myenv/Scripts/activate.ps1`; on Linux, `source myenv/bin/activate`
4. Install the required Python modules into the virtual environment using the command `pip install -r requirements.txt`
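On Linux, the setup steps above can be run end to end like this (a sketch; `requirements.txt` comes from the cloned repository):

```shell
# Create a virtual environment and activate it in the current shell.
python3 -m venv myenv
. myenv/bin/activate
# requirements.txt ships with the repository; install from it when present.
if [ -f requirements.txt ]; then
    pip install -r requirements.txt
fi
```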
# Usage
1. To run the project on Windows, run the command `python amazon_scraper.py`
2. To run the project on Linux, run the command `python3 amazon_scraper.py`
# Project Working
1. When you first run the project, Selenium scrapes the products and writes each product's page source into HTML files in the products directory.
2. In the next step, the data is extracted from the HTML files and written to the output.csv file.
# Project Files
1. amazon_scraper.py is the main Python file, which scrapes and writes data to the CSV file.
2. requirements.txt lists all the modules required by the project to function without errors.
3. automate.bat is used to push code to GitHub on Windows.
4. automate.sh is used to push code to GitHub on Linux.
# Project Working Video
https://drive.google.com/file/d/1xS28dAszifytomf69G2MfKjBfMia_it8/view?usp=sharing
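The automate.bat/automate.sh helpers listed under Project Files are not reproduced on this page; a plausible sketch of what automate.sh might contain (an assumption — the actual script lives in the repository):

```shell
# Write out a sketch of the push helper. The real automate.sh may differ;
# the assumption is that it stages, commits, and pushes in one step.
cat > automate_sketch.sh <<'EOF'
#!/bin/sh
git add .
git commit -m "update scraped data"
git push origin main
EOF
chmod +x automate_sketch.sh
```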
## Contributing
This project welcomes contributions from everyone.
## Technologies Used
- Python
- Selenium
- Beautiful Soup
## 🔗 Links
[![portfolio](https://img.shields.io/badge/my_portfolio-000?style=for-the-badge&logo=ko-fi&logoColor=white)](https://rajeshportfolio.me/)
[![linkedin](https://img.shields.io/badge/linkedin-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/rajesh-kanugu-aba8a3254/)
[![twitter](https://img.shields.io/badge/twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/exploringengin1)
## Authors
- [@kanugurajesh](https://github.com/kanugurajesh)
## Support
For support, you can buy me a coffee.
## License
This project is licensed under the MIT License.