An open API service indexing awesome lists of open source software.

https://github.com/raminmammadzada/ruby-web-scraper

A web scraper for helping individuals or companies to find out which kind of products are recommended more than others in specific category.
https://github.com/raminmammadzada/ruby-web-scraper

rspec ruby scraper webscraper

Last synced: about 2 months ago
JSON representation

A web scraper for helping individuals or companies to find out which kind of products are recommended more than others in specific category.

Awesome Lists containing this project

README

        

![](https://img.shields.io/badge/Microverse-blueviolet)

# Microverse Ruby Capstone Project -> Custom Web Scraper

This is a fashion follower application for helping individuals or companies to find out which kind of products are recommended more than others in the category if the interest.
The project is a web scraper and uses [kimurai](https://github.com/vifreefly/kimuraframework) gem to fetch data from the Single Page Application.
By default, the script is implemented brings the top 5 reviewed products from the specific category in the Trendyol.com.
Users can also specify price ranges from the command line interface if needed.
Both the category and the price range can be given from the command line interface.
The category can be composed of several words, but it must be given in the Turkish language.

- [x] Milestone 1 - CLI user interface is implemented.
- [x] Milestone 2 - Implement logic to retrieve data from the webpage and store them in a JSON file.
- [x] Milestone 3 - Update codebase to follow OOP principles
- [x] Milestone 4 - Scraping data from Single Page Application
- [x] Milestone 5 - Rspec tests

## Video Presentation

- https://www.loom.com/share/7d2f67baa3fa4efcb17f9feaeb6d6960

## Rules To Run

- The category can be composed of several words, but it must be given in the Turkish language. You can see the examples below:
> If the user enters ```hp bilgisar``` (computer means "bilgisayar" mean in Turkish) and 300-4000 as a price range, then it will search all computers with the hp brand that costs between 300 and 4000 Turkish Lira.

> If the user enters ```Nike siyah erkek ayakkabi```, it will search for ```black Nike shoes for men```.

## Built With

- Kimurai (it depends on [nokagiri](https://github.com/sparklemotion/nokogiri) gem)
- Ruby
- Rspec
- Rubocop

## Getting Started

To get a local copy up and running follow these simple example steps.

### Install
You can run those functions in your own local environment.
To run, you need to install RUBY on your computer. For windows, you can go to [Ruby installer](https://rubyinstaller.org/) and for MAC and LINUX you can go to [Ruby official site](https://www.ruby-lang.org/en/downloads/) for instructions on how to install it.

Then you can clone the project by typing ```git clone https://github.com/RaminMammadzada/ruby-web-scraper.git```

### Dependencies
You must install the dependendies by bundling the Gemfile:
- First fo the root of the project by typing ```cd ruby-web-scraper```
- Go to the ```feature``` branch by typing ```git checkout feature```
- ```gem install bundler```
- ```bundle update```
- ```bundle install```

You must also have ```chromedriver``` in your local development environment, otherwise you will get the error. It cannot be installed as a ruby gem.
- On Mac OS:
- Method 1: with ```brew```
- ```brew install chromedriver```
- Method 2: normal installation
- ```curl http://chromedriver.storage.googleapis.com/2.38/chromedriver_mac64.zip -o chromedriver_mac64.zip```
- ```unzip chromedriver_mac64.zip```
- ```mv chromedriver /usr/local/bin```
- ```chmod +x /usr/local/bin/chromedriver```

- On Linux, and chromedriver (use proper path for your system and version):
- Install chrome first
- ```wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -```
- ```sh -c 'echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list'```
- ```apt-get update```
- ```apt-get install google-chrome-stable```
- Install chromedriver ( **use proper path for your system and version, this versin may not work for you** )
- ```cd /tmp```
- ```wget https://chromedriver.storage.googleapis.com/2.44/chromedriver_linux64.zip```
- ```sudo unzip chromedriver_linux64.zip -d /usr/local/bin```
- ```rm -f chromedriver_linux64.zip```
- Now you can leave ```tmp``` directory.

### Run app
1. In a terminal window type:
- Go inside to the project root by typing ```cd ruby-web-scraper``` .
- Go to the ```feature``` branch by typing ```git checkout feature``` .
- Then type ```./bin/main.rb``` in the root file of the project. You can also type ```ruby bin/main.rb``` in the root file of the project.
3. Follow the instructions given in the command-line interface.
- Enter the category keywords
![screenshot of step 1](./images/step1_screenshot.png)
- Enter the price range (you can also skip by pressing Enter)
![screenshot of step 2](./images/step2_screenshot.png)
- Press Enter and wait some moment to get the results. The time to get the data depends on the total amount of products found.
* ![screenshot of step 3](./images/step3_screenshot.png)
- All of the fetched products' informations are saved to [product_search_result.json](product_search_result.json) file.
![screenshot of step 3](images/step3a_screenshot.png)
- If there is no product found, then it means there is no matching category for your keywords or you gave the price range too narrow. You should run the app again and try other keywords or price range.
![screenshot of step 3](images/step3b_screenshot.png)
- Here you go! You got the 5 most reviewed products with their information.
![screenshot of step 4](./images/step4_screenshot.png)

## Testing the script

This script was tested using [RSpec](https://rspec.info/) which is a ruby testing tool. All public methods are tested.

### Install

- In a terminal window type ```gem install rspec```
- Once rspec install has finished, go to the project directory and type ```rspec --init```
- You will see a folder spec and a file [.rspec](.rspec)
- Inside spec folder you'll see a [spec_helper.rb](spec/spec_helper.rb) file.

### Run the test
- Open a terminal window and type ```rspec```
- All tests should be passed:

![screenshot of tests](./images/tests_screenshot.png)

## Authors

👤 **Ramin Mammadzada**

- Github: [@RaminMammadzada](https://github.com/RaminMammadzada)
- Twitter: [@RaminMammadzada](https://twitter.com/RaminMammadzada)
- Linkedin: [@RaminMammadzada](https://www.linkedin.com/in/raminmammadzada)

## 🤝 Contributing

Contributions, issues, and feature requests are welcome!

Feel free to check the [issues page](issues/).

## Show your support

Give a ⭐️ if you like this project!

## Acknowledgments

- Microverse
- Odin project

## 📝 License

This project is [MIT](lic.url) licensed.