https://github.com/sevlamare/web_scraper
Get data from web pages to tables.
nokogiri rspec ruby unit-testing
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/sevlamare/web_scraper
- Owner: SevlaMare
- Created: 2020-04-14T18:05:04.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-05-19T14:42:33.000Z (about 5 years ago)
- Last Synced: 2025-02-04T14:50:13.365Z (5 months ago)
- Topics: nokogiri, rspec, ruby, unit-testing
- Language: Ruby
- Homepage:
- Size: 75.2 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Scraper
> Scan for open positions of any kind and export data into a table.

The main purpose here is to deploy a working app
while learning a new library in a short amount of time (less than 3 days).

## Content
* [Requirements](#requirements)
* [Built With](#built-with)
* [Getting Start](#getting-start)
* [Contributing](#contributing)
* [Acknowledgments](#acknowledgments)
* [Author](#author)
* [License](#license)

## Requirements
- Should use Nokogiri
- Should use Object-Oriented Programming
- Should have unit tests
- Follow good practices
- Make it friendly to use and modify
## Built With
- Ruby Language
- Nokogiri
- RSpec (Unit testing)
- RuboCop (Linter) with Stickler (CI Tool)
- Git, GitHub and VS Code
## Getting Start
#### Install
To use this program, you will need to install:
* Ruby Language - Version 2.5.5 or higher - [Install guide](https://www.ruby-lang.org/en/documentation/installation/)
* Nokogiri Gem - Version 1.10.9 or higher - [Install guide](https://nokogiri.org/tutorials/installing_nokogiri.html)

If you are using Ubuntu, you can install both from the terminal by typing:
```sh
sudo apt-get install ruby-full
```
and
```sh
gem install nokogiri
```
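To confirm that the installed versions meet the minimums listed above, a small check along these lines may help. It is only a sketch and not part of the project; the constant and method names (`MIN_RUBY`, `MIN_NOKOGIRI`, `meets_minimum?`, `installed_nokogiri_version`) are made up for illustration.

```ruby
# Minimum versions taken from the install requirements above.
MIN_RUBY = '2.5.5'
MIN_NOKOGIRI = '1.10.9'

# True when `actual` is at least `minimum`, compared as version numbers
# (so '1.10.9' ranks above '1.9.0', unlike a plain string comparison).
def meets_minimum?(actual, minimum)
  Gem::Version.new(actual) >= Gem::Version.new(minimum)
end

# Highest installed Nokogiri version as a string, or nil if the gem is missing.
def installed_nokogiri_version
  Gem::Specification.find_all_by_name('nokogiri').map(&:version).max&.to_s
end

ruby_ok = meets_minimum?(RUBY_VERSION, MIN_RUBY)
nokogiri = installed_nokogiri_version
nokogiri_ok = !nokogiri.nil? && meets_minimum?(nokogiri, MIN_NOKOGIRI)

puts "Ruby #{RUBY_VERSION}: #{ruby_ok ? 'ok' : "needs #{MIN_RUBY} or higher"}"
puts "Nokogiri #{nokogiri || 'not installed'}: #{nokogiri_ok ? 'ok' : "needs #{MIN_NOKOGIRI} or higher"}"
```

Saved under any file name and run with `ruby`, it prints one status line for Ruby and one for Nokogiri.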
#### Get a local copy
Now you need a copy of this application. If you are using Git:
```sh
git clone git@github.com:SevlaMare/Web_Scraper.git
```
Otherwise, click the green (Clone or Download) button at the top of the repository page and choose (Download Zip).
#### Run (With default settings)
To run, from the application folder, type in the terminal:
```sh
ruby bin/main.rb
```
When run with the default settings, it generates a CSV file with data
from one page, scraped for 'Full Stack' open positions.
#### Modify Parameters
If you want to change which kind of position it looks for,
or which page the results come from, follow these steps:
* Open the `bin` directory in the application folder
* Open the file `main.rb` with any text editor
* On the last line, you will find this:
`show_all(3, true, 'ruby rails')`
* The number 3 is the page you want to scrape; it can be any available page number.
* You can change the `true` to `false` if you don't want the CSV file.
* You can change 'ruby rails' to any keyword you want to search open positions for.
* After making the changes, save the file and run it in the terminal.
#### Output file
After running, `data.csv` will be generated.
It contains 4 rows, with the position name, location, salary and description respectively.
Each column will be one job post.
It comes ready to be imported into any database or spreadsheet program.
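
As a rough illustration of consuming the output, the sketch below turns that layout back into one record per job post using Ruby's standard `csv` library. It assumes the layout exactly as described above (one row per field, one column per job post) and no header line. The `FIELDS` constant, the `job_posts` method and the sample values are invented for the example; they are not taken from the project or from real output.

```ruby
require 'csv'

# Field order as described above: one CSV row per field.
FIELDS = %i[position location salary description].freeze

# Parse CSV text laid out as one row per field and one column per job post,
# and return one hash per job post.
def job_posts(csv_text)
  rows = CSV.parse(csv_text)
  # transpose flips rows and columns, so each inner array becomes one job post.
  rows.transpose.map { |values| FIELDS.zip(values).to_h }
end

# Made-up sample in the assumed layout (two job posts).
SAMPLE = <<~CSV
  Full Stack Developer,Ruby Engineer
  Remote,Lisbon
  Not listed,Not listed
  Builds web apps,Maintains Rails services
CSV

posts = job_posts(SAMPLE)
posts.each { |post| puts "#{post[:position]} (#{post[:location]})" }
```

To try it on a real export, pass `File.read('data.csv')` instead of `SAMPLE`. If the file turns out to hold one job post per row instead, the `transpose` call can simply be dropped.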
## Contributing
Contributions, issues and feature requests are welcome!
You can do so on the [issues page](issues/).
## Acknowledgments
Special thanks to the code reviewers.
## Author
👤 **Thiago Miranda**
- GitHub: [@SevlaMare](https://github.com/SevlaMare)
- Twitter: [@SevlaMare](https://twitter.com/SevlaMare)
- LinkedIn: [SevlaMare](https://www.linkedin.com/in/sevlamare)
## License
Creative Commons