https://github.com/alinebastos/crawler
Web Crawler created with Node.js and Puppeteer
crawler fs javascript nodejs puppeteer scraping
- Host: GitHub
- URL: https://github.com/alinebastos/crawler
- Owner: alinebastos
- License: mit
- Created: 2018-04-02T01:31:00.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-04-02T02:31:09.000Z (about 8 years ago)
- Last Synced: 2025-03-03T22:02:19.053Z (over 1 year ago)
- Topics: crawler, fs, javascript, nodejs, puppeteer, scraping
- Language: JavaScript
- Homepage:
- Size: 6.84 KB
- Stars: 18
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Crawler
Web crawler built with Node.js and [Puppeteer](https://github.com/GoogleChrome/puppeteer) to get data from the [Empiricus](https://www.empiricus.com.br/conteudo/newsletters) newsletter.
### Prerequisites
To run this code you need to have **Node.js** and **npm** installed.
### Installing
After cloning this repository, inside the /crawler folder, run:
```
$ npm install
```
### Usage
Run:
```
$ node index.js
```
### License
This project is licensed under the MIT License. See the [LICENSE.md](LICENSE.md) file for details.
### Work in progress
* Reduce the number of times the *evaluate* method is called
* Replace the *for loop* code with *recursion* so that execution ends properly, instead of calling *process.exit()* on line 53
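
One possible direction for both items is sketched below. It is not the code in `index.js`: the `crawl` function, its `urls` parameter, and the `a` selector are hypothetical names chosen for illustration. Instead of the recursion proposed above, the sketch uses an awaited `for...of` loop, which also runs the pages one at a time. It assumes each page can be read with a single `page.$$eval` call, and that closing the browser is what lets Node exit without *process.exit()*.

```javascript
// Sketch only: `crawl`, `urls`, and the 'a' selector are hypothetical,
// not taken from the repository's index.js.

// Visit each URL in order, read the page with ONE in-page call,
// then close the browser so no process.exit() is needed.
async function crawl(browser, urls) {
  const results = [];
  try {
    const page = await browser.newPage();
    // for...of with await finishes one page before starting the next
    for (const url of urls) {
      await page.goto(url);
      // $$eval runs the callback once inside the page and returns plain data
      const links = await page.$$eval('a', (anchors) =>
        anchors.map((a) => ({ text: a.textContent.trim(), href: a.href }))
      );
      results.push({ url, links });
    }
  } finally {
    // Runs even if a page fails, so the browser process is always released
    await browser.close();
  }
  return results;
}
```

With Puppeteer installed, this could be called as `crawl(await puppeteer.launch(), [someUrl])` from inside an async function. Passing the browser in as an argument keeps the function easy to try against a stub.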