Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/okdistribute/nutella-scrape
:chocolate_bar: learn to scrape the web with Node.js -- it tastes like chocolate
- Host: GitHub
- URL: https://github.com/okdistribute/nutella-scrape
- Owner: okdistribute
- Created: 2015-08-14T07:03:54.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2016-05-19T11:45:03.000Z (over 8 years ago)
- Last Synced: 2024-10-17T17:53:37.602Z (about 2 months ago)
- Language: JavaScript
- Homepage:
- Size: 166 KB
- Stars: 203
- Watchers: 11
- Forks: 12
- Open Issues: 1
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
- awesome-starred - okdistribute/nutella-scrape - :chocolate_bar: learn to scrape the web with Node.js -- it tastes like chocolate (others)
README
# nutella-scrape
[![NPM](https://nodei.co/npm/nutella-scrape.png?downloads=true&stars=true&global=true)](https://nodei.co/npm/nutella-scrape/)
![nutella](https://github.com/karissa/nutella-scrape/blob/master/nutella.png)
1. Run `sudo npm install nutella-scrape -g`
2. Run `nutella-scrape`
3. ???
4. LEARN!!

In this tutorial, we will work through how to scrape websites using Node.js, with the primary purpose of using the scraped data in other programs: in servers, frontends (yes, Node works in the browser!), or just writing a table to disk for analysis elsewhere.
The DOM (Document Object Model) represents an HTML document as a tree of objects that a program can read and change. JavaScript is GREAT for traversing HTML (i.e., the DOM) because it was made to work with HTML in the first place.
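To make the "tree of nodes" idea concrete, here is a minimal sketch that walks a small hand-built tree and collects the text of every `li` node. The `doc` object and the `findByTag` helper are hypothetical stand-ins invented for illustration; they are not part of nutella-scrape or of any real DOM API. In a real scraper an HTML parser would produce the tree for you.

```javascript
// A hand-built stand-in for a parsed HTML document (hypothetical data).
// Each node has a tag name and either text or a list of child nodes.
const doc = {
  tag: 'html',
  children: [
    {
      tag: 'body',
      children: [
        { tag: 'h1', text: 'Spreads' },
        {
          tag: 'ul',
          children: [
            { tag: 'li', text: 'Nutella' },
            { tag: 'li', text: 'Peanut butter' }
          ]
        }
      ]
    }
  ]
}

// Walk the tree depth-first and collect every node with the given tag name.
// `findByTag` is an illustrative helper, not a real DOM method.
function findByTag (node, tag, found = []) {
  if (node.tag === tag) found.push(node)
  for (const child of node.children || []) {
    findByTag(child, tag, found)
  }
  return found
}

const items = findByTag(doc, 'li').map(node => node.text)
console.log(items) // [ 'Nutella', 'Peanut butter' ]
```

Scraping a real page follows the same shape: parse the HTML into a tree, pick out the nodes you care about, and pull the text or attributes out of them.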
## TODO
* parallel
* spoofing
* cookies/login walls
* electron-microscope