Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rfalke/zehner
web crawler with node.js
- Host: GitHub
- URL: https://github.com/rfalke/zehner
- Owner: rfalke
- Created: 2015-06-02T09:03:01.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-06-08T17:21:12.000Z (over 9 years ago)
- Last Synced: 2023-03-15T14:25:26.071Z (over 1 year ago)
- Language: JavaScript
- Size: 168 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Zehner
======

Zehner is a URL fetcher, like wget, written in node.js.
## Usage
``` bash
$ zehner.js -h
usage: zehner.js [-h] [-v] [-o DIR] [-p P] [--r1] [--r2] [--r3] URL

Zehner: A web crawler written in node.js.
Positional arguments:
  URL            the start url. Without any --r* flags only this url will be
                 fetched.

Optional arguments:
  -h, --help     Show this help message and exit.
  -v, --version  Show program's version number and exit.
  -o DIR         set the output directory [defaults to "."]
  -p P           download with CONNECTIONS in parallel [defaults to 5]
  --r1           limit recursive download to the sub directory of the initial
                 URL
  --r2           limit recursive download to host of the initial URL
  --r3           do not limit the recursive download
```

# Examples
* Download single file
``` bash
$ zehner.js http://www.example.com
```

* Dive into sub directories
``` bash
$ zehner.js -o output_dir --r1 http://www.example.com/some/path
```

* Fetch all referenced files from the host
``` bash
$ zehner.js -o output_dir --r2 http://www.example.com/some/path
```

* Fetch all referenced files from all hosts
``` bash
$ zehner.js -o output_dir --r3 http://www.example.com/some/path
```