https://github.com/Antosser/web-crawler

Rust Web Crawler that finds every page, image, and script on a website (and downloads it)

crawler html rust seo web


# Web Crawler

Finds every page, image, and script on a website (and downloads it)

## Usage

```
Rust Web Crawler

Usage: web-crawler [OPTIONS]

Arguments:

Options:
  -d, --download
      Download all files
  -c, --crawl-external
      Whether or not to crawl other websites it finds a link to. Might result in downloading the entire internet
  -m, --max-url-length
      Maximum URL length it allows. Will ignore a page if its URL length reaches this limit [default: 300]
  -e, --exclude
      Will ignore paths that start with these strings (comma-separated)
      --export
      Where to export found URLs
      --export-internal
      Where to export internal URLs
      --export-external
      Where to export external URLs
  -t, --timeout
      Timeout between requests in milliseconds [default: 100]
  -h, --help
      Print help
  -V, --version
      Print version
```
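
To make the `--exclude` and `--max-url-length` rules concrete, here is a minimal Rust sketch of how such filtering could work. It is not taken from the crawler's source: `parse_excludes` and `should_crawl` are hypothetical names, and it assumes that "reaches this limit" means a URL is skipped once its length is greater than or equal to the limit.

```rust
/// Hypothetical helper (not from the web-crawler source): splits a
/// comma-separated `--exclude` value into a list of path prefixes.
fn parse_excludes(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty()) // ignore empty entries such as a trailing comma
        .map(String::from)
        .collect()
}

/// Hypothetical helper: decides whether a URL should be visited.
/// Assumption: a URL is ignored when its length is >= `max_url_length`,
/// or when its path starts with any excluded prefix.
fn should_crawl(url: &str, path: &str, excludes: &[String], max_url_length: usize) -> bool {
    if url.len() >= max_url_length {
        return false;
    }
    !excludes.iter().any(|prefix| path.starts_with(prefix.as_str()))
}

fn main() {
    let excludes = parse_excludes("/admin,/tmp");
    // The path starts with the excluded prefix "/admin", so this prints "false".
    println!(
        "{}",
        should_crawl("https://example.com/admin/login", "/admin/login", &excludes, 300)
    );
}
```

The sketch takes the URL and its path as separate arguments to stay dependency-free; a real crawler would more likely parse the URL once and read the path from the parsed value.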

## How to compile yourself

1. Install Rust
2. Run `cargo build -r` in the project directory
3. The executable is in `target/release`

**or**

1. Install Rust
2. Install the crawler with `cargo install web-crawler`