https://github.com/synacktraa/crawl
Web crawler designed to efficiently retrieve unique href, script and form links from a web application.
- Host: GitHub
- URL: https://github.com/synacktraa/crawl
- Owner: synacktraa
- License: gpl-3.0
- Created: 2022-12-24T19:48:52.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-01-10T15:01:32.000Z (over 3 years ago)
- Last Synced: 2025-04-20T05:51:24.381Z (about 1 year ago)
- Topics: bash, crawler, regex, shell, web-spidering
- Language: Shell
- Homepage:
- Size: 65.4 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# crawl
A simple web crawler written in shell script, designed to efficiently discover endpoints and assets within a web application.
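As a rough illustration of the kind of regex-based extraction such a crawler relies on, the sketch below pulls unique `href`, `src` and `action` values out of HTML with `grep`, `sed` and `sort`. It is not the tool's actual implementation: the function name `extract_links` and the sample HTML are assumptions made for this example, and matching `src` on any tag is a simplification of "script links".

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from crawl itself):
# print the unique href/src/action values found in HTML read from stdin.
extract_links() {
    grep -oE '(href|src|action)="[^"]*"' |  # one attr="value" match per line
        sed -E 's/^[a-z]+="//; s/"$//' |    # drop the attribute name and quotes
        sort -u                             # keep each link only once
}

# Sample input, invented for this sketch.
html='<a href="/about">About</a><script src="/static/app.js"></script>'
html="$html"'<form action="/login"></form><a href="/about">Again</a>'

printf '%s\n' "$html" | extract_links
```

Only double-quoted attribute values are matched here; a real crawler would also need to handle single quotes, unquoted values and relative URLs.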
## Usage
```bash
$ crawl
```
```
$ crawl -h
|Usage:
| crawl [-f] [href|script|form]
|
|Options:
| -h show help menu
| -d number of depth to scrape.
| -f attribute a type of link. [href|script|form]
|
|Example:
| crawl -f script [domain].[TLD]
| crawl -d 1 [domain].[TLD]/directory
| crawl -f href [domain].[TLD]/directory?key=value
|
|Fetches all href, script and form links, if no flags are specified.
|Uses HTTPs as default protocol, if no protocol is specified.
```
```bash
$ echo google.com | crawl
```
[Demo recording on asciinema](https://asciinema.org/a/1nQQGtpE6q8qweVS2q9dfpAaa)
## Tool Chain
Get all subdomains of owasp.org and crawl the ones that are alive.
```bash
subfinder -d owasp.org | httpx | crawl
```
## Features
- Fetches all href, script and form links.
- Highlights the Depth-1 URLs and indicates which Depth-2 URLs are included under each Depth-1 URL.
- Uses HTTPS as the default protocol if none is specified.
- Depth defaults to 1 if not specified.
- Can be chained with other tools or commands using pipes.
- Output is colorless when it is redirected to a file or piped to another command.
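
Two of the behaviours listed above, the default protocol and the colorless output when piped, can be sketched in a few lines of shell. This is a minimal illustration under assumptions, not the code used by `crawl`: the function names `normalize_url` and `highlight` are hypothetical, and the color choice is arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating two listed features; not crawl's own code.

# Prepend https:// when the target carries no http(s) scheme.
normalize_url() {
    case "$1" in
        http://*|https://*) printf '%s\n' "$1" ;;
        *)                  printf 'https://%s\n' "$1" ;;
    esac
}

# Print in color only when stdout is a terminal (file descriptor 1 is a tty).
highlight() {
    if [ -t 1 ]; then
        printf '\033[1;32m%s\033[0m\n' "$1"
    else
        printf '%s\n' "$1"
    fi
}

normalize_url "owasp.org/projects"   # scheme added
normalize_url "http://example.com"   # left unchanged
highlight "https://owasp.org" | cat  # piped, so printed without color codes
```

The `[ -t 1 ]` test is the usual way for a script to detect redirection or piping, which is why the last call prints plain text.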
## Installation
```bash
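# Clone the repository and enter it
git clone https://github.com/synacktraa/crawl.git && cd ./crawl
# Move the crawl script onto the PATH
sudo mv ./crawl /usr/local/bin
# Remove the leftover clone
cd .. && rm -rf "./crawl"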
```
## Dependencies
- curl
- sed
- grep
- awk
- wc
- sort
- uniq
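
Before installing, it may help to confirm that the tools listed above are available. The sketch below is one way to do that with `command -v`; the function name `missing_deps` is an assumption made for this example and is not part of `crawl`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every given tool that is not found on the PATH.
missing_deps() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the dependencies listed in this README.
missing="$(missing_deps curl sed grep awk wc sort uniq)"
if [ -n "$missing" ]; then
    echo "Missing dependencies:"
    echo "$missing"
else
    echo "All dependencies found."
fi
```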