Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tomnomnom/unfurl
Pull out bits of URLs provided on stdin
https://github.com/tomnomnom/unfurl
Last synced: 6 days ago
JSON representation
Pull out bits of URLs provided on stdin
- Host: GitHub
- URL: https://github.com/tomnomnom/unfurl
- Owner: tomnomnom
- License: mit
- Created: 2018-02-26T10:09:16.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-08-12T04:13:27.000Z (over 1 year ago)
- Last Synced: 2024-11-29T09:06:33.915Z (13 days ago)
- Language: Go
- Size: 50.8 KB
- Stars: 1,101
- Watchers: 18
- Forks: 123
- Open Issues: 9
-
Metadata Files:
- Readme: README.mkd
- Contributing: CONTRIBUTING.mkd
- License: LICENSE
Awesome Lists containing this project
- awesome-bugbounty-tools - unfurl - Pull out bits of URLs provided on stdin (Miscellaneous / Useful)
- awesome-rainmana - tomnomnom/unfurl - Pull out bits of URLs provided on stdin (Go)
- WebHackersWeapons - unfurl
- awesome-api-security - unfurl
README
# unfurl
Pull out bits of URLs provided on `stdin`
## Install
If you have Go installed and configured:
```
▶ go install github.com/tomnomnom/unfurl@latest
```Otherwise [download the latest binary for your platform](https://github.com/tomnomnom/unfurl/releases),
extract it and move it to somewhere in your `$PATH` (e.g. `/usr/bin/`):```
▶ wget https://github.com/tomnomnom/unfurl/releases/download/v0.0.1/unfurl-linux-amd64-0.0.1.tgz
▶ tar xzf unfurl-linux-amd64-0.0.1.tgz
▶ sudo mv unfurl /usr/bin/
```## Usage
unfurl works with URLs provided on stdin; they might come from a file like this one:
```
▶ cat urls.txt
https://sub.example.com/users?id=123&name=Sam
https://sub.example.com/orgs?org=ExCo#about
http://example.net/about#contact
```### Domains
You can extract the domains from the URLs with the `domains` mode:
```
▶ cat urls.txt | unfurl domains
sub.example.com
sub.example.com
example.net
```If you don't want to output duplicate values you can use the `-u` or `--unique` flag:
```
▶ cat urls.txt | unfurl --unique domains
sub.example.com
example.net
```The `-u`/`--unique` flag works for all modes.
### Apex Domains
You can extract the apex part of the domain (e.g. the `example.com` in `http://sub.example.com`) using the `apexes` mode:
```
▶ cat urls.txt | unfurl -u apexes
example.com
example.net
```### Paths
```
▶ cat urls.txt | unfurl paths
/users
/orgs
/about
```### Query String Keys
```
▶ cat urls.txt | unfurl keys
id
name
org
```### Query String Values
```
▶ cat urls.txt | unfurl values
123
Sam
ExCo
```### Query String Key/Value Pairs
```
▶ cat urls.txt | unfurl keypairs
id=123
name=Sam
org=ExCo
```### JSON
```
▶ cat urls.txt | unfurl json
{"scheme":"https","opaque":"","user":"","host":"sub.example.com","path":"/users","raw_path":"","raw_query":"id=123\u0026name=Sam","fragment":"","parameters":[{"key":"id","value":"123"},{"key":"name","value":"Sam"}],"url":"https://sub.example.com/users?id=123\u0026name=Sam","domain":"sub.example.com","subdomain":"sub","root":"example","tld":"com","apex":"example.com","port":"","extension":""}
{"scheme":"https","opaque":"","user":"","host":"sub.example.com","path":"/orgs","raw_path":"","raw_query":"org=ExCo","fragment":"about","parameters":[{"key":"org","value":"ExCo"}],"url":"https://sub.example.com/orgs?org=ExCo#about","domain":"sub.example.com","subdomain":"sub","root":"example","tld":"com","apex":"example.com","port":"","extension":""}
{"scheme":"http","opaque":"","user":"","host":"example.net","path":"/about","raw_path":"","raw_query":"","fragment":"contact","parameters":null,"url":"http://example.net/about#contact","domain":"example.net","subdomain":"","root":"example","tld":"net","apex":"example.net","port":"","extension":""}
```### Custom Formats
You can use the `format` mode to specify a custom output format:
```
▶ cat urls.txt | unfurl format %d%p
sub.example.com/users
sub.example.com/orgs
example.net/about
```The available format directives are:
```
%% A literal percent character
%s The request scheme (e.g. https)
%u The user info (e.g. user:pass)
%d The domain (e.g. sub.example.com)
%S The subdomain (e.g. sub)
%r The root of domain (e.g. example)
%t The TLD (e.g. com)
%P The port (e.g. 8080)
%p The path (e.g. /users)
%e The path's file extension (e.g. jpg, html)
%q The raw query string (e.g. a=1&b=2)
%f The page fragment (e.g. page-section)
%@ Inserts an @ if user info is specified
%: Inserts a colon if a port is specified
%? Inserts a question mark if a query string exists
%# Inserts a hash if a fragment exists
%a Authority (alias for %u%@%d%:%P)
```Any characters that don't match a format directive remain untouched:
```
▶ cat urls.txt | unfurl -u format "%d (%s)"
sub.example.com (https)
example.net (http)
```Note that if a URL does not include the data requested, there will be no output for that URL:
```
▶ echo http://example.com | unfurl format "%P"
▶ echo http://example.com:8080 | unfurl format "%P"
8080
```## Help
```
▶ unfurl -h
Format URLs provided on stdinUsage:
unfurl [OPTIONS] [MODE] [FORMATSTRING]Options:
-u, --unique Only output unique values
-v, --verbose Verbose mode (output URL parse errors)Modes:
keys Keys from the query string (one per line)
values Values from the query string (one per line)
keypairs Key=value pairs from the query string (one per line)
domains The hostname (e.g. sub.example.com)
paths The request path (e.g. /users)
apexes The apex domain (e.g. example.com from sub.example.com)
json JSON encoded url/format objects
format Specify a custom format (see below)Format Directives:
%% A literal percent character
%s The request scheme (e.g. https)
%u The user info (e.g. user:pass)
%d The domain (e.g. sub.example.com)
%S The subdomain (e.g. sub)
%r The root of domain (e.g. example)
%t The TLD (e.g. com)
%P The port (e.g. 8080)
%p The path (e.g. /users)
%e The path's file extension (e.g. jpg, html)
%q The raw query string (e.g. a=1&b=2)
%f The page fragment (e.g. page-section)
%@ Inserts an @ if user info is specified
%: Inserts a colon if a port is specified
%? Inserts a question mark if a query string exists
%# Inserts a hash if a fragment exists
%a Authority (alias for %u%@%d%:%P)Examples:
cat urls.txt | unfurl keys
cat urls.txt | unfurl format %s://%d%p?%q
```