Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dwisiswant0/galer
A fast tool to fetch URLs from HTML attributes by crawl-in.
- Host: GitHub
- URL: https://github.com/dwisiswant0/galer
- Owner: dwisiswant0
- License: mit
- Created: 2020-11-26T03:45:15.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2025-01-06T23:28:36.000Z (19 days ago)
- Last Synced: 2025-01-15T11:35:34.935Z (10 days ago)
- Topics: crawler, devtool, extractor, galer, go, golang, spider, url-extractor, url-parser, waybackurls
- Language: Go
- Homepage:
- Size: 70.3 KB
- Stars: 255
- Watchers: 6
- Forks: 39
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
- project-awesome - dwisiswant0/galer - A fast tool to fetch URLs from HTML attributes by crawl-in. (Go)
- awesome-hacking-lists - dwisiswant0/galer - A fast tool to fetch URLs from HTML attributes by crawl-in. (Go)
README
# galer
[![made-with-Go](https://img.shields.io/badge/made%20with-Go-blue.svg)](http://golang.org)
[![issues](https://img.shields.io/github/issues/dwisiswant0/galer?color=blue)](https://github.com/dwisiswant0/galer/issues)

```txt
__
__ _ _(_ ) __ _ __
/'_ '\/'_' )| | /'__'( '__)
( (_) ( (_| || |( ___| |
'\__ '\__,_(___'\____(_)
( )_) |
\___/' @dwisiswant0
```

A fast tool to fetch URLs from HTML attributes by crawl-in. Inspired by [@omespino's tweet](https://twitter.com/omespino/status/1318605084989837312), which shows that it is possible to extract `src`, `href`, `url` and `action` values by evaluating JavaScript through the Chrome DevTools Protocol.
---
## Resources
- [Installation](#installation)
  - [from Binary](#from-binary)
  - [from Source](#from-source)
  - [from GitHub](#from-github)
- [Usage](#usage)
  - [Basic Usage](#basic-usage)
  - [Flags](#flags)
  - [Examples](#examples)
    - [Single URL](#single-url)
    - [URLs from list](#urls-from-list)
    - [from Stdin](#from-stdin)
  - [Library](#library)
- [TODOs](#todos)
- [Help & Bugs](#help--bugs)
- [License](#license)
- [Version](#version)
- [Acknowledgement](#acknowledgement)

## Installation
### from Binary
Installation is easy. Download a prebuilt binary from the [releases page](https://github.com/dwisiswant0/galer/releases), unpack it, and run it, or install it with:
```bash
▶ (sudo) curl -sSfL https://git.io/galer | sh -s -- -b /usr/local/bin
```

### from Source

If you have a Go 1.22+ compiler installed and configured:
```bash
▶ go install -v github.com/dwisiswant0/galer@latest
```

### from GitHub
```bash
▶ git clone https://github.com/dwisiswant0/galer
▶ cd galer
▶ go build .
▶ (sudo) install galer /usr/local/bin
```

## Usage

### Basic Usage

In its simplest form, galer can be run with:
```bash
▶ galer -u "http://domain.tld"
```

### Flags
![galer](https://user-images.githubusercontent.com/25837540/100824601-0ee53b80-3489-11eb-878d-a58d1ec3489d.jpg)
Running `galer -h` displays help for the tool. Here are all the options it supports:
```console
$ galer -h

__ v0.2.0
__ _ _(_ ) __ _ __
/'_ '\/'_' )| | /'__'( '__)
( (_) ( (_| || |( ___| |
'\__ '\__,_(___'\____(_)
( )_) |
\___/' @dwisiswant0

A fast tool to fetch URLs from HTML attributes by crawl-in

Usage:
  galer -u [URL|URLs.txt] -o [output.txt]

Options:
  -u, --url           Target to fetches (single target URL or list)
  -e, --extension     Show only certain extensions (comma-separated, e.g. js,php)
  -c, --concurrency   Concurrency level (default: 50)
  -w, --wait          Wait N seconds before evaluate (default: 1)
  -d, --depth         Max. depth for crawling (levels of links to follow)
      --same-host     Same host only
      --same-root     Same root (eTLD+1) only (takes precedence over --same-host)
  -o, --output        Save fetched URLs output into file
  -T, --template      Format for output template (e.g., "{{scheme}}://{{host}}{{path}}")
                      Valid variables are: "raw_url", "scheme", "user", "username",
                      "password", "host", "hostname", "port", "path", "raw_path",
                      "escaped_path", "raw_query", "fragment", "raw_fragment".
  -t, --timeout       Max. time (seconds) allowed for connection (default: 60)
  -s, --silent        Silent mode (suppress an errors)
  -v, --verbose       Verbose mode show error details unless you weren't use silent
  -h, --help          Display its helps
```

### Examples
#### Single URL
```bash
▶ galer -u "http://domain.tld"
```

#### URLs from list
```bash
▶ galer -u /path/to/urls.txt
```

#### from Stdin
```bash
▶ cat urls.txt | galer
```

In case you want to chain it with other tools:
```bash
▶ subfinder -d domain.tld -silent | httpx -silent | galer
```

### Library
[![godoc](https://img.shields.io/badge/godoc-reference-blue.svg)](https://godoc.org/github.com/dwisiswant0/galer/pkg/galer)
You can use **galer** as a library.
```
▶ go get github.com/dwisiswant0/galer/pkg/galer@latest
```

For example:
```go
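package main

import (
	"fmt"

	"github.com/dwisiswant0/galer/pkg/galer"
)

func main() {
	// Build a configuration and initialize galer with it.
	cfg := &galer.Config{
		Timeout: 60,
	}
	cfg = galer.New(cfg)

	// Crawl the target page and collect the extracted URLs.
	run, err := cfg.Crawl("https://twitter.com")
	if err != nil {
		panic(err)
	}

	// Print each URL found on the page.
	for _, url := range run {
		fmt.Println(url)
	}
}
```

## TODOs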
- [ ] Allow setting extra HTTP headers
- [ ] Provide a random User-Agent
- [ ] Bypass headless browser
- [ ] Add exception for specific extensions

## Help & Bugs
[![contributions welcome](https://img.shields.io/badge/contributions-welcome-blue.svg)](https://github.com/dwisiswant0/galer/issues)
If you are still confused or have found a bug, please [open an issue](https://github.com/dwisiswant0/galer/issues). All bug reports are appreciated; some features have not been tested yet due to lack of free time.
## Status
> [!CAUTION]
> galer has NOT reached 1.0 yet. Therefore, this library is currently not supported and does not offer a stable API; use at your own risk.

There are no guarantees of stability for the APIs in this library. While they are not expected to change dramatically, API tweaks and bug fixes may occur.
## Pronunciation
`id_ID` • **/gäˈlər/**: if it's _galer_, don't sniff it, just wash your hands right away, _dummy_!
## Acknowledgement
- [Omar Espino](https://twitter.com/omespino) for the idea, which is why this tool was made!
## License

`galer` is released by **@dwisiswant0** under the MIT license. See [LICENSE](/LICENSE).