https://github.com/ninjhacks/unja
Fetch & Filter Known URLs
- Host: GitHub
- URL: https://github.com/ninjhacks/unja
- Owner: ninjhacks
- License: gpl-3.0
- Created: 2022-01-04T18:59:07.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-03T06:26:48.000Z (over 2 years ago)
- Last Synced: 2024-07-23T16:11:15.565Z (4 months ago)
- Language: Python
- Homepage:
- Size: 33.2 KB
- Stars: 14
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Offensive-OSINT-Tools - Unja
README
Unja
Fetch Known Urls
### What's Unja?
Unja is a fast, lightweight tool for fetching known URLs from the Wayback Machine, Common Crawl, VirusTotal, UrlScan.io, and AlienVault's OTX. It runs a separate thread for each provider to speed things up, and it uses the Wayback resumption key to split a large scan into multiple parts. Filters are applied directly on the provider APIs, so only filtered data comes back and your system has less work to do.
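
The thread-per-provider idea can be pictured with a minimal Python sketch. This is not Unja's actual code: `fetch_all` and the two stub fetchers are hypothetical names that stand in for real provider clients, and the stubs return canned data instead of calling any API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(domain, providers):
    """Run each provider fetcher in its own thread and merge the URLs.

    `providers` maps a provider name to a callable taking a domain and
    returning an iterable of URLs (hypothetical interface, for illustration).
    """
    if not providers:
        return []
    # One worker per provider, so every fetcher runs simultaneously.
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        futures = {name: pool.submit(fetch, domain)
                   for name, fetch in providers.items()}
    urls = set()
    for name, future in futures.items():
        try:
            urls.update(future.result())
        except Exception as exc:
            # A failing provider should not discard the others' results.
            print(f"{name} failed: {exc}")
    return sorted(urls)


# Stub fetchers returning canned data (assumptions, not real API clients).
def fake_wayback(domain):
    return [f"https://{domain}/", f"https://{domain}/login"]


def fake_otx(domain):
    return [f"https://{domain}/login", f"https://{domain}/api?id=1"]


if __name__ == "__main__":
    for url in fetch_all("example.com", {"wayback": fake_wayback, "otx": fake_otx}):
        print(url)
```

Merging into a set removes URLs reported by more than one provider, which is why the stubs deliberately overlap on `/login`.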
### Why Unja?
- Supports `Wayback/Common-Crawl/Virus-Total/Otx/UrlScan.io`
- Automatically handles rate limits and timeouts
- Export results: text, or detailed output with status, mime, and length in JSON
- MultiThreading: separate thread for each provider to fetch data simultaneously
- Filters: apply filters directly on the provider to avoid unnecessary data

### Installing Unja
You can install `Unja` with pip as follows:
```
pip3 install unja
```

or by downloading this repository and running:
```
python3 setup.py install
```

### Updating Unja
You can update `Unja` with pip as follows:
```
pip3 install unja -U
```

## Usage
```sh
unja -h
```

This will display help for the tool.

| Flag | Description | Example |
| :---------------: | :---------------------------------------------------: | :---------------------------------------------: |
| -d | Domain | unja -d ninjhacks.com |
| -f | File with a list of domains separated by newlines | unja -f domains.txt |
| --sub | Include subdomains | unja --sub |
| -p | Providers (wayback,commoncrawl,otx,virustotal,urlscan) | unja -p wayback |
| --wbf | Wayback filters (default : statuscode:200 ~mimetype:html) | unja --wbf statuscode:200 |
| --ccf | Common Crawl filters (default : =status:200 ~mime:.*html) | unja --ccf =status:200 |
| --wbl | Wayback results per request (default : 10000) | unja --wbl 1000 |
| --otxl | Otx results per request (default : 500) | unja --otxl 500 |
| -r | Number of retries for the HTTP client (default : 3) | unja -r 3 |
| -v | Enable verbose mode to show errors | unja -v |
| -j | Enable JSON mode for detailed output in JSON format | unja -j |
| -s | Silent mode: don't print the header | unja -s |
| --ucci | Update CommonCrawl Index | unja --ucci |
| --vtkey | Change the VirusTotal API key in config | unja --vtkey |
| --uskey | Change the UrlScan API key in config | unja --uskey |

## Output Methods
- text (default): outputs URLs only.
- json (`-j`): outputs url, status, mime, and length in JSON format, which helps when you later filter results based on those fields.
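
As a rough illustration of filtering JSON output after the fact, here is a minimal Python sketch. It rests on assumptions the README does not confirm: that each output line is one JSON object, and that the keys are named `url`, `status`, `mime`, and `length`. The function name `filter_records` is hypothetical; adjust the keys to whatever `unja -j` actually prints.

```python
import json


def filter_records(lines, status=None, mime_contains=None):
    """Yield URLs from JSON lines matching a status code and a mime substring.

    Assumes one JSON object per line with "url", "status" and "mime" keys
    (an assumption about the output layout, not a documented format).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        # Compare as strings so "200" and 200 are treated the same.
        if status is not None and str(record.get("status")) != str(status):
            continue
        if mime_contains and mime_contains not in str(record.get("mime", "")):
            continue
        yield record.get("url")


if __name__ == "__main__":
    sample = [
        '{"url": "https://target.com/", "status": "200", "mime": "text/html", "length": "512"}',
        '{"url": "https://target.com/a.png", "status": "200", "mime": "image/png", "length": "90"}',
        '{"url": "https://target.com/old", "status": "404", "mime": "text/html", "length": "10"}',
    ]
    for url in filter_records(sample, status=200, mime_contains="html"):
        print(url)
```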
## Filters
Filters are applied directly on the providers, so only useful, filtered data is returned.

| Wayback | Commoncrawl | Description |
| :---------------: | :---------------: | :-----------------------------------------------------------: |
| statuscode:200 | =status:200 | return only URLs whose status code is 200 |
| !statuscode:200 | !=status:200 | return only URLs whose status code is not 200 |
| mimetype:text/html | mime:text/html | return only URLs whose response type is text/html |
| !mimetype:text/html | !=mime:text/html | return only URLs whose response type is not text/html |
| ~mimetype:html | ~mime:.*html | return all URLs whose response type contains "html" |
| ~original:unja | ~url:.*unja | return all URLs that contain "unja" |

## Oneliners
Get only urls with parameters & status code 200
```
unja -s -d target.com --sub -p wayback,commoncrawl --wbf 'statuscode:200 ~original:=' --ccf '=status:200 ~url:.*=' | anew | tee output
```

Looking for open redirects
```
unja -s -d target.com --sub -p wayback,commoncrawl --wbf '~statuscode:30 ~original:=http' --ccf '~status:30 ~url:.*=http' | anew | tee output
```
Clean results (exclude images, CSS, JavaScript, WOFF & 404)
```
unja -s -d target.com --sub -p wayback,commoncrawl --wbf '!statuscode:404 ~!mimetype:image ~!mimetype:javascript ~!mimetype:css ~!mimetype:woff' --ccf '!=status:404 !~mime:.*image !~mime:.*javascript !~mime:.*css !~mime:.*woff' | anew | tee output
```

Let me know if you have any other good oneliners.
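
The filter strings in the one-liners above are space-separated, and the Common Crawl column uses the same operators as the index server's `filter` parameter. One plausible way such a string could reach the provider is to send each token as its own `filter` query parameter. The Python sketch below illustrates that idea only; it is not taken from Unja's source. `build_commoncrawl_query` is a hypothetical helper, and the index name in `CC_INDEX_URL` is just an example collection.

```python
from urllib.parse import urlencode

# Example index collection, chosen only for illustration (assumption).
CC_INDEX_URL = "https://index.commoncrawl.org/CC-MAIN-2024-10-index"


def build_commoncrawl_query(domain, filters, include_subdomains=False):
    """Build an index query URL with one `filter` parameter per token.

    `filters` is a space-separated string such as "=status:200 ~url:.*=".
    """
    # "*.domain" matches subdomains; "domain/*" matches paths under the domain.
    pattern = f"*.{domain}" if include_subdomains else f"{domain}/*"
    params = [("url", pattern), ("output", "json")]
    # Repeating the key keeps every filter as a separate condition.
    params += [("filter", token) for token in filters.split()]
    # urlencode percent-encodes characters like "=" and ":" inside the values.
    return f"{CC_INDEX_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_commoncrawl_query("target.com", "=status:200 ~url:.*=",
                                  include_subdomains=True))
```

Because the filtering happens on the server, only matching records are downloaded, which is the point of applying filters at the provider instead of locally.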