https://github.com/Cillian-Collins/subscraper

Reconnaissance tool that scans JavaScript files for subdomains, then iterates over the JavaScript files hosted on each discovered subdomain, to enumerate the subdomains for a given URL.

# SUBSCRAPER


## Features

* Scans a domain and identifies all subdomains referenced in its JavaScript files.
* Scans each discovered subdomain and identifies further subdomains in the JavaScript files it hosts.
* Continues until no new subdomains are identified.
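
The loop described by these features can be sketched roughly as follows. This is a minimal illustration, not the code in `subscraper.py`: the names `extract_subdomains`, `crawl`, and `fetch_scripts` are hypothetical, and the regular expression is a simplification that assumes subdomains appear as plain hostnames in the script text.

```python
import re
from collections import deque
from typing import Callable, Iterable, Set


def extract_subdomains(text: str, root: str) -> Set[str]:
    """Return every hostname in `text` of the form <label>.<root>.

    Hypothetical helper: a simple regex match, not the tool's real parser.
    """
    pattern = re.compile(r"(?:[a-z0-9-]+\.)+" + re.escape(root), re.IGNORECASE)
    return {match.lower() for match in pattern.findall(text)}


def crawl(root: str, fetch_scripts: Callable[[str], Iterable[str]]) -> Set[str]:
    """Scan hosts breadth-first until no new subdomains turn up.

    `fetch_scripts(host)` is an assumed callback that yields the text of
    each JavaScript file found on `host`.
    """
    seen: Set[str] = set()
    queue = deque([root])
    while queue:
        host = queue.popleft()
        for script_body in fetch_scripts(host):
            for sub in extract_subdomains(script_body, root):
                if sub not in seen:
                    # A new subdomain: record it and scan it later.
                    seen.add(sub)
                    queue.append(sub)
    return seen
```

Passing the fetcher in as a callback keeps the sketch free of network code, so the stopping condition (an empty queue) can be exercised with canned script bodies.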

## Install

To install, clone this repository, open a command line in the cloned directory, and run the command below.

```
pip install -r requirements.txt
```

## Parameters

```
Syntax:
$ python subscraper.py -u youtube.com -o output.txt
$ python subscraper.py -u youtube.com -v
$ python subscraper.py -u youtube.com -o output.txt -v

Options:
-h, --help  show this help message and exit
-u          URL of the website to scan.
-o          Output file (for results).
-v          Enables verbosity.
```
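
For readers who want to see how options like these map onto Python's standard `argparse` module, here is a minimal sketch. It is an assumption about how such a parser could be written, not a copy of the one in `subscraper.py`: the function name `build_parser`, the `dest` names, and the choice to make `-u` required are all illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented -u, -o and -v options."""
    parser = argparse.ArgumentParser(prog="subscraper.py")
    # Assumption: -u is mandatory, since every documented example passes it.
    parser.add_argument("-u", dest="url", required=True,
                        help="URL of the website to scan.")
    parser.add_argument("-o", dest="output", default=None,
                        help="Output file (for results).")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="Enables verbosity")
    return parser
```

With this sketch, `build_parser().parse_args(["-u", "youtube.com", "-v"])` yields a namespace whose `url` is `"youtube.com"`, whose `output` is `None`, and whose `verbose` is `True`. The `-h, --help` option is added by `argparse` automatically.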

## Contributions

There's a lot of work left to do here, particularly around whitelisting which JavaScript files we scan and which we ignore. Generally speaking, for a domain ``youtube.com`` we should check any files referenced by relative paths, such as ``script.js`` and ``/scripts/script.js``. We should also include all JavaScript files hosted on ``*.youtube.com`` and, if possible, add those subdomains to our output when they are not already listed. This often happens when CDNs host the JavaScript files, so it's important not to miss anything.
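
One possible shape for that whitelist check is sketched below. It is a suggestion only and does not describe current behaviour: `in_scope` is a hypothetical helper that treats relative paths and anything hosted on the root domain or its subdomains as scannable, and ignores everything else.

```python
from urllib.parse import urlparse


def in_scope(script_src: str, root: str) -> bool:
    """Decide whether a script reference should be scanned for `root`."""
    parsed = urlparse(script_src)
    # Relative references such as script.js or /scripts/script.js
    # carry neither a scheme nor a host.
    if not parsed.scheme and not parsed.netloc:
        return True
    # Skip non-HTTP schemes (data:, blob:, ...) and hostless URLs.
    if parsed.scheme not in ("", "http", "https") or parsed.hostname is None:
        return False
    host = parsed.hostname.rstrip(".")
    root = root.lower()
    # Accept the root itself or any host under *.root.
    return host == root or host.endswith("." + root)
```

For example, `in_scope("//cdn.youtube.com/app.js", "youtube.com")` is true while `in_scope("https://notyoutube.com/app.js", "youtube.com")` is false, because the suffix comparison includes the leading dot.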