{"id":13642238,"url":"https://github.com/ksharinarayanan/SourceWolf","last_synced_at":"2025-04-20T16:30:55.101Z","repository":{"id":57469467,"uuid":"282457989","full_name":"ksharinarayanan/SourceWolf","owner":"ksharinarayanan","description":"Amazingly fast response crawler to find juicy stuff in the source code! 😎🔥","archived":false,"fork":false,"pushed_at":"2023-09-18T08:55:47.000Z","size":360,"stargazers_count":145,"open_issues_count":2,"forks_count":46,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-10-29T16:58:23.522Z","etag":null,"topics":["automation","broken-link-hijacking","bugbounty","fuzzing","osint","reconnaissance","wordlist"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ksharinarayanan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-07-25T14:16:26.000Z","updated_at":"2024-10-08T14:49:02.000Z","dependencies_parsed_at":"2024-01-14T10:18:29.088Z","dependency_job_id":"0e9626fd-6628-4bd5-a41c-d8150e69c492","html_url":"https://github.com/ksharinarayanan/SourceWolf","commit_stats":null,"previous_names":["micha3lb3n/sourcewolf"],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ksharinarayanan%2FSourceWolf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ksharinarayanan%2FSourceWolf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ksharinarayanan%2FSourceWolf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
ksharinarayanan%2FSourceWolf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ksharinarayanan","download_url":"https://codeload.github.com/ksharinarayanan/SourceWolf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223832874,"owners_count":17210735,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","broken-link-hijacking","bugbounty","fuzzing","osint","reconnaissance","wordlist"],"created_at":"2024-08-02T01:01:28.882Z","updated_at":"2024-11-09T13:30:49.694Z","avatar_url":"https://github.com/ksharinarayanan.png","language":"Python","funding_links":[],"categories":["[](#table-of-contents) Table of contents","Python"],"sub_categories":["[](#website-analyze)Website analyze"],"readme":"\u003cp align=\"center\"\u003e\r\n  \u003cimg src=\"https://github.com/ksharinarayanan/SourceWolf/blob/master/images/logo.png\" width=\"130px\" height=\"130px\"\u003e\r\n  \u003cbr\u003e\r\n  \u003cbr\u003e\r\n  \u003ch1 align=\"center\"\u003eSourceWolf\u003c/h1\u003e\r\n\u003c/p\u003e\r\n\r\n\u003cp align=\"center\"\u003e\r\n  \u003cimg src=\"https://img.shields.io/badge/Release-v1.7-brightgreen\"\u003e\r\n  \u003cimg src=\"https://img.shields.io/github/issues-closed/ksharinarayanan/SourceWolf\"\u003e\r\n  \u003cimg src=\"https://img.shields.io/github/issues-pr-closed/ksharinarayanan/SourceWolf\"\u003e\r\n  \u003cimg src=\"https://img.shields.io/github/contributors/ksharinarayanan/SourceWolf\"\u003e\r\n\u003c/p\u003e\r\n\r\n​\r\n\r\n**Tested environments:** Windows, MAC, linux, and windows 
subsystem for linux (WSL)\r\n\r\n## Sponsors\r\n\r\nSupport this project by \u003ca href=\"mailto:ksharinarayanan36@gmail.com\"\u003ebecoming\u003c/a\u003e a sponsor. Check out these awesome sponsors:\r\n\r\nOxylabs provides premium proxies with 100M+ IPs across 195 countries and ready-to-use Scraper APIs. You can use this unique promo code **boost35** to get 35% off Residential, Mobile proxies, Web Unblocker, and Scraper APIs.\r\n\r\n\u003ca href=\"https://oxylabs.go2cloud.org/aff_c?offer_id=7\u0026aff_id=997\u0026url_id=120\"\u003e\r\n\u003cimg src=\"https://raw.githubusercontent.com/ksharinarayanan/SourceWolf/master/images/oxylabs.jpeg\" width=\"400\" height=\"200\" /\u003e\r\n\u003c/a\u003e\r\n\r\n\r\n## Sections\r\n\r\n-   \u003ca href=\"#features\"\u003eFeatures\u003c/a\u003e\r\n-   \u003ca href=\"#installation\"\u003eInstallation\u003c/a\u003e\r\n-   \u003ca href=\"#usage\"\u003eUsage\u003c/a\u003e\r\n-   \u003ca href=\"#workflow\"\u003eHow can this be integrated into your workflow?\u003c/a\u003e\r\n-   \u003ca href=\"#todo\"\u003eTo do\u003c/a\u003e\r\n-   \u003ca href=\"#update\"\u003eUpdating SourceWolf\u003c/a\u003e\r\n-   \u003ca href=\"#contributions\"\u003eContributions\u003c/a\u003e\r\n-   \u003ca href=\"#issues\"\u003eIssues\u003c/a\u003e\r\n-   \u003ca href=\"#naming\"\u003eFile naming conventions\u003c/a\u003e\r\n\r\n\u003cdiv id=\"features\"\u003e\r\n\u003ch3\u003e What can SourceWolf do?\u003c/h3\u003e\r\n\r\n-   Crawl through responses to find hidden endpoints, either by sending requests, or from the local response files (if any).\r\n\r\n-   Create a list of JavaScript variables found in the source\r\n\r\n-   Extract all the social media links from the websites to identify potentially broken links\r\n\r\n-   Brute force a host using a wordlist.\r\n\r\n-   Get the status codes for a list of URLs / filter out the live domains from a list of hosts.\r\n\r\nAll the features mentioned above execute with great speed.\r\n\r\n-   SourceWolf uses the 
**Session** object from the requests library, which reuses the underlying TCP connection across requests, making it really fast.\r\n\r\n-   SourceWolf provides you with an option to crawl the response files **locally** so that you aren't sending requests again to an endpoint whose response you already have a copy of.\r\n\r\n-   The final endpoints are in a complete form with a host, like `https://example.com/api/admin`, rather than as `/api/admin`. This is useful when you are scanning a list of hosts.\r\n\r\n\u003c/div\u003e\r\n\u003chr\u003e\r\n\r\n### Installation\r\n\r\n-   git clone https://github.com/ksharinarayanan/SourceWolf (or) Download the latest \u003ca href=\"https://github.com/ksharinarayanan/SourceWolf/releases\"\u003erelease\u003c/a\u003e!\r\n-   cd SourceWolf/\r\n-   pip3 install -r requirements.txt\r\n\r\n\u003chr\u003e\r\n\r\n### Usage\r\n\r\n```\r\n\u003e python3 sourcewolf.py -h\r\n\r\n-l LIST, --list LIST  List of javascript URLs\r\n-u URL, --url URL     Single URL\r\n-t THREADS, --threads THREADS\r\n                      Number of concurrent threads to use (default 5)\r\n-o OUTPUT_DIR, --output directory-name OUTPUT_DIR\r\n                      Store URL response text in a directory for further analysis\r\n-s STATUS_CODE_FILE, --store-status-code STATUS_CODE_FILE\r\n                      Store the status code in a file\r\n-b BRUTE, --brute BRUTE\r\n                      Brute force URL with FUZZ keyword (--wordlist must also be used along with this)\r\n-w WORDLIST, --wordlist WORDLIST\r\n                      Wordlist for brute forcing URL\r\n-v, --verbose         Verbose mode (displays all the requests that are being sent)\r\n-c CRAWL_OUTPUT, --crawl-output CRAWL_OUTPUT\r\n                      Output directory to store the crawled output\r\n-d DELAY, --delay DELAY\r\n                      Delay in the requests (in seconds)\r\n--timeout TIMEOUT     Maximum time to wait for connection timing out (in seconds)\r\n--headers HEADERS     Add custom headers (Must 
be passed in as {'Token': 'YOUR-TOKEN-HERE'}) --\u003e Dictionary format\r\n--cookies COOKIES     Add cookies (Must be passed in as {'Cookie': 'YOUR-COOKIE-HERE'}) --\u003e Dictionary format\r\n--only-success        Only print 2XX responses\r\n--local LOCAL         Directory with local response files to crawl for\r\n--no-colors           Remove colors from the output\r\n--update-info         Check for the latest version, and update if required\r\n```\r\n\r\n\u003cdiv id=\"test\"\u003eSourceWolf has \u003cb\u003e3 modes\u003c/b\u003e, which correspond to its \u003cb\u003e3 core features\u003c/b\u003e.\u003c/div\u003e\r\n\r\n-   #### Crawl response mode:\r\n\r\n![](https://github.com/ksharinarayanan/SourceWolf/blob/master/images/crawl.JPG)\r\n\r\nComplete usage:\r\n\r\n```\r\n  python3 sourcewolf.py -l domains -o output/ -c crawl_output\r\n```\r\n\r\n`domains` is the file listing the URLs you want to crawl, in the format:\r\n\r\n```\r\nhttps://example.com/\r\nhttps://exisiting.example.com/\r\nhttps://exisiting.example.com/dashboard\r\nhttps://example.com/hitme\r\n```\r\n\r\n`output/` is the directory where the response text files of the input file are stored. \u003cbr/\u003e\r\n\r\n\u003e They are stored in the format output/2XX, output/3XX, output/4XX, and output/5XX. 
\u003cbr\u003e\r\n\u003e output/2XX stores 2XX status code responses, and so on!\r\n\r\n\u003cbr\u003e\r\n\r\n`crawl_output`, specified using the `-c` flag, is the directory where SourceWolf stores the results it produces by crawling the HTTP response files inside the `output/` directory (currently only endpoints)\r\n\r\nThe ```crawl_output/``` directory contains:\r\n\r\nendpoints - All the endpoints found\r\n\u003cbr\u003e\r\njsvars - All the javascript variables\r\n\r\n\u003e The directory will have more files, as more modules, and features are integrated into SourceWolf.\r\n\r\n\u003cbr\u003e\r\n\r\n\u003cp align=\"center\"\u003e\u003cb\u003e(OR)\u003c/b\u003e\u003c/p\u003e\r\n\r\nFor a single URL, \u003cbr\u003e\r\n\r\n```\r\n  python3 sourcewolf.py -u example.com/api/endpoint -o output/ -c crawl_output\r\n```\r\n\r\nOnly the flag `-l` is replaced by `-u`, everything else remains the same.\r\n\r\n\u003cbr\u003e\r\n\r\n-   #### Brute force mode\r\n\r\n![](https://github.com/ksharinarayanan/SourceWolf/blob/master/images/brute.JPG)\r\n\r\n```\r\npython3 sourcewolf.py -b https://hackerone.com/FUZZ -w /path/to/wordlist -s status\r\n```\r\n\r\nThe `-w` flag is optional. If it is not specified, SourceWolf uses a default wordlist with 6124 words\r\n\r\nSourceWolf replaces the `FUZZ` keyword in the `-b` value with the words from the wordlist, and sends the requests. This enables you to brute force GET parameter values as well.\r\n\r\n`-s` will store the output in a file called `status`\r\n\r\n-   #### Probing mode\r\n\r\n\u003e Screenshot not included as the output looks similar to `crawl response` mode.\r\n\r\n```\r\npython3 sourcewolf.py -l domains -s live\r\n```\r\n\r\nThe `domains` file can have anything like subdomains, endpoints, js files.\r\n\u003cbr\u003e\r\n\r\nThe `-s` flag writes the response to the `live` file.\r\n\r\n\u003e Both the brute force and probing modes print all the status codes except 404 by default. 
You can customize this behavior to print only `2XX` responses by using the flag `--only-success`\r\n\r\nSourceWolf also makes use of multithreading.\r\n\u003cbr\u003e\r\nThe default number of threads for all modes is 5. You can increase the number of threads using the `-t` flag.\r\n\r\nIn addition to the above three modes, there is an option to crawl response files locally, provided you have them saved locally and they follow \u003ca href=\"#naming\"\u003eSourceWolf-compatible naming conventions.\u003c/a\u003e\r\n\r\nStore all the responses in a directory, say `responses/`\r\n\r\n```\r\npython3 sourcewolf.py --local responses/\r\n```\r\n\r\nThis will crawl the local directory, and give you the results.\r\n\r\n\u003chr\u003e\r\n\r\n### How can this be integrated into your workflow?\r\n\r\n\u003cbr\u003e\r\n\u003cp align=\"center\"\u003e\r\n  Subdomain enumeration \u003cbr\u003e\r\n  \u003cb\u003e| \u003cbr\u003e\r\n  | \u003cbr\u003e\u003c/b\u003e\r\n  SourceWolf \u003cbr\u003e\r\n  \u003cb\u003e| \u003cbr\u003e\r\n  | \u003cbr\u003e\u003c/b\u003e\r\n  Filter out live subdomains \u003cbr\u003e\r\n  \u003cb\u003e| \u003cbr\u003e\r\n  | \u003cbr\u003e\u003c/b\u003e\r\n  Store responses and find hidden endpoints / Directory brute forcing \u003cbr\u003e\r\n\u003c/p\u003e\r\n\r\nAt this point, you will have a lot of endpoints from the target, extracted in real time from the web pages at the time of performing the scan.\r\n\r\n\u003chr\u003e\r\n\r\nSourceWolf was built with a broader vision: to crawl through responses not just to discover hidden endpoints, but also to automate the tasks that are otherwise done by manually searching through the response files.\r\n\r\n\u003e One such example would be manually searching for any leaked keys in the source.\r\n\r\nThis core purpose explains the modular way in which the files are written.\r\n\r\n\u003cdiv id=\"#todo\"\u003e\r\n\r\n## To do\r\n\r\n-   Generate a custom wordlist for a target from the words obtained in the source.\r\n-   Automate 
finding any leaked keys.\r\n\r\n\u003c/div\u003e\r\n\r\n\u003cdiv id=\"#update\"\u003e\r\n\r\n### Updates\r\n\r\nIt is possible to update SourceWolf right from the terminal, without having to clone the repository again.\r\n\u003cbr\u003e\r\nSourceWolf checks for updates every time it runs, and notifies the user if there are any updates available along with a summary of them.\r\n\u003cbr\u003e\r\n![](https://github.com/ksharinarayanan/SourceWolf/blob/master/images/update.JPG)\r\n\r\nRunning\r\n\r\n```\r\npython3 sourcewolf.py --update-info\r\n```\r\n\r\nprovides more details on the update\r\n\u003cbr\u003e\r\n![](https://github.com/ksharinarayanan/SourceWolf/blob/master/images/update-info.JPG)\r\n\r\nWhen there are updates available, you must move the update.py file outside of the SourceWolf directory, and run\r\n\u003cbr\u003e\r\n**Warning: This deletes all the files and folders inside your SourceWolf directory**\r\n\r\n```\r\npython3 update.py /path/to/SourceWolf\r\n```\r\n\r\nThis removes the directory and clones the repo again.\r\n\r\n\u003c/div\u003e\r\n\r\n\u003cdiv id=\"#contributions\"\u003e\r\n\r\n### Contributions\r\n\r\nCurrently, SourceWolf supports only finding hidden endpoints from the source, but you can expect other features to be integrated in the future.\r\n\r\n\u003c/div\u003e\r\n\r\n**Where can you contribute?**\r\n\u003cbr\u003e\r\nContributions are mainly required for integrating more modules with SourceWolf, though feel free to open a PR even if it's just to fix a typo.\r\n\r\n\u003e Before sending a pull request, ensure that you are on the latest version. \u003cbr\u003e \u003e **Open an issue first if you are going to add a new feature to confirm if it's required! You should not waste time coding a new feature which is not required.**\r\n\r\n\u003c/div\u003e\r\n\r\n\u003cdiv id=\"#issues\"\u003e\r\n\r\n### Issues\r\n\r\nFeel free to [open](https://github.com/ksharinarayanan/SourceWolf/issues/new) any issues you face. 
\u003cbr\u003e\r\nEnsure that you include your operating system, the command that was run, and screenshots if possible when opening an issue, which makes it easier for me to reproduce the issue.\r\n\u003cbr\u003e\r\nYou can also request new features, or enhancements to existing features, by opening an issue.\r\n\r\n\u003c/div\u003e\r\n\r\n\u003cdiv id=\"naming\"\u003e\r\n\r\n### Naming conventions\r\n\r\nTo crawl the files locally, you must follow some naming conventions. These conventions are in place for SourceWolf to directly identify the host name, and thereby parse all the endpoints, including the relative ones.\r\n\r\nConsider a URL `https://example.com/api/`\r\n\r\n-   Remove the protocol and the trailing slash (if any) from the URL --\u003e `example.com/api`\r\n-   Replace '/' with '@' --\u003e `example.com@api`\r\n-   Save the response as a txt file with the file name obtained above.\r\n\r\nSo the file finally looks like `example.com@api.txt`\r\n\r\n\u003c/div\u003e\r\n\r\n### Credits\r\n\r\nLogo designed by \u003ca href=\"https://instagram.com/murugan_artworks\"\u003eMurugan artworks\u003c/a\u003e\r\n\r\n### License\r\n\r\nSourceWolf uses the \u003ca href=\"https://github.com/ksharinarayanan/SourceWolf/blob/master/LICENSE\"\u003eMIT license\u003c/a\u003e\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fksharinarayanan%2FSourceWolf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fksharinarayanan%2FSourceWolf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fksharinarayanan%2FSourceWolf/lists"}