{"id":13845537,"url":"https://github.com/robotshell/robotScraper","last_synced_at":"2025-07-12T02:31:51.173Z","repository":{"id":199442830,"uuid":"377506337","full_name":"robotshell/robotScraper","owner":"robotshell","description":"RobotScraper is a simple tool written in Python to check each of the paths found in the robots.txt file and what HTTP response code they return.","archived":false,"fork":false,"pushed_at":"2024-07-23T06:56:50.000Z","size":383,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-11-15T07:37:22.775Z","etag":null,"topics":["bounty-hunting-tools","bugbounty","hacking","infosec","python","robots","scraper","tool"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/robotshell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-06-16T13:30:48.000Z","updated_at":"2024-10-24T21:07:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"8d6c4e1e-4d9b-46f2-bbfd-9f4d2efa90b3","html_url":"https://github.com/robotshell/robotScraper","commit_stats":null,"previous_names":["robotshell/robotscraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robotshell%2FrobotScraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robotshell%2FrobotScraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robotshell%2FrobotScraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robotshell%2Frobot
Scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/robotshell","download_url":"https://codeload.github.com/robotshell/robotScraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225784750,"owners_count":17523702,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bounty-hunting-tools","bugbounty","hacking","infosec","python","robots","scraper","tool"],"created_at":"2024-08-04T17:03:27.741Z","updated_at":"2024-11-21T18:32:15.026Z","avatar_url":"https://github.com/robotshell.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n  \u003cbr\u003e\n  \u003ca href=\"https://github.com/robotshell/robotScraper\"\u003e\u003cimg src=\"https://i.ibb.co/41MDdWD/robotscraper.png\" alt=\"robotScraper\" style=\"width:100%\"\u003e\u003c/a\u003e\n\u003c/h1\u003e\n\n## Description\n\nRobotScraper is an open-source tool designed to scrape and analyze the `robots.txt` file of a specified domain. This Python script helps in identifying directories and pages that are allowed or disallowed by the `robots.txt` file and can save the results if needed. It is useful for web security researchers, SEO analysts, and anyone interested in examining the structure and access rules of a website.\n\n## Requirements\n\n- Python 3.x\n- `requests` package\n- `beautifulsoup4` package\n\n## Installation\n\n1. Clone the repository:\n    ```sh\n    git clone https://github.com/robotshell/robotScraper\n    cd robotScraper\n    ```\n\n2. 
Install the required Python packages:\n    ```sh\n    pip install requests beautifulsoup4\n    ```\n\n## Usage\n\nTo run RobotScraper, use the following command syntax:\n\n```sh\npython robotScraper.py domain [-s output.txt]\n```\n\n## Disclaimer\n\nThis tool is intended for educational and research purposes only. The author and contributors are not responsible for any misuse of this tool. Use it responsibly and only on systems for which you have explicit permission. Unauthorized access to systems, networks, or data is illegal and unethical. Always obtain proper authorization before conducting any activity that could impact other users or systems.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobotshell%2FrobotScraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobotshell%2FrobotScraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobotshell%2FrobotScraper/lists"}