{"id":29054213,"url":"https://github.com/linuxuser255/sitecloner","last_synced_at":"2025-06-27T02:06:57.913Z","repository":{"id":298610788,"uuid":"1000538342","full_name":"LinuxUser255/SiteCloner","owner":"LinuxUser255","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-12T00:28:54.000Z","size":3,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-12T01:31:05.717Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LinuxUser255.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-12T00:20:20.000Z","updated_at":"2025-06-12T00:28:57.000Z","dependencies_parsed_at":"2025-06-12T01:32:49.544Z","dependency_job_id":"76e3f7c6-c656-4ab8-b93c-fdcf89d1e6de","html_url":"https://github.com/LinuxUser255/SiteCloner","commit_stats":null,"previous_names":["linuxuser255/sitecloner"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/LinuxUser255/SiteCloner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinuxUser255%2FSiteCloner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinuxUser255%2FSiteCloner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinuxUser255%2FSiteCloner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinuxUser255%2FSiteCloner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/owners/LinuxUser255","download_url":"https://codeload.github.com/LinuxUser255/SiteCloner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinuxUser255%2FSiteCloner/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262175243,"owners_count":23270422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-27T02:06:53.988Z","updated_at":"2025-06-27T02:06:57.899Z","avatar_url":"https://github.com/LinuxUser255.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SiteCloner\n\nA Python utility for creating local copies of websites from Burp Suite sitemap exports.\n\n## Overview\n\nSiteCloner is a tool that processes Burp Suite sitemap XML exports and creates a complete local copy of a website, preserving its structure and content. This is particularly useful for security researchers, penetration testers, and web developers who need offline access to website content.\n\n## Features\n\n- Creates a complete offline copy of websites from Burp Suite sitemap exports\n- Preserves the original website structure\n- Handles both text and binary content (images, PDFs, etc.)\n- Automatically creates necessary directories\n- Sanitizes URL paths to valid filesystem paths\n- Provides error handling and reporting\n\n## Requirements\n\n- Python 3.6+\n- A Burp Suite sitemap XML export file\n\n## Installation\n\nNo special installation is required. 
Simply clone or download this repository:\n\n```bash\ngit clone https://github.com/LinuxUser255/SiteCloner.git\ncd SiteCloner\n```\n\n## Usage\n\n1. Export a sitemap from Burp Suite (see instructions below)\n2. Place the exported XML file in the same directory as the script or update the `BURP_XML_FILE` variable\n3. Run the script:\n\n```bash\n# Make executable\nchmod +x clone-from-burp.py\n\n# run it\n./clone-from-burp.py\n\n# or the vscode way ;P\npython3 clone-from-burp.py\n```\n\n4. Find your cloned website in the `cloned_site` directory (or the directory specified in `OUTPUT_DIR`)\n\n## Configuration\n\nYou can modify these variables at the top of the script:\n- `BURP_XML_FILE`: Path to the Burp Suite sitemap XML file (default: 'burp_sitemap.xml')\n- `OUTPUT_DIR`: Directory where the cloned site will be saved (default: 'cloned_site')\n\n## What is a Burp Suite Sitemap XML?\n\nBurp Suite is a popular web application security testing tool. Its sitemap feature keeps track of all the URLs that Burp has encountered while browsing or scanning a website.\n\nThe sitemap XML export contains detailed information about each request and response, including:\n- The URL\n- The HTTP method (GET, POST, etc.)\n- Request headers and body\n- Response headers and body\n- MIME types\n- Status codes\n- And more\n\nThis XML file serves as a comprehensive record of a website's structure and content, making it an ideal source for creating offline copies.\n\n## How to Export a Sitemap from Burp Suite\n\n1. Open Burp Suite and browse the target website (manually or using the Spider tool)\n2. Go to the \"Target\" tab and select the \"Site map\" sub-tab\n3. Right-click on the host you want to export\n4. Select \"Save selected items\" from the context menu\n5. Choose \"Base64-encode requests and responses\" if you want to preserve binary content\n6. Save the file as \"burp_sitemap.xml\" in your project directory\n\n### How the Script Works\n\n_The `clone-from-burp.py` script:_\n\n1. 
Parses the Burp Suite sitemap XML file\n2. For each item in the sitemap:\n   - Extracts the URL and response content\n   - Determines if the response is base64-encoded\n   - Decodes the content appropriately\n   - Creates a suitable file path based on the URL\n   - Saves the content to the appropriate location\n3. Provides a summary of the operation when complete\n\n### Example Output Structure\n\nFor a website like example.com with various pages, the output structure might look like:\n\n```\ncloned_site/\n└── example.com/\n    ├── index.html\n    ├── about.html\n    ├── contact.html\n    ├── images/\n    │   ├── logo.png\n    │   └── banner.jpg\n    ├── css/\n    │   └── style.css\n    └── js/\n        └── script.js\n```\n\n**Limitations**\n\n- The script can only clone pages that were captured in the Burp Suite sitemap\n- Dynamic content that requires JavaScript execution won't function in the offline copy\n- Some relative links might not work correctly in the offline version\n- The script doesn't modify links to point to local resources\n\n\u003cbr\u003e\n\n**Contributing**\n\n\u003cbr\u003e\n\n**Contributions are welcome!**\n\n\u003cbr\u003e\n\n**Please feel free to submit a Pull Request.**\n\n\u003cbr\u003e\n\n**License GPL3**\n\n\u003cbr\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinuxuser255%2Fsitecloner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinuxuser255%2Fsitecloner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinuxuser255%2Fsitecloner/lists"}