{"id":28388251,"url":"https://github.com/HenryLok0/AnyDownload","last_synced_at":"2025-06-27T05:32:28.238Z","repository":{"id":295232154,"uuid":"989524205","full_name":"HenryLok0/AnyDownload","owner":"HenryLok0","description":"A powerful command-line tool to download an entire website—including HTML, images, CSS, JS, fonts, and media—into a local folder for offline browsing.","archived":false,"fork":false,"pushed_at":"2025-05-30T17:08:55.000Z","size":121,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-05-31T01:14:59.912Z","etag":null,"topics":["cli","download","html","html-download","https","nodejs","puppeteer","web-scraper","website","website-downloader"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HenryLok0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-24T09:18:17.000Z","updated_at":"2025-05-30T17:08:58.000Z","dependencies_parsed_at":"2025-05-31T01:15:03.790Z","dependency_job_id":"90b81927-db60-4895-a6f9-e18205533231","html_url":"https://github.com/HenryLok0/AnyDownload","commit_stats":null,"previous_names":["henrylok0/website-downloader","henrylok0/anydownload"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/HenryLok0/AnyDownload","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenryLok0%2FAnyDownload","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenryLok0%2FAnyDow
nload/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenryLok0%2FAnyDownload/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenryLok0%2FAnyDownload/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HenryLok0","download_url":"https://codeload.github.com/HenryLok0/AnyDownload/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenryLok0%2FAnyDownload/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262198091,"owners_count":23273790,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","download","html","html-download","https","nodejs","puppeteer","web-scraper","website","website-downloader"],"created_at":"2025-05-30T21:09:18.983Z","updated_at":"2025-06-27T05:32:28.223Z","avatar_url":"https://github.com/HenryLok0.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AnyDownload\n\n[![Code Size](https://img.shields.io/github/languages/code-size/HenryLok0/AnyDownload?style=flat-square\u0026logo=github)](https://github.com/HenryLok0/AnyDownload)\n[![npm version](https://img.shields.io/npm/v/anydownload?style=flat-square)](https://www.npmjs.com/package/anydownload)\n\n[![MIT License](https://img.shields.io/github/license/HenryLok0/AnyDownload?style=flat-square)](LICENSE)\n[![Stars](https://img.shields.io/github/stars/HenryLok0/AnyDownload?style=flat-square)](https://github.com/HenryLok0/AnyDownload/stargazers)\n\nA powerful and efficient website 
downloader that supports both `Puppeteer` and `Playwright` and lets you download entire websites with a single command. Perfect for offline browsing, archiving, or learning web development.\n\n---\n\n## Key Features\n\n- **High Performance**: Fast concurrent downloads and efficient resource management\n- **Dynamic Website Support**: Download modern JavaScript-heavy sites using Puppeteer or Playwright\n- **Comprehensive Resource Capture**: HTML, CSS, JS, images, fonts, media, and more\n- **User-Friendly Web GUI**: Configure and monitor downloads visually\n- **Recursive Download**: Configurable depth for linked pages\n- **Advanced Filtering**: Download only what you need\n- **Authentication**: Supports login flows (form-based)\n- **Resume, Proxy, Speed Limit, Sitemap, and More**\n\n---\n\n## Installation\n\n```bash\n# Using npm\nnpm install -g anydownload\n\n# Or clone the repository\ngit clone https://github.com/HenryLok0/AnyDownload\ncd AnyDownload\nnpm install\n```\n\n\u003e **Note:** If you want to use Playwright, you may need to install browser binaries:\n\u003e ```bash\n\u003e npx playwright install\n\u003e ```\n\n---\n\n## Docker\n\nYou can run AnyDownload easily with Docker.\n\n### 1. Build the Docker image\n\n```bash\ndocker build -t anydownload .\n```\n\n### 2. Run the Web GUI\n\n```bash\ndocker run -p 3000:3000 anydownload\n```\n\nThen visit [http://localhost:3000](http://localhost:3000) in your browser.\n\n### 3. Run CLI mode (with output folder mounted)\n\n```bash\ndocker run --rm -v $(pwd)/output:/app/output anydownload anydownload https://example.com -o output\n```\n\n### Dockerfile Example\n\n```dockerfile\nFROM node:20-alpine\n\nWORKDIR /app\n\nCOPY package*.json ./\nRUN npm install --production\n\nCOPY . 
.\n\nEXPOSE 3000\nCMD [\"node\", \"web-gui.js\"]\n```\n\n---\n\n## Basic Usage\n\n```bash\n# Download a website (default: Puppeteer)\nanydownload https://example.com\n\n# Use Playwright as the browser engine\nanydownload https://example.com --dynamic --browser playwright\n\n# Or run from the cloned repository\nnode bin/cli.js https://example.com --browser puppeteer\nnode bin/cli.js https://example.com --browser playwright\n```\n\n## Web Interface\n\nStart the web GUI for a visual download experience:\n\n```bash\nanydownload --gui\n# Or\nnode web-gui.js\n```\n\nThen visit [http://localhost:3000](http://localhost:3000) in your browser.\n\n---\n\n## Advanced Examples\n\n### Download Full Website (Including All Sitemap Pages)\n```bash\nanydownload https://example.com --browser playwright --dynamic --sitemap --recursive\n```\n\n### Download with Login\n```bash\nanydownload https://example.com --login-url https://example.com/login --login-form '{\"#username\": \"username\", \"#password\": \"password\"}' --login-credentials '{\"username\": \"user\", \"password\": \"pass\"}' --browser playwright\n```\n\n### Download with Custom Output\n```bash\nanydownload https://example.com --output mysite --browser puppeteer\n```\n\n### Download with Depth Control\n```bash\nanydownload https://example.com --recursive --max-depth 2 --browser playwright\n```\n\n### Download Specific Resources\n```bash\nanydownload https://example.com --type image --type css --browser puppeteer\n```\n\n### Dynamic Website Download\n```bash\nanydownload https://example.com --dynamic true --browser playwright\n```\n\n---\n\n### AnyDownload Supports Both Puppeteer and Playwright\n\nAnyDownload supports **both [Puppeteer](https://pptr.dev/)** and **[Playwright](https://playwright.dev/)** as browser engines for dynamic website rendering.  
\nYou can freely choose which engine to use with the `--browser` option.\n\n### What's the difference between Puppeteer and Playwright?\n\n| Feature                | Puppeteer                        | Playwright                              |\n|------------------------|----------------------------------|-----------------------------------------|\n| Supported Browsers     | Chromium (Chrome, Edge)          | Chromium, Firefox, WebKit (Safari)      |\n| Stealth/Evasion        | Good (with plugins)              | Good, often less detectable             |\n| Multi-browser Support  | Limited                          | Excellent (cross-browser)               |\n| API Similarity         | Industry standard                | Very similar, but more advanced options |\n| Stability              | Very stable                      | Very stable                             |\n| Use Case               | Most dynamic sites               | Sites that block Puppeteer, or need Safari/Firefox support |\n\n- **Puppeteer** is great for most dynamic websites and is widely used.\n- **Playwright** is recommended if you need to handle websites that block Puppeteer, require Firefox or Safari/WebKit rendering, or need more advanced browser automation features.\n\n**All features of AnyDownload are available in both modes!**\n\n---\n\n## Configuration Options\n\n| Option | Description | Default |\n|--------|-------------|---------|\n| `--output, -o` | Custom output folder | `downloaded_site` |\n| `--recursive, -r` | Download linked pages | `false` |\n| `--max-depth, -m` | Set recursion depth | `1` |\n| `--type` | Resource types to download | `all` |\n| `--dynamic` | Enable dynamic mode | `false` |\n| `--verbose` | Show detailed logs | `false` |\n| `--schedule` | Schedule automatic downloads | `none` |\n| `--browser` | Choose browser engine (`puppeteer` or `playwright`) | `puppeteer` |\n| `--concurrency` | Max concurrent downloads | `5` |\n| `--delay` | Delay between requests | `1000ms` |\n| 
`--retry` | Retry count for failed downloads | `3` |\n| `--proxy` | Use proxy server | `none` |\n| `--speed-limit` | Download speed limit | `0` |\n| `--resume` | Enable resume download | `false` |\n| `--sitemap` | Generate sitemap | `false` |\n| `--timeout` | Request timeout | `30000ms` |\n| `--max-file-size` | Maximum file size | `0` |\n| `--retry-delay` | Retry delay | `1000ms` |\n| `--validate-ssl` | SSL validation | `true` |\n| `--follow-redirects` | Follow redirects | `true` |\n| `--max-redirects` | Maximum redirects | `5` |\n| `--keep-original-urls` | Keep original URLs | `false` |\n| `--clean-urls` | Clean URLs | `false` |\n| `--ignore-errors` | Ignore errors | `false` |\n| `--parallel-limit` | Parallel download limit | `5` |\n| `--login-url` | Login page URL | `null` |\n| `--login-form` | Login form field mapping | `null` |\n| `--login-credentials` | Login credentials | `null` |\n\n---\n\n## FAQ\n\n### Q: Should I use Puppeteer or Playwright?\nA:  \n- Use **Puppeteer** for most dynamic websites (Chromium/Chrome-based).\n- Use **Playwright** if you need to download sites that block Puppeteer, require Firefox/Safari/WebKit, or want more stealth/cross-browser support.\n\n### Q: What is the easiest way to download an entire website (including all sitemap pages)?\nA: Use the command `anydownload https://example.com --browser playwright --dynamic --sitemap --recursive`\n\nIt will:\n- Read `sitemap_index.xml`\n- Parse all sub-sitemaps\n\n### Q: How to handle websites with login?\nA: Use the `--login-url`, `--login-form`, and `--login-credentials` options. Both Puppeteer and Playwright support login automation.\n\n### Q: Do I need to install browsers for Playwright?\nA: Yes, run `npx playwright install` after installing dependencies.\n\n### Q: Are all features available in both engines?\nA: Yes! All download, filtering, login, and automation features work with both Puppeteer and Playwright.\n\n---\n\n## Contributing\n\nWe welcome contributions! 
Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n### Contributors\n\n\u003ca href=\"https://github.com/HenryLok0/AnyDownload/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=HenryLok0/AnyDownload\" /\u003e\n\u003c/a\u003e\n\n## License\n\nMIT License - see [LICENSE](LICENSE) for details.\n\n## Support\n\n- GitHub Issues: [Open an issue](https://github.com/HenryLok0/AnyDownload/issues)\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=HenryLok0/AnyDownload\u0026type=Date)](https://star-history.com/#HenryLok0/AnyDownload\u0026Date)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHenryLok0%2FAnyDownload","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FHenryLok0%2FAnyDownload","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FHenryLok0%2FAnyDownload/lists"}