{"id":25965289,"url":"https://github.com/bac0id/wayback-machine-auto-save","last_synced_at":"2026-05-28T01:32:00.898Z","repository":{"id":272719713,"uuid":"917539359","full_name":"bac0id/wayback-machine-auto-save","owner":"bac0id","description":"A crawler to save web pages on list to Save Page Now of Internet Archive's Wayback Machine.","archived":false,"fork":false,"pushed_at":"2025-02-20T13:35:26.000Z","size":29,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-04T21:39:54.703Z","etag":null,"topics":["crawler","internet-archive","python","save-page-now","wayback-machine"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bac0id.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-16T07:18:46.000Z","updated_at":"2025-02-20T13:35:29.000Z","dependencies_parsed_at":"2025-01-16T08:27:58.281Z","dependency_job_id":"94927402-b2b8-4bd9-b615-661a0ea7341e","html_url":"https://github.com/bac0id/wayback-machine-auto-save","commit_stats":null,"previous_names":["bac0id/wayback-machine-auto-save"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bac0id/wayback-machine-auto-save","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bac0id%2Fwayback-machine-auto-save","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bac0id%2Fwayback-machine-auto-save/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ba
c0id%2Fwayback-machine-auto-save/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bac0id%2Fwayback-machine-auto-save/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bac0id","download_url":"https://codeload.github.com/bac0id/wayback-machine-auto-save/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bac0id%2Fwayback-machine-auto-save/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33590884,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-27T02:00:06.184Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","internet-archive","python","save-page-now","wayback-machine"],"created_at":"2025-03-04T21:39:54.222Z","updated_at":"2026-05-28T01:32:00.893Z","avatar_url":"https://github.com/bac0id.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# wayback-machine-auto-save\n\nA worker to save web pages on list to the Internet Archive's [Wayback Machine](https://web.archive.org/) (WM). \n\n## Limitation\n\nWM seems to allow about **240 successful requests per day per client**, whether the user is logged in or not. This counter resets at 00:00 UTC. \n\n## Quick Start\n\n### File of URLs\n\nPrepare a file containing a list of urls to save. 
Both `txt` and `json` are accepted. \n\n*   TXT format\n\n    The script reads one URL per line. Example: \n        \n    ```\n    https://www.gnu.org/fun/\n    https://www.gnu.org/fun/jokes/10-kinds-of-people.html\n    ```\n\n*   JSON format\n\n    The script reads every `url` attribute value from the JSON. Example: \n\n    ```\n    [\n    {\"url\": \"https://www.gnu.org/fun/\"},\n    {\"url\": \"https://www.gnu.org/fun/jokes/10-kinds-of-people.html\"}\n    ]\n    ```\n\n### Save to Wayback Machine\n\nSave the URLs in `urls.txt` to WM. Command: \n\n```\npython main.py urls.txt\n```\n\nExample output:\n\n```\ncookies: None\nproxies: None\nWaybackMachineAPI inited.\nurls: ['https://www.gnu.org/fun/', 'https://www.gnu.org/fun/jokes/10-kinds-of-people.html']\n\nHttp post: https://web.archive.org/save/https://www.gnu.org/fun/\nstatus_code: 200\nWM accept saving https://www.gnu.org/fun/, job_id: spn2-51ef937fdcccbcf485e2d092417ee320a2043b52\nSave page successful: https://www.gnu.org/fun/, job_id: spn2-51ef937fdcccbcf485e2d092417ee320a2043b52\n\nHttp post: https://web.archive.org/save/https://www.gnu.org/fun/jokes/10-kinds-of-people.html\nstatus_code: 200\nWM accept saving https://www.gnu.org/fun/jokes/10-kinds-of-people.html, job_id: spn2-60a192c5877dd50c7eb416a0565cfc345e6003c0\nSave page successful: https://www.gnu.org/fun/jokes/10-kinds-of-people.html, job_id: spn2-60a192c5877dd50c7eb416a0565cfc345e6003c0\n```\n\n### Optional Arguments\n\n#### Cookies\n\nProvide the value of the `logged-in-sig` cookie. This value is about 300 characters long. \n\nArgument: `-c COOKIES`. \n\nExample: `-c \"123456 XXXXXXXXXX\"`\n\n#### Proxy\n\nSet a proxy for both HTTP and HTTPS. 
\n\nArgument: `-p PROXY`\n\nExample: `-p http://127.0.0.1:8888`\n\n#### Example of Save with Cookies And Proxy\n\n```\npython main.py urls.txt -p http://127.0.0.1:8888 -c \"123456 XXXXXXXXXX\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbac0id%2Fwayback-machine-auto-save","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbac0id%2Fwayback-machine-auto-save","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbac0id%2Fwayback-machine-auto-save/lists"}