{"id":28359948,"url":"https://github.com/maxiee/pagesnap","last_synced_at":"2025-08-06T10:07:58.214Z","repository":{"id":161884318,"uuid":"636513906","full_name":"maxiee/pagesnap","owner":"maxiee","description":"Saving webpage in a single HTML file, based on playwright","archived":false,"fork":false,"pushed_at":"2023-05-06T15:18:27.000Z","size":15,"stargazers_count":24,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-04T16:23:59.187Z","etag":null,"topics":["playwright","playwright-python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxiee.png","metadata":{"files":{"readme":"README.md","changelog":"changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-05T02:46:08.000Z","updated_at":"2025-05-02T14:51:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"fca2e0d5-dc2e-4bcb-9cf5-d9f482da450f","html_url":"https://github.com/maxiee/pagesnap","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/maxiee/pagesnap","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxiee%2Fpagesnap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxiee%2Fpagesnap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxiee%2Fpagesnap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxiee%2Fpagesnap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxiee","download_url":"htt
ps://codeload.github.com/maxiee/pagesnap/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxiee%2Fpagesnap/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259786240,"owners_count":22910905,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["playwright","playwright-python"],"created_at":"2025-05-28T10:10:43.727Z","updated_at":"2025-06-14T08:32:19.384Z","avatar_url":"https://github.com/maxiee.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PageSnap\n\n[中文文档](./README_zh.md)\n\nPageSnap is a tool that allows you to save web pages offline as single-page HTML files, preserving the original appearance of the web page as much as possible. It is developed using Python and Playwright, which means it can also save dynamic JavaScript web pages offline.\n\nThe advantage of using single-page HTML format is that users can conveniently open and browse the files with any W3C-compliant browser.\n\nAs a Python library, PageSnap can be easily added as a dependency to other projects. If your project uses Python and Playwright, simply import the PageSnap library to add offline-saving capabilities to your pages.\n\nNote: Currently, this project is still in the early stages of feature development and cannot be directly used as a library. The related APIs are still under development. You can keep an eye on the progress, or feel free to clone the project to try it out and share your thoughts.\n\n## As a Library\n\nPageSnap provides an asyncio-based API. 
In your Playwright project, you can save a page offline in just two steps. Here is an example:\n\n```python\n# Step 1: Hook the page to intercept requests and save resources\n#         note: you can also hook after goto, but you may miss some resources\nawait hook_page(page)\n\n# Run your own code and page actions\nawait page.goto(url)\n# It's better to wait for the page to be fully loaded\nawait page.wait_for_load_state(\"networkidle\")\n\n# Step 2: Get the page content\nembedded_html = await page_snap(page)\n\n# Then you can save it to a file\nwith open(output_filename, 'w', encoding='utf-8') as f:\n    f.write(embedded_html)\n```\n\n## As a Command Line\n\nInstall with pip:\n\n```\npip install pagesnap\n```\n\nInitialize Playwright:\n\n```\nplaywright install\n```\n\nSave a page:\n\n```\npagesnap https://example.com/ test.html\n```\n\n## Discussion\n\nIf you have any suggestions or improvements, please feel free to submit an issue or pull request. If you like this project, please give it a star.\n\nI am usually active on [Sina Weibo](https://www.weibo.com/u/1240212845) and welcome technical discussions there as well.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxiee%2Fpagesnap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxiee%2Fpagesnap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxiee%2Fpagesnap/lists"}