{"id":22973042,"url":"https://github.com/stevieflyer/quokka","last_synced_at":"2025-07-12T12:09:16.835Z","repository":{"id":192749563,"uuid":"687291357","full_name":"stevieflyer/quokka","owner":"stevieflyer","description":"An easy-to-use web crawler framework, supporting parallel crawling without a line of code and headless running.","archived":false,"fork":false,"pushed_at":"2023-09-09T08:07:31.000Z","size":29,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-02T07:14:20.875Z","etag":null,"topics":["crawler","parallel","web-automation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stevieflyer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-09-05T03:57:41.000Z","updated_at":"2024-12-19T02:12:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"51b8e268-1c6e-47ab-981f-085552df5920","html_url":"https://github.com/stevieflyer/quokka","commit_stats":{"total_commits":4,"total_committers":1,"mean_commits":4.0,"dds":0.0,"last_synced_commit":"70d0c5582cdcd39756fa34440f91f880b7199afc"},"previous_names":["stevieflyer/quokka"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/stevieflyer/quokka","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stevieflyer%2Fquokka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stevieflyer%2Fquokka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stevieflyer%2Fquokka/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/stevieflyer%2Fquokka/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stevieflyer","download_url":"https://codeload.github.com/stevieflyer/quokka/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stevieflyer%2Fquokka/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264987457,"owners_count":23693833,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","parallel","web-automation"],"created_at":"2024-12-14T23:38:31.792Z","updated_at":"2025-07-12T12:09:16.814Z","avatar_url":"https://github.com/stevieflyer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Quokka - Browser Automation Library with Playwright\n\nQuokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and manipulation tasks. It provides a convenient facade for various browser interactions, making it easier to navigate web pages, extract data, and interact with page elements.\n\n## Key Features\n\n- **Asynchronous and Parallel Execution:** Quokka operates entirely in an asynchronous manner. Leveraging the power of Playwright, it utilizes multiple processes, each containing a single coroutine, for efficient parallel execution. 
This architecture excels in handling both I/O-bound and CPU-bound workloads when ample resources are available.\n- **Multi-threaded Crawling with Ease:** Quokka's `BaseCrawler` class enables users to effortlessly transition from single-threaded to multi-threaded crawling. By taking advantage of the provided crawler template, you can seamlessly convert a single-threaded crawler into a multi-threaded one.\n- **Easy Browser Management:** Quokka's `Agent` class provides a streamlined interface for managing browser instances, including starting, stopping, and page navigation.\n- **Data Extraction:** With the `data_extractor` module, Quokka allows you to easily extract data from web pages using customizable selectors and extraction patterns.\n- **Page Interaction:** The `page_interactor` module enables you to interact with web page elements, such as clicking, typing, and scrolling, making automation tasks a breeze.\n- **Custom Hooks:** Quokka supports customizable hooks, allowing you to extend and customize the behavior of the `Agent` class to fit your specific needs.\n- **Extensible:** Quokka exposes Playwright's `playwright` and `page` instances, enabling users to extend the library's functionality as required.\n\n## Installation\n\n```bash\npip install quokka-web\n```\n\n## Getting Started\n\nQuokka's intuitive API makes browser automation a straightforward process. 
Here's a simple example:\n\n```python\nfrom quokka_web import Agent\n\n\nasync def main():\n    agent = await Agent.instantiate(headless=True)\n    await agent.start()\n\n    # Your automation code here\n\n    await agent.stop()\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    asyncio.run(main())\n```\n\n## Documentation\n\nFor detailed usage instructions, examples, and customization options, please refer to the [Documentation](link_to_documentation).\n\n## Examples\n\nBase Crawler Example:\n\n```python\nfrom quokka_web import BaseCrawler, Debugger\n\n\nclass MyCrawler(BaseCrawler):\n    async def _crawl(self, *args, **kwargs):\n        # Core crawling logic using browser_agent\n        pass\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    async def main():\n        crawler = await MyCrawler.instantiate(debug_tool=Debugger(verbose=True))\n        await crawler.start()\n        await crawler.crawl()\n        await crawler.stop()\n\n    asyncio.run(main())\n```\n\n## Contributing\n\nContributions to Quokka are welcome! Please read our [Contribution Guidelines](link_to_contribution_guidelines) for more information on how to contribute to the project.\n\n## License\n\nThis project is licensed under the [MIT License](link_to_license).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstevieflyer%2Fquokka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstevieflyer%2Fquokka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstevieflyer%2Fquokka/lists"}