{"id":49017981,"url":"https://github.com/internetarchive/Zeno","last_synced_at":"2026-05-05T06:01:31.000Z","repository":{"id":36955340,"uuid":"289024987","full_name":"internetarchive/Zeno","owner":"internetarchive","description":"State-of-the-art web crawler 🔱","archived":false,"fork":false,"pushed_at":"2026-04-09T02:56:22.000Z","size":3428,"stargazers_count":394,"open_issues_count":50,"forks_count":54,"subscribers_count":8,"default_branch":"main","last_synced_at":"2026-04-09T04:26:50.501Z","etag":null,"topics":["archiving","web-crawler","zeno"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/internetarchive.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-08-20T14:17:09.000Z","updated_at":"2026-04-09T02:55:50.000Z","dependencies_parsed_at":"2026-01-08T08:05:09.980Z","dependency_job_id":null,"html_url":"https://github.com/internetarchive/Zeno","commit_stats":null,"previous_names":["internetarchive/zeno","corentinb/zeno"],"tags_count":53,"template":false,"template_full_name":null,"purl":"pkg:github/internetarchive/Zeno","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/internetarchive%2FZeno","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/internetarchive%2FZeno/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/internetarchive%2FZeno/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/internetarchive%2FZeno/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/internetarchive","download_url":"https://codeload.github.com/internetarchive/Zeno/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/internetarchive%2FZeno/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32637556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"online","status_checked_at":"2026-05-05T02:00:06.033Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archiving","web-crawler","zeno"],"created_at":"2026-04-19T03:00:28.352Z","updated_at":"2026-05-05T06:01:30.995Z","avatar_url":"https://github.com/internetarchive.png","language":"Go","funding_links":[],"categories":["Go"],"sub_categories":[],"readme":"# Zeno\nState-of-the-art web crawler 🔱\n\n## Introduction\n\nZeno is a web crawler designed to operate wide crawls or to simply archive one web page.\nZeno's key concepts are: portability, performance, simplicity.\nWith an emphasis on performance.\n\nIt heavily relies on the [gowarc](https://github.com/internetarchive/gowarc) module for traffic recording into [WARC](https://iipc.github.io/warc-specifications/) files.\n\nThe name Zeno comes from Zenodotus (Ζηνόδοτος), a Greek grammarian, literary critic, Homeric scholar,\nand the first librarian of the 
Library of Alexandria.\n\n## Requirements for Building\n- **Go 1.25+** - As specified in go.mod\n- If CGO_ENABLED=1 (enabled by default):\n   \u003e **GCC 12+** - Required for building C++ dependencies with C++20 constexpr support for the WHATWG URL parser ([github.com/ada-url/goada](https://github.com/ada-url/goada)).\n- If CGO_ENABLED=0:\n   \u003e No additional requirements, as the CGO-free WebAssembly wrapper of goada ([goada-wasm](https://github.com/yzqzss/goada-wasm/)) will be used. (1x slower than the CGO version on amd64 and arm64, and **10x or more** slower on other CPU architectures! See https://wazero.io/docs/#compiler for details.)\n\nNote: GCC 11 and earlier versions do not support the C++20 constexpr features required by the ada-url/goada dependency. On Ubuntu 22.04 LTS and earlier, you may need to install a newer GCC version or disable CGO.\n\n## Installation\n\n```bash\ngo install github.com/internetarchive/Zeno@latest\n```\n\nAlternatively, use our pre-built [release binaries](https://github.com/internetarchive/Zeno/releases); note that we currently focus mainly on linux/amd64 support.\n\n## Quick Start\n\nTo archive a single web page:\n```bash\nZeno get url https://www.france.fr\n```\n\nZeno is highly configurable. To see all available configuration options, run `Zeno -h` or `Zeno get -h`.\n\n## Contributing\n\nContributions are welcome! Feel free to submit a pull request or open an issue.\n\nZeno is developed and maintained by the [Internet Archive](https://archive.org) and awesome contributors. The project has evolved into what it is today thanks to the invaluable contributions from the community. 
While we can't list everyone, special thanks to:\n\n- [Corentin Barreau](https://github.com/CorentinB), former Wayback Machine Software Engineer at the [Internet Archive](https://archive.org), for his initial work on the project.\n- [Jake LaFountain](https://github.com/NGTmeaty), Wayback Machine Software Engineer at the [Internet Archive](https://archive.org).\n- [Thomas Foubert](https://github.com/equals215), former Wayback Machine Platform Engineer at the [Internet Archive](https://archive.org).\n- [yzqzss](https://github.com/yzqzss), Lead Developer of the [Save The Web Project](https://github.com/saveweb).\n- [Will Howes](https://github.com/willmhowes), Wayback Machine Software Engineer at the [Internet Archive](https://archive.org).\n- [Vangelis Banos](https://github.com/vbanos), Wayback Machine Software Engineer at the [Internet Archive](https://archive.org).\n\n## License\n\nThis project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finternetarchive%2FZeno","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finternetarchive%2FZeno","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finternetarchive%2FZeno/lists"}