{"id":37027761,"url":"https://github.com/crawler-commons/url-frontier","last_synced_at":"2026-01-14T03:18:22.138Z","repository":{"id":38748319,"uuid":"208624821","full_name":"crawler-commons/url-frontier","owner":"crawler-commons","description":"API definition, resources and reference implementation of URL Frontiers","archived":false,"fork":false,"pushed_at":"2025-11-12T09:20:58.000Z","size":1314,"stargazers_count":52,"open_issues_count":3,"forks_count":12,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-11-12T11:23:56.376Z","etag":null,"topics":["grpc","url-frontier","urlfrontier","web-crawlers","webcrawling"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/crawler-commons.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2019-09-15T16:44:36.000Z","updated_at":"2025-11-12T09:21:03.000Z","dependencies_parsed_at":"2024-11-01T15:26:34.708Z","dependency_job_id":"aca0743d-fd58-42eb-8aa7-179c3931cb11","html_url":"https://github.com/crawler-commons/url-frontier","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/crawler-commons/url-frontier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crawler-commons%2Furl-frontier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crawler-commons%2Furl-frontier/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crawler-commons%2Furl-frontier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crawler-commons%2Furl-frontier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/crawler-commons","download_url":"https://codeload.github.com/crawler-commons/url-frontier/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crawler-commons%2Furl-frontier/sbom","scorecard":{"id":307607,"data":{"date":"2025-08-11","repo":{"name":"github.com/crawler-commons/url-frontier","commit":"f343a0e667a1bd88e32032e45b9fd31a3eaeda30"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4.3,"checks":[{"name":"Code-Review","score":4,"reason":"Found 9/20 approved changesets -- score normalized to 4","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Maintained","score":4,"reason":"4 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 4","details":null,"documentation":{"short":"Determines 
if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/dockerhub.yml:1","Warn: no topLevel permission defined: .github/workflows/maven.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: 
LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/dockerhub.yml:9"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact 2.3 not signed: https://api.github.com/repos/crawler-commons/url-frontier/releases/80781752","Warn: release artifact 2.1 not signed: https://api.github.com/repos/crawler-commons/url-frontier/releases/67281200","Warn: release artifact 2.3 does not have provenance: https://api.github.com/repos/crawler-commons/url-frontier/releases/80781752","Warn: release artifact 2.1 does not have provenance: https://api.github.com/repos/crawler-commons/url-frontier/releases/67281200"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Pinned-Dependencies","score":3,"reason":"dependency not pinned by hash 
detected -- score normalized to 3","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/dockerhub.yml:14: update your workflow using https://app.stepsecurity.io/secureworkflow/crawler-commons/url-frontier/dockerhub.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/dockerhub.yml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/crawler-commons/url-frontier/dockerhub.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/dockerhub.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/crawler-commons/url-frontier/dockerhub.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven.yml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/crawler-commons/url-frontier/maven.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven.yml:20: update your workflow using https://app.stepsecurity.io/secureworkflow/crawler-commons/url-frontier/maven.yml/master?enable=pin","Warn: containerImage not pinned by hash: Dockerfile:1","Warn: containerImage not pinned by hash: Dockerfile:21: pin your Docker image by updating openjdk:11-jdk-slim to openjdk:11-jdk-slim@sha256:868a4f2151d38ba6a09870cec584346a5edc8e9b71fde275eb2e0625273e2fd8","Info:   0 out of   3 GitHub-owned GitHubAction dependencies pinned","Info:   3 out of   5 third-party GitHubAction dependencies pinned","Info:   0 out of   2 containerImage dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 25 are checked with a SAST 
tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-17T22:28:04.321Z","repository_id":38748319,"created_at":"2025-08-17T22:28:04.321Z","updated_at":"2025-08-17T22:28:04.321Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28408824,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T01:52:23.358Z","status":"online","status_checked_at":"2026-01-14T02:00:06.678Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grpc","url-frontier","urlfrontier","web-crawlers","webcrawling"],"created_at":"2026-01-14T03:18:21.566Z","updated_at":"2026-01-14T03:18:22.133Z","avatar_url":"https://github.com/crawler-commons.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"logo.svg\" alt=\"URL Frontier\" width=\"350\"/\u003e\n\n[![license](https://img.shields.io/github/license/crawler-commons/url-frontier)](http://www.apache.org/licenses/LICENSE-2.0)\n![Build 
Status](https://github.com/crawler-commons/url-frontier/actions/workflows/maven.yml/badge.svg)\n[![Docker Image Version (latest semver)](https://img.shields.io/docker/v/crawlercommons/url-frontier)](https://hub.docker.com/r/crawlercommons/url-frontier)\n\nDiscovering content on the web is possible thanks to web crawlers. Luckily, there are many excellent open-source solutions for this; however, most of them have their own way of storing and accessing information about URLs.\n\nThe aim of the *URL Frontier* project is to develop a crawler/language-neutral API for the operations that web crawlers perform when communicating with a web frontier, e.g. get the next URLs to crawl, update the information about URLs already processed, change the crawl rate for a particular hostname, get the list of active hosts, get statistics, etc. Such an API can be used by a variety of web crawlers, regardless of whether they are implemented in Java like [StormCrawler](http://stormcrawler.net) and [Heritrix](https://github.com/internetarchive/heritrix3) or in Python like [Scrapy](https://scrapy.org/).\n\nThe outcomes of the project are to:\n- design an **[API](API/README.md)** with [gRPC](http://grpc.io), provide Java stubs for the API and instructions on how to achieve the same for other languages\n- deliver a robust reference implementation of the URL Frontier **[service](service/README.md)**\n- implement a command-line **[client](client/README.md)** for basic interactions with a service\n- provide a **[test suite](tests/README.md)** to check that any implementation of the API behaves as expected\n\nOne of the objectives of URL Frontier is to involve as many actors in the web crawling community as possible and get real users to give continuous feedback on our proposals. \n\nPlease use the [project mailing list](https://groups.google.com/g/crawler-commons) or the [Discussions](https://github.com/crawler-commons/url-frontier/discussions) section for questions, comments or suggestions. 
\n\nThere are many ways to [get involved](https://github.com/crawler-commons/url-frontier/wiki/Ways-to-help) if you want to.\n\nThis project is funded through the [NGI0 Discovery Fund](https://nlnet.nl/discovery), a fund established by NLnet with financial support from the European Commission's [Next Generation Internet programme](https://ngi.eu/), under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825322. \n\n![NLNet](https://nlnet.nl/image/logo_nlnet.svg)\n\u003cbr\u003e\n\u003cimg src=\"https://nlnet.nl/image/logos/NGI0Discovery_tag.svg\" alt=\"NGI0\" height=\"80\"/\u003e\n\n# License information\n\nThis project is available as open source under the terms of Apache 2.0. For accurate information, please check individual files.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrawler-commons%2Furl-frontier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrawler-commons%2Furl-frontier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrawler-commons%2Furl-frontier/lists"}