{"id":21900287,"url":"https://github.com/cryptoc1/earl","last_synced_at":"2026-05-18T10:36:00.471Z","repository":{"id":43044039,"uuid":"392597931","full_name":"Cryptoc1/earl","owner":"Cryptoc1","description":"Earl is looking for URLs in your area.","archived":false,"fork":false,"pushed_at":"2024-09-03T22:09:16.000Z","size":379,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"develop","last_synced_at":"2025-09-06T22:43:28.449Z","etag":null,"topics":["crawler","middleware","nuget","webscraping"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Cryptoc1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-04T07:47:40.000Z","updated_at":"2022-03-06T20:14:18.000Z","dependencies_parsed_at":"2025-01-27T06:37:55.881Z","dependency_job_id":null,"html_url":"https://github.com/Cryptoc1/earl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Cryptoc1/earl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cryptoc1%2Fearl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cryptoc1%2Fearl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cryptoc1%2Fearl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cryptoc1%2Fearl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Cryptoc1","download_url":"https://codeload.github.com/Cryptoc1/earl/ta
r.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cryptoc1%2Fearl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279007135,"owners_count":26084246,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","middleware","nuget","webscraping"],"created_at":"2024-11-28T15:07:11.535Z","updated_at":"2025-10-11T12:09:49.536Z","avatar_url":"https://github.com/Cryptoc1.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eearl\u003c/h1\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n*Looking for URLs in your area.*\n\n![Language](https://img.shields.io/github/languages/top/cryptoc1/earl)\n[![Checks](https://img.shields.io/github/checks-status/cryptoc1/earl/develop)](https://github.com/Cryptoc1/earl/actions/workflows/default.yml)\n[![Coverage](https://img.shields.io/codecov/c/github/cryptoc1/earl)](https://app.codecov.io/gh/Cryptoc1/earl/)\n[![Version](https://img.shields.io/nuget/vpre/Earl.Crawler)](https://www.nuget.org/packages/Earl.Crawler)\n\n\u003c/div\u003e\n\nEarl is a suite of APIs for developing url crawlers \u0026 web scrapers driven by a middleware pattern similar to, and strongly influenced by, ASP.NET Core.\n\n## Basic 
Usage\n\n```csharp\nvar services = new ServiceCollection()\n    .AddEarlCrawler()\n    .AddEarlJsonPersistence()\n    .BuildServiceProvider();\n\nvar crawler = services.GetService\u003cIEarlCrawler\u003e();\nvar options = CrawlerOptionsBuilder.CreateDefault()\n    .BatchSize( 50 )\n    .MaxRequestCount( 500 )\n    .On\u003cCrawlUrlResultEvent\u003e( \n        ( CrawlUrlResultEvent e, CancellationToken cancellation ) =\u003e\n        {\n            Console.WriteLine( $\"Crawled {e.Result.Url}\" );\n            return default;\n        }\n    )\n    .Timeout( TimeSpan.FromMinutes( 30 ) )\n    .Use(\n        ( CrawlUrlContext context, CrawlUrlDelegate next ) =\u003e\n        {\n            Console.WriteLine( $\"Executing delegate middleware while crawling {context.Url}\" );\n            return next( context );\n        }\n    )\n    .PersistTo( persist =\u003e persist.ToJson( json =\u003e json.Destination(...) ) )\n    .Build();\n\nawait crawler.CrawlAsync( new Uri(...), options );\n```\n\n## Documentation\n\nDocumentation can be found within the READMEs of the sub-directories representing the conceptual components of Earl:\n\n- [Events](https://github.com/Cryptoc1/earl/tree/develop/src/Crawler/Events/README.md)\n- [Middleware](https://github.com/Cryptoc1/earl/tree/develop/src/Crawler/Middleware/README.md)\n- [Persistence](https://github.com/Cryptoc1/earl/tree/develop/src/Crawler/Persistence/README.md)\n\nAll public APIs *should* contain thorough XML (triple slash) comments. \n\n\u003e *Something missing, still have questions? Please open an Issue or submit a PR!*","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcryptoc1%2Fearl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcryptoc1%2Fearl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcryptoc1%2Fearl/lists"}