{"id":21455970,"url":"https://github.com/sntran/gen_spider","last_synced_at":"2026-02-28T04:37:40.900Z","repository":{"id":62429762,"uuid":"147904982","full_name":"sntran/gen_spider","owner":"sntran","description":"An Erlang/Elixir behaviour to define Spiders","archived":false,"fork":false,"pushed_at":"2018-09-18T03:20:00.000Z","size":34,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-30T23:31:16.355Z","etag":null,"topics":["behaviour","crawler","generic","interface","spider"],"latest_commit_sha":null,"homepage":"https://hex.pm/packages/gen_spider","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sntran.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-08T05:40:52.000Z","updated_at":"2025-08-26T21:35:41.000Z","dependencies_parsed_at":"2022-11-01T20:07:05.243Z","dependency_job_id":null,"html_url":"https://github.com/sntran/gen_spider","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/sntran/gen_spider","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sntran%2Fgen_spider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sntran%2Fgen_spider/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sntran%2Fgen_spider/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sntran%2Fgen_spider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sntran","download_url":"https://codeload.github.com/sntran/gen_spide
r/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sntran%2Fgen_spider/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29924775,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-27T19:37:42.220Z","status":"online","status_checked_at":"2026-02-28T02:00:07.010Z","response_time":90,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["behaviour","crawler","generic","interface","spider"],"created_at":"2024-11-23T05:14:08.773Z","updated_at":"2026-02-28T04:37:40.868Z","avatar_url":"https://github.com/sntran.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GenSpider\n\n[![Build Status](https://img.shields.io/travis/sntran/gen_spider/master.svg)](https://travis-ci.org/sntran/gen_spider)\n[![Test Coverage](https://img.shields.io/coveralls/github/sntran/gen_spider.svg)](https://coveralls.io/github/sntran/gen_spider)\n[![Hex Version](https://img.shields.io/hexpm/v/gen_spider.svg)](https://hex.pm/packages/gen_spider)\n[![License](https://img.shields.io/github/license/sntran/gen_spider.svg)](https://choosealicense.com/licenses/apache-2.0/)\n\n\u003c!-- MDOC !--\u003e\n\nGenSpider is a behaviour for defining Spiders.\n\nSpiders are modules which define how a certain site (or a group of sites) will\nbe scraped, including how to perform the crawl (i.e. 
follow links) and how to\nextract structured data from their pages (i.e. scraping items). In other words,\nSpiders are the place where you define the custom behaviour for crawling and\nparsing pages for a particular site (or, in some cases, a group of sites).\n\n## Hello World\n\nThe basic Quotes Spider from Scrapy is implemented with `gen_spider` in both\n[Erlang](examples/quotes_spider.erl) and [Elixir](examples/quotes_spider.ex).\n\n\u003c!-- MDOC !--\u003e\n\n## Generic Spiders\n\nGenSpider also comes with some useful generic spiders that can be found in the\n[examples](examples) directory. Their aim is to provide convenient functionality\nfor a few common scraping cases, like following all links on a site based on\ncertain rules, crawling from Sitemaps, or parsing an XML/CSV feed.\n\n\u003c!-- MDOC !--\u003e\n\n## Installation\n\nIf [available in Hex](https://hex.pm/docs/publish), the package can be installed\nby adding `gen_spider` to your list of dependencies in `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:gen_spider, \"~\u003e 0.1.0\"}\n  ]\nend\n```\n\nDocumentation can be generated with [ExDoc](https://github.com/elixir-lang/ex_doc)\nand published on [HexDocs](https://hexdocs.pm). Once published, the docs can\nbe found at [https://hexdocs.pm/gen_spider](https://hexdocs.pm/gen_spider).\n\n## Contributing\n\nWe welcome everyone to contribute to GenSpider and help us tackle existing issues!\n\nUse the [issue tracker][issues] for bug reports or feature requests. 
Open a [pull request][pulls] when you are ready to contribute.\n\nWhen submitting a pull request, you should not update `CHANGELOG.md`.\n\n## License\n\nGenSpider source code is released under the Apache 2 License.\nCheck the LICENSE file for more information.\n\n[issues]: https://github.com/sntran/gen_spider/issues\n[pulls]: https://github.com/sntran/gen_spider/pulls\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsntran%2Fgen_spider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsntran%2Fgen_spider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsntran%2Fgen_spider/lists"}