{"id":19112583,"url":"https://github.com/ankush-chander/github-crawler","last_synced_at":"2025-08-30T17:36:41.882Z","repository":{"id":71631223,"uuid":"375226289","full_name":"Ankush-Chander/github-crawler","owner":"Ankush-Chander","description":"Crawl information from github in friendly manner.","archived":false,"fork":false,"pushed_at":"2023-10-03T10:07:42.000Z","size":18,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-20T01:09:53.509Z","etag":null,"topics":["human-resource-analytics","web-crawling"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ankush-Chander.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-09T04:33:31.000Z","updated_at":"2023-10-03T06:51:16.000Z","dependencies_parsed_at":"2024-11-11T20:02:27.249Z","dependency_job_id":null,"html_url":"https://github.com/Ankush-Chander/github-crawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Ankush-Chander/github-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ankush-Chander%2Fgithub-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ankush-Chander%2Fgithub-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ankush-Chander%2Fgithub-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ankush-Chander%2Fgithub-crawler
/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ankush-Chander","download_url":"https://codeload.github.com/Ankush-Chander/github-crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ankush-Chander%2Fgithub-crawler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262213224,"owners_count":23276067,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["human-resource-analytics","web-crawling"],"created_at":"2024-11-09T04:33:39.060Z","updated_at":"2025-06-27T07:35:36.740Z","avatar_url":"https://github.com/Ankush-Chander.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- PROJECT SHIELDS --\u003e\n\u003c!--\n*** I'm using markdown \"reference style\" links for readability.\n*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).\n*** See the bottom of this document for the declaration of the reference variables\n*** for contributors-url, forks-url, etc. 
This is an optional, concise syntax you may use.\n*** https://www.markdownguide.org/basic-syntax/#reference-style-links\n--\u003e\n[![Contributors][contributors-shield]][contributors-url]\n[![Forks][forks-shield]][forks-url]\n[![Stargazers][stars-shield]][stars-url]\n[![Issues][issues-shield]][issues-url]\n[![MIT License][license-shield]][license-url]\n[![LinkedIn][linkedin-shield]][linkedin-url]\n\n\n\n\u003cbr /\u003e\n\u003cp align=\"center\"\u003e\n\n  \u003ch3 align=\"center\"\u003egithub-crawler\u003c/h3\u003e\n\n  \u003cp align=\"center\"\u003e\n    Friendly GitHub crawler.\n    \u003c/p\u003e\n\u003c/p\u003e\n\n# Setup\n1. Install requirements\n```\npip install -r requirement.txt\n```\n2. Update the source URL to suit your needs in `github/github/spiders/github-user.py`\n```\ndef start_requests(self):\n\t\turls = [\n\t\t\t\"your search url here\"\n\t\t]\n\n```\n## For CSV (default)\nSet the following variables in `settings.py`\n```\nITEM_PIPELINES = {\n   'GithubCsvPipeline': 300,\n}\n```\n\n## For Elasticsearch\nSet the following variables in `settings.py`\n```\nELASTICSEARCH_HOST = ''\nELASTICSEARCH_PORT = 9200\nITEM_PIPELINES = {\n   'GithubElasticsearchPipeline': 300,\n}\n\n```\nNote: This option requires the index to already exist on the Elasticsearch server.\n\n## For Google Sheets\n1. Set the following variables in `settings.py`\n```\nGOOGLE_SHEET =\"\"\nITEM_PIPELINES = {\n   'github.pipeline.GithubExcelPipeline': 300,\n}\n```\n2. 
Store Google API credentials in `utility/gsheets_credentials.json`\n\nNote: This option requires an existing Google Sheet whose sharing permission is set to \"Editable by anyone who has link\"\n\n# Run instructions\n```\ncd github\nscrapy crawl github-user-search\n```\n\n\u003c!-- MARKDOWN LINKS \u0026 IMAGES --\u003e\n\u003c!-- https://www.markdownguide.org/basic-syntax/#reference-style-links --\u003e\n[contributors-shield]: https://img.shields.io/github/contributors/Ankush-Chander/github-crawler.svg?style=for-the-badge\n[contributors-url]: https://github.com/Ankush-Chander/github-crawler/graphs/contributors\n[forks-shield]: https://img.shields.io/github/forks/Ankush-Chander/github-crawler.svg?style=for-the-badge\n[forks-url]: https://github.com/Ankush-Chander/github-crawler/network/members\n[stars-shield]: https://img.shields.io/github/stars/Ankush-Chander/github-crawler.svg?style=for-the-badge\n[stars-url]: https://github.com/Ankush-Chander/github-crawler/stargazers\n[issues-shield]: https://img.shields.io/github/issues/Ankush-Chander/github-crawler.svg?style=for-the-badge\n[issues-url]: https://github.com/Ankush-Chander/github-crawler/issues\n[license-shield]: https://img.shields.io/github/license/Ankush-Chander/github-crawler.svg?style=for-the-badge\n[license-url]: https://github.com/Ankush-Chander/github-crawler/blob/main/LICENSE.txt\n[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge\u0026logo=linkedin\u0026colorB=555\n[linkedin-url]: https://www.linkedin.com/in/ankush-chander-8248a876/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fankush-chander%2Fgithub-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fankush-chander%2Fgithub-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fankush-chander%2Fgithub-crawler/lists"}