{"id":24072987,"url":"https://github.com/venkatamutyala/wordpress-plugins-crawler-scrapy","last_synced_at":"2026-01-29T10:04:04.888Z","repository":{"id":37063216,"uuid":"119299123","full_name":"venkatamutyala/wordpress-plugins-crawler-scrapy","owner":"venkatamutyala","description":"Scrapy scripts to crawl all WordPress.org plugins","archived":false,"fork":false,"pushed_at":"2024-06-05T23:53:05.000Z","size":15,"stargazers_count":3,"open_issues_count":5,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-10-08T14:52:21.715Z","etag":null,"topics":["scrapy","scrapy-crawler","scrapy-spider","webscraper","wordpress","wordpress-plugin-crawler"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/venkatamutyala.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-01-28T21:24:26.000Z","updated_at":"2023-04-29T20:02:15.000Z","dependencies_parsed_at":"2025-05-29T20:48:07.129Z","dependency_job_id":null,"html_url":"https://github.com/venkatamutyala/wordpress-plugins-crawler-scrapy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/venkatamutyala/wordpress-plugins-crawler-scrapy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/venkatamutyala%2Fwordpress-plugins-crawler-scrapy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/venkatamutyala%2Fwordpress-plugins-crawler-scrapy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/venkatamutyala%2Fwordpress-plugins-crawler-scrapy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/venkatamutyala%2Fwordpress-plugins-crawler-scrapy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/venkatamutyala","download_url":"https://codeload.github.com/venkatamutyala/wordpress-plugins-crawler-scrapy/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/venkatamutyala%2Fwordpress-plugins-crawler-scrapy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28875446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-29T09:47:23.353Z","status":"ssl_error","status_checked_at":"2026-01-29T09:47:19.357Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["scrapy","scrapy-crawler","scrapy-spider","webscraper","wordpress","wordpress-plugin-crawler"],"created_at":"2025-01-09T17:24:41.268Z","updated_at":"2026-01-29T10:04:04.871Z","avatar_url":"https://github.com/venkatamutyala.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NOTE: THIS PROJECT IS NO LONGER IN ACTIVE DEVELOPMENT. Please ensure that you update all the libraries prior to execution. One or more of the libraries in this project may have security vulnerabilities. 
\n\n\n# WordPress Plugin crawler using Scrapy\n\n\n\n### Development Environment setup\nDeveloped with Python 3.6.3\n```\n    $ virtualenv venv -p python3\n    $ source venv/bin/activate\n    $ pip install -r requirements.txt\n```\n\nNotes:\nThe main.py file was added to make interactive debugging in PyCharm easier.\nThe default output format is newline-delimited JSON.\n\nTo run:\n```\n    $ scrapy crawl WordPressPlugins\n```\nBy default, output is stored in: \"YYYY-MM-DD.ndjson\"\n\n### Export the variables below to save to AWS S3:\n```\n    $ export AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXX\n    $ export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXX\n    $ export AWS_DEFAULT_REGION=XXXXXXXX\n    $ export SCRAPY_WORDPRESS_FEED_URI=\"s3://el-gato-public/scrapy/wordpress-plugins/\"`date +%F`\".ndjson\"\n```\n\n\nOther:\nYou are also welcome to read from my bucket directly at: s3://el-gato-public/scrapy/wordpress-plugins/*\nNote: Please be aware that I have enabled Requester Pays on the bucket.\n\n\nIf you have any questions, feel free to reach out.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvenkatamutyala%2Fwordpress-plugins-crawler-scrapy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvenkatamutyala%2Fwordpress-plugins-crawler-scrapy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvenkatamutyala%2Fwordpress-plugins-crawler-scrapy/lists"}