{"id":19900009,"url":"https://github.com/scrapy-plugins/scrapy-streaming","last_synced_at":"2025-05-02T22:32:07.017Z","repository":{"id":94066585,"uuid":"59220903","full_name":"scrapy-plugins/scrapy-streaming","owner":"scrapy-plugins","description":null,"archived":false,"fork":false,"pushed_at":"2016-10-12T12:45:50.000Z","size":175,"stargazers_count":18,"open_issues_count":8,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-07T08:02:03.964Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scrapy-plugins.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-05-19T15:50:20.000Z","updated_at":"2024-10-04T05:04:05.000Z","dependencies_parsed_at":"2023-07-26T16:15:56.788Z","dependency_job_id":null,"html_url":"https://github.com/scrapy-plugins/scrapy-streaming","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy-plugins%2Fscrapy-streaming","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy-plugins%2Fscrapy-streaming/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy-plugins%2Fscrapy-streaming/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy-plugins%2Fscrapy-streaming/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scrapy-plugins","download_url":"https://codeloa
d.github.com/scrapy-plugins/scrapy-streaming/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252116453,"owners_count":21697381,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T20:10:49.435Z","updated_at":"2025-05-02T22:32:07.010Z","avatar_url":"https://github.com/scrapy-plugins.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scrapy Streaming (WIP)\n\n[![Build Status](https://travis-ci.org/scrapy-plugins/scrapy-streaming.svg?branch=master)](https://travis-ci.org/scrapy-plugins/scrapy-streaming)\n[![codecov](https://codecov.io/gh/scrapy-plugins/scrapy-streaming/branch/master/graph/badge.svg)](https://codecov.io/gh/scrapy-plugins/scrapy-streaming)\n\nScrapy Streaming provides an interface for writing spiders in any programming language, using JSON objects to make requests, parse web content, extract data, and more.\n\nWe also officially provide helper libraries for developing spiders in Java, JS, and R.\n\n## Quickstart\n\nYou can read a quick tutorial about scrapy-streaming at http://scrapy-streaming.readthedocs.io/en/latest/quickstart.html\n\n## Usage\n\nYou can execute an external spider using the ``streaming`` command, as follows:\n\n    scrapy streaming /path/of/executable\n\nand if you need to pass extra arguments, add them using the ``-a`` parameter:\n\n    scrapy streaming my_executable -a arg1 -a arg2 -a arg3,arg4\n\nIf you want to integrate this spider with a Scrapy project, define it in the ``external.json`` file in the root 
of the project.\nFor example, to add a spider developed in Java and a compiled one, ``external.json`` can be defined as:\n\n    [\n      {\n        \"name\": \"java_spider\",\n        \"command\": \"java\",\n        \"args\": [\"/home/user/MySpider\"]\n      },\n      {\n        \"name\": \"compiled_spider\",\n        \"command\": \"/home/user/my_executable\"\n      }\n    ]\n\nand then you can execute them using the ``crawl`` command. Inside the project directory, run:\n\n    scrapy crawl spider_name\n\nIn this example, ``spider_name`` can be ``java_spider``, ``compiled_spider``, or the name of a Scrapy spider.\n\n## Documentation\n\nDocumentation is available online at http://scrapy-streaming.readthedocs.io/ and in the docs directory.\n(Temporary URL; this documentation is from the development fork.)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapy-plugins%2Fscrapy-streaming","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscrapy-plugins%2Fscrapy-streaming","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapy-plugins%2Fscrapy-streaming/lists"}