https://github.com/multiplechoice/scrapy-sqs-exporter
Scrapy extension for outputting scraped items to an Amazon SQS instance
python3 scrapy sqs
- Host: GitHub
- URL: https://github.com/multiplechoice/scrapy-sqs-exporter
- Owner: multiplechoice
- License: mit
- Created: 2017-06-07T12:02:28.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-09-01T17:18:12.000Z (4 months ago)
- Last Synced: 2024-11-08T05:44:05.408Z (2 months ago)
- Topics: python3, scrapy, sqs
- Language: Python
- Homepage:
- Size: 77.1 KB
- Stars: 6
- Watchers: 1
- Forks: 4
- Open Issues: 5
Metadata Files:
- Readme: README.rst
- License: LICENSE
Awesome Lists containing this project
- awesome-scrapy - scrapy-sqs-exporter
README
|Actions Status| |Coveralls Status| |Updates|
scrapy-sqs-exporter
===================

This is an extension to Scrapy_ that exports scraped items to an Amazon SQS queue.

Setup
=====

After installing the package, add the two classes defined in the library to the relevant
sections of the settings file::

    FEED_EXPORTERS = {
        'sqs': 'sqsfeedexport.SQSExporter'
    }

    FEED_STORAGES = {
        'sqs': 'sqsfeedexport.SQSFeedStorage'
    }

The ``FEED_STORAGES`` section uses a URI prefixed with ``sqs`` to differentiate it from other URI-based storage
options.

In the environment we also need to define some keys::

    AWS_DEFAULT_REGION=eu-central-1
    AWS_ACCESS_KEY_ID=...
    AWS_SECRET_ACCESS_KEY=...
    FEED_URI=sqs://foo
    FEED_FORMAT=sqs

``AWS_ACCESS_KEY_ID`` and ``AWS_SECRET_ACCESS_KEY`` are the AWS credentials to use, and ``AWS_DEFAULT_REGION``
is the default region for the SQS queue. ``FEED_URI`` names the SQS queue in the
``AWS_DEFAULT_REGION`` region. For example::

    AWS_DEFAULT_REGION=us-east-1
    FEED_URI=sqs://bar
    FEED_FORMAT=sqs

would refer to a queue named ``bar`` in the ``us-east-1`` region.

Finally, the ``FEED_FORMAT`` option makes the Scrapy spiders use the ``SQSExporter`` class.

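As a rough illustration of how a feed URI and the region setting combine, the sketch below splits an ``sqs://`` URI into a queue name and pairs it with the region taken from the environment. The helper ``resolve_queue`` is hypothetical and is not part of ``sqsfeedexport``; it only assumes, as the example above suggests, that the part after ``sqs://`` is the queue name.

```python
import os
from urllib.parse import urlparse


def resolve_queue(feed_uri, environ=os.environ):
    """Return (queue_name, region) for an ``sqs://`` feed URI.

    Hypothetical helper for illustration only; the library may resolve
    the queue differently.
    """
    parsed = urlparse(feed_uri)
    if parsed.scheme != "sqs":
        raise ValueError(f"expected an sqs:// URI, got {feed_uri!r}")
    # netloc keeps the original case, which matters for queue names.
    queue_name = parsed.netloc
    if not queue_name:
        raise ValueError(f"no queue name found in {feed_uri!r}")
    # The region is None when AWS_DEFAULT_REGION is not set.
    region = environ.get("AWS_DEFAULT_REGION")
    return queue_name, region


if __name__ == "__main__":
    env = {"AWS_DEFAULT_REGION": "us-east-1"}
    print(resolve_queue("sqs://bar", env))
```

Passing the environment as a plain mapping keeps the helper easy to check without touching the real process environment.
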
.. _Scrapy: https://github.com/scrapy/scrapy/

.. |Actions Status| image:: https://github.com/multiplechoice/scrapy-sqs-exporter/workflows/pytest/badge.svg
.. |Coveralls Status| image:: https://coveralls.io/repos/github/multiplechoice/scrapy-sqs-exporter/badge.svg?branch=master
    :target: https://coveralls.io/github/multiplechoice/scrapy-sqs-exporter?branch=master
.. |Updates| image:: https://pyup.io/repos/github/multiplechoice/scrapy-sqs-exporter/shield.svg
    :target: https://pyup.io/repos/github/multiplechoice/scrapy-sqs-exporter/
    :alt: Updates