https://github.com/dmclain/scrapy-heroku



Scrapy-Heroku
=============

A package to assist with running Scrapy on Heroku. It provides a custom application
configuration at ``scrapy_heroku.app.application`` that launches the scrapyd web service
on the port given by the ``PORT`` environment variable and uses a multi-process work
queue backed by the Postgres database specified by the ``DATABASE_URL`` environment
variable.
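
To make the role of the two environment variables concrete, here is a minimal sketch of
how a process could read them. It is not the code in ``scrapy_heroku.app``: the helper
name ``read_heroku_env`` and the fallback port are assumptions made for illustration.

```python
import os
from urllib.parse import urlparse


def read_heroku_env(environ=os.environ):
    """Return the web port and Postgres connection details from the environment.

    Hypothetical helper for illustration; not part of scrapy-heroku.
    """
    # Heroku tells the web process which port to bind through PORT.
    # The fallback of 6800 (scrapyd's usual default) is an assumption.
    port = int(environ.get("PORT", "6800"))

    # DATABASE_URL has the form postgres://user:password@host:port/dbname
    url = urlparse(environ["DATABASE_URL"])
    database = {
        "user": url.username,
        "password": url.password,
        "host": url.hostname,
        "port": url.port,
        "dbname": url.path.lstrip("/"),
    }
    return port, database
```

Calling ``read_heroku_env()`` with no arguments reads the real process environment;
passing a plain dict makes it easy to try out locally.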

Configuration
-------------

Create a git repo that has a Scrapy project at the root (``scrapy.cfg`` should be at the
top level). Edit your ``scrapy.cfg`` to include the following:

```ini
[scrapyd]
application = scrapy_heroku.app.application

[deploy]
url = http://.herokuapp.com:80/
project =
username =
password =
```
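
Before deploying, it can help to confirm that the file contains the settings described
above. The following is a rough sketch using the standard library's ``configparser``;
the helper name ``check_scrapy_cfg`` is hypothetical and the checks are only the ones
this README mentions, not anything enforced by scrapyd itself.

```python
import configparser


def check_scrapy_cfg(text):
    """Return a list of problems found in the contents of a scrapy.cfg file.

    Hypothetical helper for illustration; not part of scrapy or scrapy-heroku.
    """
    # Disable interpolation so a literal "%" in a value is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)

    problems = []
    expected = "scrapy_heroku.app.application"
    if parser.get("scrapyd", "application", fallback=None) != expected:
        problems.append("[scrapyd] application should be " + expected)
    # The deploy target needs at least a URL and a project name.
    for key in ("url", "project"):
        if not parser.get("deploy", key, fallback="").strip():
            problems.append("[deploy] " + key + " is empty")
    return problems
```

An empty list means both sections look as expected; for example,
``check_scrapy_cfg(open("scrapy.cfg").read())`` can be run from the project root.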

Add a ``requirements.txt`` file that includes ``scrapy``, ``scrapy-heroku``, and ``scrapyd``.
It is strongly recommended that you pin the version of ``scrapy-heroku`` as well as the
version of Scrapy that your project is developed against (``pip freeze > requirements.txt``).

For example:

```text
# requirements.txt
Scrapy==0.24.4
scrapyd==1.0.1
scrapy-heroku==0.7.1
```
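
As a quick way to spot entries that were left unpinned, a check along these lines could
be run against the file. This is only a sketch: the helper name ``unpinned_requirements``
is hypothetical, and it handles plain ``name==version`` lines rather than the full
requirements file syntax.

```python
def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version with ==.

    Hypothetical helper for illustration; it only understands simple
    name==version lines, not the full requirements file syntax.
    """
    unpinned = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line and "==" not in line:
            unpinned.append(line)
    return unpinned
```

For the example file above the function returns an empty list, since every entry
carries an exact version.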

Finally, create a ``Procfile`` that consists of:

```
web: scrapyd
```

Make sure you have a Postgres database provisioned and that the ``DATABASE_URL``
environment variable is set to point at it.

* Project page: