https://github.com/zaycev/cbg-scrapy

# CBG Scrapy

CBG Scrapy is a simple HTTP server for asynchronously scraping data from the Twitter API using the Twisted library.

## Installation and running

```bash
$ python scrapy.py [-p ] [-l ]
```

## HTTP API

* ### Adding scrapers

Adds (activates) new scrapers.

URI: `/add/`

GET parameters:

```json
data:
[
  {
    "name": "LA Scraper",
    "oauth": {
      "token": "",
      "secret": ""
    },
    "filter": {
      "id": "Some integer, unique for each scraper",
      "location": [-122.75, 36.8, -121.75, 37.8]
    }
  }
]
```
Response:

```js
{
  "error": true | false,
  "message": "Error message"
}
```
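As a rough illustration of the request above, the following sketch builds an `/add/` URL with the scraper list JSON-encoded into the `data` GET parameter. The base URL (`http://localhost:8080`), the helper name `build_add_url`, and the choice to pass `data` as URL-encoded JSON are assumptions, not something the README confirms.

```python
import json
from urllib.parse import urlencode

# Assumption: host and port are placeholders; use the ones your server runs on.
BASE_URL = "http://localhost:8080"


def build_add_url(scrapers, base_url=BASE_URL):
    """Return the /add/ URL with the scraper list JSON-encoded in `data`."""
    query = urlencode({"data": json.dumps(scrapers)})
    return f"{base_url}/add/?{query}"


scrapers = [
    {
        "name": "LA Scraper",
        "oauth": {"token": "", "secret": ""},  # fill in your own credentials
        "filter": {"id": 1, "location": [-122.75, 36.8, -121.75, 37.8]},
    }
]

add_url = build_add_url(scrapers)
print(add_url)
```

The resulting URL can then be requested with any HTTP client, for example `urllib.request.urlopen(add_url)`.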

* ### Listing scrapers

Returns the state of active scrapers.

URI: `/list/`

GET parameters:


```
none
```

Response:

```js
[
  {
    "name": "LA scraper",
    "token": "",
    "status": "connecting" | "connected" | "failed",
    "ts_start": "2012.12.12T12:12:00",
    "received": 10000,
    "total_received": 100000,
    "limits": 5000,
    "total_limits": 60000,
    "rate": 10.4,
    "last_received": "2012.12.12T12:12:00",
    "filter": {
      "track": ["#Python", "#Haskell"],
      "follow": [1, 2, 4],
      "locations": [0, 0, 0, 0]
    },
    "errors": [
      {
        "message": "error message",
        "ts": "2012.12.12T12:12:00"
      }
    ]
  }
]
```
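A client might reduce the `/list/` response to a short status report. The sketch below assumes the body is valid JSON shaped like the example above; the sample data (including the second scraper) and the helper name `summarize` are illustrative only.

```python
import json

# Illustrative body shaped like the documented /list/ response (values are made up).
sample_body = """
[
  {"name": "LA scraper", "status": "connected", "received": 10000, "rate": 10.4,
   "errors": []},
  {"name": "NY scraper", "status": "failed", "received": 0, "rate": 0.0,
   "errors": [{"message": "error message", "ts": "2012.12.12T12:12:00"}]}
]
"""


def summarize(body):
    """Return (name, status, last listed error message or None) per scraper."""
    summary = []
    for scraper in json.loads(body):
        errors = scraper.get("errors", [])
        last_error = errors[-1]["message"] if errors else None
        summary.append((scraper["name"], scraper["status"], last_error))
    return summary


for name, status, last_error in summarize(sample_body):
    suffix = f" (last error: {last_error})" if last_error else ""
    print(f"{name}: {status}{suffix}")
```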
* ### Removing scrapers

Stops and removes active scrapers.

URI: `/remove/`

GET parameters:

```js
data:
[
""
]
```

Response:

```js
{
  "error": true | false,
  "message": "Error message"
}
```
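Both `/add/` and `/remove/` answer with the same `error`/`message` object, so a client could funnel their responses through one check. This is a minimal sketch assuming the body is JSON with a boolean `error` field; `ScraperApiError` and `check_response` are hypothetical names, not part of the project.

```python
import json


class ScraperApiError(Exception):
    """Raised when the server reports "error": true (hypothetical helper)."""


def check_response(body):
    """Parse an /add/ or /remove/ response body; raise if it reports an error."""
    result = json.loads(body)
    if result.get("error"):
        raise ScraperApiError(result.get("message", ""))
    return result


check_response('{"error": false, "message": ""}')  # passes silently

try:
    check_response('{"error": true, "message": "Error message"}')
except ScraperApiError as exc:
    print(f"request failed: {exc}")
```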
* ### Ping

Returns the string `pong`.

URI: `/ping/`

GET parameters:

```
none
```

Response:

```
pong
```
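The ping endpoint lends itself to a simple liveness check. The sketch below assumes the server is reachable over plain HTTP at a base URL you supply; `http_get` and `is_alive` are hypothetical helper names, and the fetch function is injectable so the check can be exercised without a running server.

```python
from urllib.request import urlopen


def http_get(url, timeout=5.0):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def is_alive(base_url, fetch=http_get):
    """Return True if /ping/ answers with `pong`, False on connection errors."""
    try:
        return fetch(f"{base_url}/ping/").strip() == "pong"
    except OSError:  # urllib's URLError and socket timeouts are OSError subclasses
        return False
```

For example, `is_alive("http://localhost:8080")` would report whether a server on that (assumed) host and port is responding.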

* ### Log

Returns the log as a string.

URI: `/log/`

GET parameters:

```
none
```