Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ushahidi/suckapy
The Python port of sucka.
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/ushahidi/suckapy
- Owner: ushahidi
- Created: 2014-04-19T04:06:47.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2015-03-16T04:20:56.000Z (almost 10 years ago)
- Last Synced: 2024-11-04T08:35:59.911Z (3 months ago)
- Language: Python
- Size: 5.9 MB
- Stars: 20
- Watchers: 18
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-starred - ushahidi/suckapy - The Python port of sucka. (others)
README
**Note:** *This is the Python port of [sucka](https://github.com/ushahidi/sucka).*
# Sucka
#### Sucking in the world's crisis data. Byte by byte.
Sucka can retrieve information from any source and transform that data into the structure used throughout the CrisisNET system. One crisis API to rule them all.
Each source has a corresponding `sucka` module that understands where the third-party data is, how to get it, and how that data is structured. This third-party source could be a public API (like Twitter), or a more "static" dataset, like a CSV of incident reports created by an NGO.
## Writing your own sucka
### For Experienced Python Devs
**1. Clone this repo and install dependencies from `requirements.txt`.**

    $ git clone https://github.com/ushahidi/suckapy.git
    $ cd suckapy
    $ pip install -r requirements.txt

**2. Create a module in the `suckas` package.**

    $ cd src/suckas && touch my_awesome_sucka.py

In `my_awesome_sucka.py`, define a `suck` function that accepts two arguments:

    def suck(save_item, handle_error):
        # raw_data = retrieve data from an API, a file, or any other source
        # item = transform the data into a dict formatted like an Item
        # save_item(item)
        pass  # placeholder so the skeleton is valid Python

First you get the data (from an API or a CSV file, for example). Then you transform each row or record into an `Item`, the structure used throughout CrisisNET to represent anything with a time or place -- like an attack, a tornado sighting, or the location of an open pharmacy. [Here's how an Item should be structured](https://github.com/ushahidi/suckapy/blob/master/src/cn_store_py/models.py). Once you have an Item, you pass it to the `save_item` function.
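
To make the get, transform, save flow concrete, here is a minimal sketch of a `suck` function that reads a CSV of incident reports. It rests on several assumptions and is not the project's own implementation: the sample data is made up, the item keys (`remoteID`, `publishedAt`, `content`, `geo`, `source`) are illustrative only (check `models.py` for the real `Item` schema), `handle_error` is assumed to accept an exception object, and the extra `csv_text` parameter exists only so the example runs offline. It is written for Python 3.

```python
import csv
import io

# Hypothetical sample data standing in for a third-party CSV of incident reports.
SAMPLE_CSV = """id,date,lat,lon,description
1,2014-04-20,9.05,7.49,Road blocked by flooding
2,2014-04-21,not-a-number,7.50,Clinic reopened
"""


def transform(row):
    """Turn one CSV row into an Item-like dict (key names are assumptions)."""
    return {
        'remoteID': row['id'],
        'publishedAt': row['date'],
        'content': row['description'],
        # float() raises ValueError on malformed coordinates.
        'geo': {'coords': [float(row['lon']), float(row['lat'])]},
        'source': 'my_awesome_sucka',
    }


def suck(save_item, handle_error, csv_text=SAMPLE_CSV):
    # csv_text is not part of the documented signature; it keeps the sketch offline.
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            save_item(transform(row))
        except (KeyError, ValueError) as err:
            # Assumption: handle_error takes the exception that was raised.
            handle_error(err)


# Usage: collect saved items and reported errors in plain lists.
items, errors = [], []
suck(items.append, errors.append)
print(len(items), len(errors))  # prints "1 1": one row saved, one row rejected
```

Reporting bad rows through `handle_error` instead of raising lets one malformed record be skipped without aborting the rest of the run. Whether the real system expects that behavior is an assumption here.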

**3. Add a `definition` property that tells the system how often this `sucka` should be run.**

    from datetime import datetime, timedelta

    definition = {
        'internalID': 'b43be343-fca5-4415-b424-19e21468c33d',
        'sourceType': 'gdelt',
        'language': 'python',
        'frequency': 'repeats',
        'repeatsEvery': 'day',
        'startDate': datetime.strptime('20140420', "%Y%m%d"),
        'endDate': datetime.now() + timedelta(days=365)
    }

Note that `internalID` needs to be unique, so we recommend generating a UUID.

    import uuid
    uuid.uuid4()

**4. Add a test to the `test` directory, like `test_yourmodule.py`, to verify that your `sucka` successfully and reliably creates `Item` documents.**
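
As a rough illustration of such a test, the sketch below calls `suck` with list-backed stand-ins for `save_item` and `handle_error`, then checks what was collected. It assumes the standard library's `unittest`; the project's actual test runner may differ. The inline `suck` function and the `remoteID` key are hypothetical placeholders, and a real test would import `suck` from your own module instead.

```python
import unittest


def suck(save_item, handle_error):
    # Stand-in for the function under test. In a real test, import it from
    # the module you created in the suckas package instead of defining it here.
    save_item({'remoteID': '1', 'content': 'Road blocked by flooding'})


class TestMyAwesomeSucka(unittest.TestCase):
    def test_suck_saves_items(self):
        saved, errors = [], []
        # Lists act as fake save_item / handle_error callbacks.
        suck(saved.append, errors.append)

        self.assertEqual(errors, [])
        self.assertGreater(len(saved), 0)
        for item in saved:
            self.assertIsInstance(item, dict)
            # 'remoteID' is an assumed key; assert on the real Item fields.
            self.assertIn('remoteID', item)


def run_tests():
    """Run the test case programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMyAwesomeSucka)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing plain list methods as callbacks keeps the test independent of any database or network access.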

**5. Create a new branch for your feature and make a pull request.**
Contact us if you have any questions.