https://github.com/documentcloud/cloud-crowd

Parallel Processing for the Rest of Us

        

~ CloudCrowd ~

* Parallel processing for the rest of us
* Write your scripts in Ruby
* Works with Amazon EC2 and S3
* split -> process -> merge
* As easy as `gem install cloud-crowd`
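
The split -> process -> merge flow can be sketched in plain Ruby. This is
only an illustration of the pattern, not the CloudCrowd API: the module
`WordCountSketch` and its methods are hypothetical names, nothing here loads
the cloud-crowd gem, and local threads stand in for worker nodes.

```ruby
# Plain-Ruby illustration of split -> process -> merge.
# All names are hypothetical; the cloud-crowd gem is not used.
module WordCountSketch
  module_function

  # split: break one large input into independent work units.
  def split(text, lines_per_chunk)
    text.lines.each_slice(lines_per_chunk).map(&:join)
  end

  # process: handle a single work unit (here, count its words).
  def process(chunk)
    chunk.split.size
  end

  # merge: combine the per-unit results into one final result.
  def merge(counts)
    counts.sum
  end

  def word_count(text, lines_per_chunk: 2)
    chunks = split(text, lines_per_chunk)
    # The chunks are independent, so each can be processed on its own;
    # one thread per chunk stands in for a remote worker.
    results = chunks.map { |chunk| Thread.new { process(chunk) } }.map(&:value)
    merge(results)
  end
end

sample = "one two three\nfour five\nsix\nseven eight nine ten\n"
puts WordCountSketch.word_count(sample)
```

Because each chunk is processed without reference to the others, the middle
step is the part that can be spread across many machines.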

Well-suited for:

* Generating or resizing images.
* Encoding video.
* Running text extraction or OCR on PDFs.
* Migrating a large file set or database.
* Web scraping.


~ Documentation ~

Wiki: https://github.com/documentcloud/cloud-crowd/wiki
Rdoc: http://www.rubydoc.info/github/documentcloud/cloud-crowd


~ Getting started ~

# Install the gem.

>> sudo gem install cloud-crowd

# Install the CloudCrowd configuration files to a location of your choosing.

>> crowd install ~/config/cloud-crowd

# You can now run any `crowd` command from inside this configuration
# directory. To see the available commands:

>> crowd --help

# Edit the configuration files to your satisfaction, add AWS credentials,
# and then load the CloudCrowd schema into your configured database.
# (`mate` is the TextMate command-line launcher; any editor will do.)

>> cd ~/config/cloud-crowd
>> mate config.yml
>> mate database.yml
>> [create the database you just configured...]
>> crowd load_schema

# Write your actions, and install them into the 'actions' subdirectory.
# CloudCrowd comes with a few default actions as examples.

# To launch the central server (make sure that you include its location
# in config.yml):

>> crowd server

# The configuration folder also includes 'config.ru', which can be used by
# any Rack-compliant webserver to run your central server.
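
# Before launching anything, it can help to confirm that the configuration
# directory holds the entries mentioned so far. The sketch below is a
# hypothetical helper, not part of the `crowd` tool: it only checks that
# config.yml, database.yml, config.ru and the 'actions' subdirectory exist,
# and assumes it is run from inside the configuration directory (or given
# the directory as its first argument).

```ruby
# Entries this README refers to inside the configuration directory.
EXPECTED_ENTRIES = ["config.yml", "database.yml", "config.ru", "actions"].freeze

# Returns the expected entries that are absent from dir.
# Hypothetical helper; it checks existence only, not file contents.
def missing_config_entries(dir)
  EXPECTED_ENTRIES.reject { |name| File.exist?(File.join(dir, name)) }
end

dir = ARGV.fetch(0, Dir.pwd)
missing = missing_config_entries(dir)
if missing.empty?
  puts "#{dir} has all expected entries"
else
  puts "#{dir} is missing: #{missing.join(', ')}"
end
```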

# Then, to launch a node of workers:

>> crowd node

# To spin up remote nodes, install the 'cloud-crowd' gem on each remote
# machine and copy over your configuration directory. Run `crowd node`
# there, and the remote machines will register with the central server,
# becoming available for processing.

# At this point you can visit your Operations Center at
# http://localhost:9173 to view all of your nodes, ready for action.
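
# If the page does not load, a quick way to tell whether anything is
# listening on that port is a plain TCP connection attempt. A minimal
# sketch using Ruby's standard library; `port_open?` is a hypothetical
# helper, and a successful connection only shows that some process accepts
# connections on the port, not that it is the CloudCrowd server.

```ruby
require "socket"

# Returns true if something accepts TCP connections on host:port.
# Hypothetical helper; any connection failure or timeout counts as false.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

if port_open?("localhost", 9173)
  puts "Something is listening on localhost:9173"
else
  puts "Nothing reachable on localhost:9173; is `crowd server` running?"
end
```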