Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/syagev/weiz-grid

A simple framework for using the SGE cluster at Weizmann's CS department for parallel work
https://github.com/syagev/weiz-grid

Last synced: 14 days ago
JSON representation

A simple framework for using the SGE cluster at Weizmann's CS department for parallel work

Awesome Lists containing this project

README

        

###############################################################################
## WeizGrid
## By Stav Yagev, 2013 (Please contact me if you find bugs !)
##
##
## Simple framework for using the SGE cluster on Weizmann for parallel work.
##
## FEATURES:
## - Helps split your own code into fractions that can be run in parallel on the cluster.
## - Allows you to aggregate results in an efficient manner.
## - Write your code ONCE! Execute the same code on PC for debugging and on the cluster
## for production.
## - Recover from errors on the cluster - splitting means if 1 iteration out of a
## 1000 failed you still have 999 iterations in your hand!
##
## Use case example:
## You have a job that processes 1000 images in some way, running the same algorithm
## for each image, outputing something for each image, and finally aggregating the
## results from all images. Up until now, you ran this in a single job that took 1000
## minutes (because lets assume it takes 1 minute to process each image). Using WeizGrid
## you can take advantage of the fact that the job can be parallelized - instead of 1
## job WeizGrid helps you easily split the job to 80 parallel jobs so that you get all
## 1000 images in 1000/80=12.5 minutes (!!)
##

Auto Installation:
==================
1. Copy all the files to ~/WeizGrid
2. In a UNIX terminal, type:
chmod +x ~/WeizGrid/wgsetup
3. Then type:
~/WeizGrid/wgsetup

Usage:
======
1. Create files in the spirit of 'sample.m' and 'calcPrimes.m' (make sure
the WG m-files are in MATLAB's path)
2. Upload your files to UNIX
3. Under UNIX, from the direcotry of YOUR project, run:
~/WeizGrid/qsubWG [name] [your-m-file] [queue]

Example: ~/WeizGrid/qsubWG ParallelCracker sample all.q

4. Enjoy!

Manual Installation (if auto doesn't work...):
=============================================
1. Copy all files to ~/WeizGrid.
2. Make sure all bash scripts have execute permission.
2. Create the following directory structure (1 folder for each queue you intend to use):
~/.matlab/cluster_jobs/all.q
~/.matlab/cluster_jobs/test.q
...
NOTE: The ~/.matlab/ usually already exists but is hidden, so make sure you
are viewing hidden files and folders .

Tips & Troubleshooting:
=======================
- If your algorithm uses a random number, make sure you are not setting the RngShuffle
option to false when invoking WGexec - because this will yield the same random
series for every "parallel" piece of work.
- Sometimes, during the aggregation part of your script, there will be a bug. Then,
you might think all your work is lost! This is not the case, look into part #2
of the documentation of WGgetResults for more information.
- There isn't verbose error checking, so if something doesn't work, check the output
files on your UNIX home for hints... If still unsuccessful, feel free to contact
me for help.