Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/brentp/bsub
python wrapper to submit jobs to bsub (and later qsub)
https://github.com/brentp/bsub
Last synced: 18 days ago
JSON representation
python wrapper to submit jobs to bsub (and later qsub)
- Host: GitHub
- URL: https://github.com/brentp/bsub
- Owner: brentp
- Created: 2012-05-24T15:40:48.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2021-11-12T07:21:39.000Z (about 3 years ago)
- Last Synced: 2024-10-13T02:21:14.597Z (about 1 month ago)
- Language: Python
- Size: 41 KB
- Stars: 44
- Watchers: 8
- Forks: 16
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
bsub
====python wrapper to submit jobs to bsub (and later qsub)
Authors
------
@brentp, @brwnjExample
-------```python
>>> from bsub import bsub
>>> sub = bsub("some_job", R="rusage[mem=1]", verbose=True)# submit a job via call'ing the sub object with the command to run.
# the return value is the numeric job id.
>>> print sub("date").job_id.isdigit()
True# 2nd argument can be a shell script, in which case
# the call() is empty.
#>>> bsub("somejob", "run.sh", verbose=True)()# dependencies:
>>> job_id = bsub("sleeper", verbose=True)("sleep 2").job_id
>>> bsub.poll(job_id)
True```
Sugar
-----For file jobs, we can emulate shell syntax:
```Python
job = bsub('my-job') < 'run.sh'
```Same for text commands:
```Python
"echo hello" | bsub('other-job')
```
Chaining
--------It's possible to specify dependencies to LSF using a flag like:
bsub -w 'done("other-name")' < myjob
We make this more pythonic with:
```Python
>>> j = sub('sleep 1').then('sleep 2')
```
which will wait for the first job `sleep 1` to complete
before running the second job `sleep 2`. These can be chained as:```Python
j = sub('myjob')
j2 = j('sleep 1')
j3 = j2.then('echo "hello"')
j4 = j3.then('echo "world"')
j5 = j4.then('my scripts.p')# or:
j('sleep 1').then('echo "hello"').then('echo "world"')
```
Where each job in `.then()` is not run until the preceding job
is `done()` according to LSF.Bioinformatics example of chaining:
This would submit jobs for positive and negative strand coverage in parallel.
Each strand submitting jobs that run serially.```Python
from bsub import bsub
submit = bsub("bam2bg", verbose=verbose)
# convert bam to stranded bg then bw
sample = "subject_1"
chrom_sizes = "chrom_sizes.txt"# submit jobs by strand for parallel processing
for symbol, strand in zip(["+", "-"], ["pos", "neg"]):bigwig = "%s_%s.bw" % (sample, strand)
bedgraph = "%s_%s.bedgraph" % (sample, strand)bam_to_bg = ("bedtools genomecov -strand %s -bg "
"-ibam %s | bedtools sort -i - > %s") % (symbol, bam, bedgraph)
bg_to_bw = "bedGraphToBigWig %s %s %s" % (bedgraph, chrom_sizes, bigwig)
gzip_bg = "gzip -f %s" % bedgraph# process strand-based steps serially
# submit first 2 jobs to default queue; final job to 'gzip' queue
submit(bam_to_bg).then(bg_to_bw, job_name="bg2bw").then(gzip_bg, "gzipbg", q='gzip')```
Command-Line
------------use the command-line to run jobs with auto-specified err and log files:
```Shell
echo "hello" | python -m bsub -J "fake"
bsub -J fake -e fake.%J.err -o fake.%J.out < /tmp/tmp3vFDwn.sh
```
If a log/ directory exists, the logs will be placed there.the shell script is automatically created and cleaned up after use.