https://github.com/mg-rast/mym5nr
M5NRv2 -- non-redundant protein and rRNA database integration
- Host: GitHub
- URL: https://github.com/mg-rast/mym5nr
- Owner: MG-RAST
- License: bsd-2-clause
- Created: 2014-05-14T20:20:52.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2021-11-11T21:36:01.000Z (over 4 years ago)
- Last Synced: 2024-03-26T07:02:30.557Z (about 2 years ago)
- Topics: protein-sequences, rna-seq-data
- Language: Perl
- Size: 10.8 MB
- Stars: 3
- Watchers: 9
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
myM5NR
======
local version of M5NR
## Installation with Docker ##
To build the images, first clone the repository:
```bash
git clone https://github.com/MG-RAST/myM5NR.git
```
There are separate Dockerfiles for the available actions: download, parse, build, and upload.
Build them from the root of the cloned repository with the following commands:
```bash
docker build -t mgrast/m5nr-download -f download/Dockerfile-download .
docker build -t mgrast/m5nr-parse -f parse/Dockerfile-parse .
docker build -t mgrast/m5nr-build -f build/Dockerfile-build .
docker build -t mgrast/m5nr-upload -f upload/Dockerfile-upload .
```
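The four build commands differ only in the action name, so they can be collapsed into a loop. The following is a minimal sketch, not part of the repository: `build_all` is a hypothetical helper name, and it assumes it is run from the root of the cloned repository.

```bash
#!/usr/bin/env bash
# Sketch: build all four myM5NR images in one pass.
# Run from the repository root, because each Dockerfile path and the
# build context (.) are relative to it.

# build_all is a hypothetical helper name, not provided by myM5NR.
build_all() {
    local action
    for action in download parse build upload; do
        # Same tag and Dockerfile naming as the individual commands above.
        docker build -t "mgrast/m5nr-${action}" \
            -f "${action}/Dockerfile-${action}" . || return 1
    done
}
```

Calling `build_all` stops at the first image that fails to build, which avoids tagging a partial set without noticing.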
Examples for manual invocation:
```bash
docker run -t -d --name m5nr-download -v /var/tmp/m5nr:/m5nr_data mgrast/m5nr-download bash
docker run -t -d --name m5nr-parse -v /var/tmp/m5nr:/m5nr_data mgrast/m5nr-parse bash
docker run -t -d --name m5nr-build -v /var/tmp/m5nr:/m5nr_data mgrast/m5nr-build bash
docker run -t -d --name m5nr-upload -v /var/tmp/m5nr:/m5nr_data mgrast/m5nr-upload bash
```
From here on, the steps are executed inside the container.
Create the working directories:
```bash
mkdir -p /m5nr_data/Sources
mkdir -p /m5nr_data/Parsed
mkdir -p /m5nr_data/Build
```
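Before starting a download or parse run, it may be worth confirming that the three working directories exist and are writable. This is a minimal sketch that assumes the `/m5nr_data` layout above. `check_layout` is a hypothetical helper, not something the repository provides.

```bash
#!/usr/bin/env bash
# check_layout is a hypothetical helper, not part of myM5NR.
# Usage: check_layout [BASE_DIR]   (BASE_DIR defaults to /m5nr_data)
check_layout() {
    local base=${1:-/m5nr_data}
    local dir missing=0
    for dir in Sources Parsed Build; do
        # Report every directory that is absent or not writable.
        if [ ! -d "$base/$dir" ] || [ ! -w "$base/$dir" ]; then
            echo "missing or not writable: $base/$dir" >&2
            missing=1
        fi
    done
    # Returns 0 when all three directories are usable, 1 otherwise.
    return "$missing"
}
```

For example, `check_layout && echo "layout ok"` prints nothing but the problem directories when the bind mount is missing or read-only.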
To start the download (use --force to delete old _part directories):
```bash
cd /m5nr_data
/myM5NR/bin/m5nr_compiler.py download --debug 2>&1 | tee /m5nr_data/Sources/logfile.txt
```
To start the parsing (work in progress):
```bash
cd /m5nr_data
/myM5NR/bin/m5nr_compiler.py parse --debug 2>&1 | tee /m5nr_data/Parsed/logfile.txt
```
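Both commands above pipe their output through `tee`, so the shell reports the exit status of `tee` rather than that of `m5nr_compiler.py`, and a failed run can look successful to a calling script. The sketch below shows one way to keep the log file while preserving the command's own exit status in bash. `run_logged` is a hypothetical helper, not part of myM5NR, and it is only useful under the assumption that `m5nr_compiler.py` exits non-zero on failure.

```bash
#!/usr/bin/env bash
# run_logged is a hypothetical helper, not part of myM5NR.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local logfile=$1
    shift
    # Merge stderr into stdout and copy everything to the log file.
    "$@" 2>&1 | tee "$logfile"
    # PIPESTATUS[0] is the exit status of the command itself, not of tee.
    return "${PIPESTATUS[0]}"
}
```

With this helper, the download step could be written as `run_logged /m5nr_data/Sources/logfile.txt /myM5NR/bin/m5nr_compiler.py download --debug`, and the parse step likewise with its own log path.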
To view the status:
```bash
cd /m5nr_data
/myM5NR/bin/m5nr_compiler.py status --debug
```
To run a full build round with the automated wrapper script (from the host, not from inside a container):
```bash
docker exec m5nr-download m5nr_master.sh -a download
docker exec m5nr-parse m5nr_master.sh -a parse
docker exec m5nr-build m5nr_master.sh -a build -v
docker exec m5nr-upload m5nr_master.sh -a upload -v -t
```
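The four stages depend on each other, so it can help to run them in order and stop as soon as one fails. The following is a minimal sketch under two assumptions that the README does not confirm: that `m5nr_master.sh` exits non-zero on failure, and that the containers are already running under the names used above. `full_round` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# full_round is a hypothetical helper, not part of myM5NR.
full_round() {
    # Each entry is "action:extra flags", mirroring the commands above.
    local stages=("download:" "parse:" "build:-v" "upload:-v -t")
    local entry action flags
    for entry in "${stages[@]}"; do
        action=${entry%%:*}
        flags=${entry#*:}
        echo "== ${action} ==" >&2
        # $flags is intentionally unquoted so "-v -t" splits into two arguments.
        docker exec "m5nr-${action}" m5nr_master.sh -a "$action" $flags || {
            echo "stage ${action} failed; stopping" >&2
            return 1
        }
    done
}
```

Run `full_round` on the Docker host. Progress and failure messages go to stderr, so the stages' own output can still be redirected separately.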
To load the build data into the Solr server, run the following on the same host:
```bash
docker exec m5nr-upload docker_setup.sh
docker exec m5nr-upload solr_load.sh -n -i -v -s
```
To load the build data into the Cassandra cluster, run the following:
```bash
docker exec m5nr-upload cassandra_load.py -n -i -v -t
```
To check the table sizes in Cassandra for the new M5NR build (the example uses the m5nr_v12 keyspace):
```bash
CQLSH="/usr/bin/cqlsh --request-timeout 600 --connect-timeout 600"
for T in $(docker exec cassandra-simple $CQLSH -e "USE m5nr_v12; describe tables;"); do echo "$T"; docker exec cassandra-simple $CQLSH -e "USE m5nr_v12; CONSISTENCY QUORUM; SELECT COUNT(*) FROM $T;"; done
```