https://github.com/ssi-dk/serum_readfilter
- Host: GitHub
- URL: https://github.com/ssi-dk/serum_readfilter
- Owner: ssi-dk
- License: gpl-3.0
- Created: 2017-11-28T14:05:26.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-02-19T15:30:39.000Z (over 6 years ago)
- Last Synced: 2023-03-02T02:46:32.995Z (over 1 year ago)
- Language: Python
- Size: 59.8 MB
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# serum_readfilter
serum_readfilter is a program designed to filter whole genome sequence data so that you keep only the reads that may touch your genes of interest. This is accomplished by building a database from a set of sequences you want to filter by (e.g. MLST schemes, resistance genes) and then filtering the raw reads against this database with Kraken (https://github.com/DerrickWood/kraken) or Kaiju (https://github.com/bioinformatics-centre/kaiju). By default, with Kraken this removes every read that does not have at least one k-mer match to the database, which can reduce your data set substantially while still retaining almost all reads that could potentially match the database. This is very useful when you want to map reads against a gene, or when you have a large data set and want to make it more manageable.
To use:
serum_readfilter makedb kraken -db -i
serum_readfilter runfilter kraken -db -R1 -R2
bioRxiv link:
https://www.biorxiv.org/content/early/2018/02/15/266080