Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/audy/cd-hit-that
- Host: GitHub
- URL: https://github.com/audy/cd-hit-that
- Owner: audy
- Created: 2011-02-12T02:10:53.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2011-08-02T20:33:51.000Z (about 13 years ago)
- Last Synced: 2023-03-11T01:20:52.326Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 137 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# CD-HIT That
- Run CD-HIT on a buncha files. Oh yeah, they're paired-end.
- Create a table of sequences per cluster per file
- Input data is paired-end, interleaved, FASTA reads which are joined together with their 5' ends touching and clustered.

# Usage:
- Reads go in `data/` and have to be in FASTA format
- Filenames need to have a number in them between a `_` and `.`. For example: `reads_blah_blah_033.fasta`. This is for the column header in the output table (it will be `33`). It goes without saying that this number should be unique.

To change parameters such as the similarity requirement, edit the Rakefile.
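
If you want to check your filenames before running anything, the naming rule above can be sketched in a few lines of Python. This is only an illustration, not the code the project uses: the regex and the function names `sample_number` and `check_unique` are made up for this example.

```python
import re

# Assumed pattern: the first run of digits sitting between a "_" and a "."
SAMPLE_NUMBER = re.compile(r"_(\d+)\.")


def sample_number(filename):
    """Return the number between a `_` and a `.` as an int (leading zeros drop)."""
    match = SAMPLE_NUMBER.search(filename)
    if match is None:
        raise ValueError(f"no number between '_' and '.' in {filename!r}")
    return int(match.group(1))


def check_unique(filenames):
    """Map each sample number to its file, failing if two files share a number."""
    seen = {}
    for name in filenames:
        number = sample_number(name)
        if number in seen:
            raise ValueError(f"{name!r} and {seen[number]!r} both map to {number}")
        seen[number] = name
    return seen
```

For example, `sample_number("reads_blah_blah_033.fasta")` gives `33`, which matches the column header described above.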
The table will be saved as `counts.txt`.
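
To give a rough idea of what a "sequences per cluster per file" table could look like, here is a small Python sketch. The layout is an assumption (tab-separated, one row per cluster, one column per file number); the real `counts.txt` may be laid out differently, and `format_counts` is a hypothetical helper, not part of this project.

```python
from collections import Counter


def format_counts(pairs):
    """Build a tab-separated table from (cluster_id, file_number) pairs.

    Each pair stands for one clustered sequence. Rows are clusters and
    columns are file numbers (assumed layout).
    """
    counts = Counter(pairs)
    clusters = sorted({cluster for cluster, _ in counts})
    samples = sorted({sample for _, sample in counts})

    lines = ["cluster\t" + "\t".join(str(s) for s in samples)]
    for cluster in clusters:
        # Counter returns 0 for (cluster, sample) combinations never seen.
        row = [str(counts[(cluster, s)]) for s in samples]
        lines.append(f"{cluster}\t" + "\t".join(row))
    return "\n".join(lines) + "\n"
```

Writing the result out would just be `open("counts.txt", "w").write(format_counts(pairs))`.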
Type `rake clean` to start over.
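
For the curious, the "joined together with their 5' ends touching" step mentioned at the top can be sketched like this in Python. Big assumptions here: consecutive records in the interleaved file are treated as mates, and the first mate is reverse-complemented so that its 5' end lands next to the second mate's 5' end. The README does not say whether the actual code reverses, reverse-complements, or does something else, so treat `read_fasta` and `join_pairs` as hypothetical helpers.

```python
# Complement table for the bases we expect to see (assumption: A, C, G, T, N only)
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Yield (header, sequence) records from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def join_pairs(records):
    """Join consecutive mates into one sequence with the 5' ends in the middle."""
    records = iter(records)
    # zip over the same iterator pairs record 1 with 2, 3 with 4, and so on.
    # A trailing unpaired record is silently dropped.
    for (name, first), (_, second) in zip(records, records):
        # Assumed interpretation: flip the first mate so its 5' end meets
        # the 5' end of the second mate at the junction.
        flipped = first.translate(COMPLEMENT)[::-1]
        yield name, flipped + second
```

The joined sequences would then be written back out as FASTA and handed to CD-HIT for clustering.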