https://github.com/hexylena/gene2accession2kyotocabinet
Converts gene2accession file provided by NCBI into a fast kyoto-cabinet database
- Host: GitHub
- URL: https://github.com/hexylena/gene2accession2kyotocabinet
- Owner: hexylena
- License: gpl-3.0
- Created: 2015-12-21T19:43:44.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2015-12-21T19:46:21.000Z (almost 9 years ago)
- Last Synced: 2024-10-17T13:58:20.845Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 13.7 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# gene2accession2kyotocabinet
This repository contains a simple Makefile and Python script for converting
NCBI's gene2accession table into an extremely fast Kyoto Cabinet database for
searching.

## Examples
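The conversion idea can be sketched in a few lines of Python. This is a minimal illustration, not the repository's script: it uses the standard-library `dbm.dumb` module as a stand-in for Kyoto Cabinet so that it runs without extra dependencies, and it assumes the reduced TSV has exactly two tab-separated columns (protein GI, then genome GI) with `-` marking a missing value. The function names `build_index` and `lookup` and the sample file names are hypothetical.

```python
import dbm.dumb
import os
import tempfile


def build_index(tsv_path, db_path):
    """Load a two-column TSV (key<TAB>value) into an on-disk key-value store.

    Assumption: column 1 is the protein GI and column 2 is the genome GI.
    dbm.dumb stands in for Kyoto Cabinet here; it is much slower at scale.
    """
    count = 0
    with open(tsv_path, encoding="utf-8") as handle, \
            dbm.dumb.open(db_path, "n") as db:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip malformed rows and rows with a missing ("-" or empty) value.
            if len(fields) != 2 or "-" in fields or "" in fields:
                continue
            key, value = fields
            db[key] = value
            count += 1
    return count


def lookup(db_path, key):
    """Return the stored value for key as a string, or None if absent."""
    with dbm.dumb.open(db_path, "r") as db:
        value = db.get(key)
    return value.decode("utf-8") if value is not None else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tsv = os.path.join(tmp, "sample.min.tsv")
        with open(tsv, "w", encoding="utf-8") as out:
            out.write("3282737\t3282736\n3282739\t3282736\n10954455\t10954454\n")
        db_path = os.path.join(tmp, "sample.db")
        build_index(tsv, db_path)
        print("3282737 =>", lookup(db_path, "3282737"))
```

With a hash-backed store, each lookup is a single key fetch rather than a scan of the multi-gigabyte table, which is what makes the query step fast.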
```console
$ make
# This step takes a long time. First it extracts the protein + genome GIs into
# the gene2accession.min.tsv file. Then they're converted into a kyoto-cabinet
# database (gene2accession.kch)

$ du -h gene2accession.*
845M gene2accession.kch
517M gene2accession.min.tsv
4.8G gene2accession.tsv

# At the end, our database is lightning fast:
$ (env)user@host: time python lookup-test.py
3282737 => 3282736
10954455 => 10954454
3282739 => 3282736
10954457 => 10954454
3282740 => 3282736
219730360 => 219730359
219783138 => 219783137
257293830 => 257293829
257287470 => 257287469
10954458 => 10954454
3282741 => 3282736

real 0m0.015s
user 0m0.010s
sys 0m0.005s
```

## License
GPLv3