{"id":15583132,"url":"https://github.com/audy/dna","last_synced_at":"2025-10-07T02:51:22.229Z","repository":{"id":2324488,"uuid":"3285254","full_name":"audy/dna","owner":"audy","description":"A biological sequence file (fasta, fastq, qseq) parser for Ruby","archived":false,"fork":false,"pushed_at":"2015-10-30T19:56:20.000Z","size":258,"stargazers_count":4,"open_issues_count":2,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-24T02:52:47.634Z","etag":null,"topics":["bioinformatics","dna","parser","ruby"],"latest_commit_sha":null,"homepage":"audy.github.com/dna","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/audy.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-01-27T19:00:28.000Z","updated_at":"2020-11-28T21:02:21.000Z","dependencies_parsed_at":"2022-09-05T14:50:44.146Z","dependency_job_id":null,"html_url":"https://github.com/audy/dna","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy%2Fdna","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy%2Fdna/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy%2Fdna/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/audy%2Fdna/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/audy","download_url":"https://codeload.github.com/audy/dna/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250552037,"owners_count":21449162,"ic
on_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","dna","parser","ruby"],"created_at":"2024-10-02T20:05:02.421Z","updated_at":"2025-10-07T02:51:17.201Z","avatar_url":"https://github.com/audy.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DNA [![Gem Version](https://badge.fury.io/rb/dna.png)](http://badge.fury.io/rb/dna) [![Build Status](https://secure.travis-ci.org/audy/dna.png?branch=master)](http://travis-ci.org/audy/dna) [![Coverage Status](https://coveralls.io/repos/audy/dna/badge.png)](https://coveralls.io/r/audy/dna)\nA biological sequence file parser for Ruby\n\nAustin G. 
Davis-Richardson\n\n## Features\n\n  - Supported Formats ([submit a format request](https://github.com/audy/dna/issues/new?title=request%20for%20new%20format))\n    - [fasta](http://en.wikipedia.org/wiki/FASTA)\n    - [fastq](http://en.wikipedia.org/wiki/Fastq)\n    - [qseq](http://blog.kokocinski.net/index.php/qseq-files-format?blog=2)\n  - Automatic format detection\n  - Lazy iteration\n\n## Installation\n\nTested on Ruby 1.9.3 and 2.0.0\n\n```\n$ (sudo) gem install dna\n```\n\n## Usage\n\n```ruby\n\nrequire 'dna'\n\n# Automatic Format Detection\n\nFile.open('sequences.fasta') do |handle|\n  records = Dna.new handle\n\n  records.each do |record|\n    puts record.length\n  end\nend\n\nFile.open('sequences.fastq') do |handle|\n  records = Dna.new handle\n\n  records.each do |record|\n    puts record.quality\n  end\nend\n\nFile.open('sequences.qseq') do |handle|\n  records = Dna.new handle\n  puts records.first.inspect\nend\n\n# **caveat:** If you are reading from a compressed file\n# or `stdin` you MUST specify the sequence format:\n\nrequire 'zlib'\n\nZlib::GzipReader.open('sequences.fasta.gz') do |handle|\n  records = Dna.new handle, :format =\u003e :fasta\n\n  records.each do |record|\n    puts record.length\n  end\nend\n```\n\n## Support for PHRED score parsing\n\n```ruby\n\n# Illumina \u003e 1.3\n\nrecord.illumina_qualities # =\u003e [31, ..., 37]\n\n# Error probabilities\n\nrecord.illumina_probabilities\n# =\u003e [1.0, 0.7943282347242815, ..., 0.3981071705534972]\n\n# Solexa + Illumina \u003c= 1.3\n\nrecord.solexa_qualities\nrecord.solexa_probabilities\n\n# Sanger\n\nrecord.sanger_qualities\nrecord.sanger_probabilities\n\n```\n\n## Bonus Feature\n\nThe DNA gem is also a command-line tool with grep-like capabilities. 
Print records with (Ruby) regexp match in header.\n\n```\n$ dna spec/data/input.fastq \"[1-2]\"\n\n@1\nTGAAACTTATTGATCACCCCGCTTGGCGTTGGGGAGAAATTCAGAAAAGAGTGCTTGATGGGGCGCCACATGCCGTGCAACCCACTCTCTTTCACGCAGCGCGCCCCA\n+1\n5888.6778888650/-//\u0026,(,./*-11'//0\u0026,-0.(.,,,,/2/\u0026-,,,,,.(.,(,..\u0026---\u0026-,,,((*-----*+.\u0026,,,,,(//\u0026,,,-(,,+(,,,--\u0026(\n@2\nGTCGCGGCTTACCACCCAACGATTTTTTTTAGAGGTGCTGGTTTCA\n+2\n2550//*-1./4.--/'+.2.,,,,,,,,\u0026(/00.11426554+13\n\n$ dna spec/data/test.fasta \"\\d\"\n\n\u003e1\nGAGAGATCTCATGACACAGCCGAAG\n\u003e2\nGAGACAUAUCCNNNAA\n\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faudy%2Fdna","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faudy%2Fdna","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faudy%2Fdna/lists"}