{"id":13710106,"url":"https://github.com/jts/nanopolish","last_synced_at":"2026-01-23T05:26:39.679Z","repository":{"id":24732875,"uuid":"28145157","full_name":"jts/nanopolish","owner":"jts","description":"Signal-level algorithms for MinION data","archived":false,"fork":false,"pushed_at":"2023-08-05T15:27:20.000Z","size":61030,"stargazers_count":592,"open_issues_count":94,"forks_count":161,"subscribers_count":35,"default_branch":"master","last_synced_at":"2025-12-06T20:22:43.949Z","etag":null,"topics":["bioinformatics","c-plus-plus","epigenetics","genome-assembly","methylation","science"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jts.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2014-12-17T16:23:00.000Z","updated_at":"2025-12-05T15:21:00.000Z","dependencies_parsed_at":"2023-10-20T22:00:58.013Z","dependency_job_id":null,"html_url":"https://github.com/jts/nanopolish","commit_stats":null,"previous_names":[],"tags_count":40,"template":false,"template_full_name":null,"purl":"pkg:github/jts/nanopolish","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jts%2Fnanopolish","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jts%2Fnanopolish/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jts%2Fnanopolish/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jts%2Fnanopolish/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jts","download_url":"https://codeload.github.com/jts/nanopolish/tar.gz/refs/heads/master","sbom_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jts%2Fnanopolish/sbom","scorecard":{"id":540222,"data":{"date":"2025-08-11","repo":{"name":"github.com/jts/nanopolish","commit":"28e774088b4a780b86066571896348e790f4113c"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.8,"checks":[{"name":"Code-Review","score":2,"reason":"Found 4/20 approved changesets -- score normalized to 2","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":1,"reason":"0 commit(s) and 2 issue activity found in the last 90 days -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: 
.github/workflows/nanopolish.yaml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a 
license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/nanopolish.yaml:9: update your workflow using https://app.stepsecurity.io/secureworkflow/jts/nanopolish/nanopolish.yaml/master?enable=pin","Warn: containerImage not pinned by hash: Dockerfile:1: pin your Docker image by updating centos:7 to centos:7@sha256:be65f488b7764ad3638f236b7b515b3678369a5124c47b8d32916d6487418ea4","Warn: containerImage not pinned by hash: Dockerfile-arm:1: pin your Docker image by updating multiarch/ubuntu-debootstrap:arm64-bionic to multiarch/ubuntu-debootstrap:arm64-bionic@sha256:712cb6b22aa7d48ab57f890fa3bf24cd05e71e04540f4226a5a634b7ef2c4181","Info:   0 out of   1 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 containerImage dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Vulnerabilities","score":0,"reason":"14 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2021-856 / GHSA-5545-2q6w-2gh6","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: PYSEC-2021-857 / GHSA-f7c7-j99h-c22f","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: PYSEC-2020-107 / GHSA-jjw5-xxj6-pcv5","Warn: Project is vulnerable to: PYSEC-2024-110 / GHSA-jw8x-6495-233v","Warn: Project is vulnerable to: PYSEC-2020-108","Warn: Project is vulnerable to: PYSEC-2023-102","Warn: Project is vulnerable to: PYSEC-2023-114","Warn: Project is vulnerable to: PYSEC-2025-49 / GHSA-5rjg-fvgr-3xxf","Warn: Project is vulnerable to: GHSA-cx63-2mw6-8hw5","Warn: Project is vulnerable to: PYSEC-2022-43012 / GHSA-r9hx-vwmv-q579","Warn: Project is vulnerable to: GHSA-g7vv-2v7x-gj9p"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 14 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-20T08:02:07.563Z","repository_id":24732875,"created_at":"2025-08-20T08:02:07.563Z","updated_at":"2025-08-20T08:02:07.563Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28680692,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-23T04:33:33.518Z","status":"ssl_error","status_checked_at":"2026-01-23T04:33:30.433Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","c-plus-plus","epigenetics","genome-assembly","methylation","science"],"created_at":"2024-08-02T23:00:51.906Z","updated_at":"2026-01-23T05:26:39.659Z","avatar_url":"https://github.com/jts.png","language":"C++","funding_links":[],"categories":["Software packages"],"sub_categories":["Poly(A) tail length estimation"],"readme":"# Nanopolish\n\n![build and test](https://github.com/jts/nanopolish/actions/workflows/nanopolish.yaml/badge.svg)\n\nSoftware package for signal-level analysis of Oxford Nanopore sequencing data. 
Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more (see Nanopolish modules, below).\n\n\n## A note on R10 support\n\nPresently nanopolish does not support R10.4 flowcells as variant and methylation calling is accurate enough to not require signal-level analysis. We intend to support signal exploration through `eventalign` but do not currently have a timeline for this as our development time is currently dedicated to other projects.\n\n## Release notes\n* 0.14.1: added the `compare_methylation.py` script from the [methylation example data bundle](warwick.s3.climb.ac.uk/nanopolish_tutorial/methylation_example.tar.gz) to the `nanopolish` package\n\n* 0.14.0: support modification bam files, compile on M1 apple hardware, support [SLOW5](https://github.com/hasindu2008/slow5lib) files\n\n* 0.13.3: fix conda build issues, better handling of VBZ-compressed files, integration of module for [nano-COP](https://www.nature.com/articles/s41596-020-00469-y)\n\n* 0.13.2: fix memory leak when loading signal data\n\n* 0.13.1: fix `nanopolish index` performance issue for some barcoding runs\n\n* 0.13.0: modify HMM transitions to allow the balance between insertions and deletions to be changed depending on mode (consensus vs reference variants)\n\n* 0.12.5: make SupportFractionByStrand calculation consistent with SupportFraction\n\n* 0.12.4: add SupportFractionByStrand and SOR to VCF\n\n* 0.12.3: fix hdf5 file handle leak\n\n* 0.12.2: add RefContext info to VCF output of `nanopolish variants`\n\n* 0.12.1: improve how `nanopolish index` handles summary files, add support for selecting reads by BAM read group tags (for `nanopolish variants`)\n\n* 0.12.0: live methylation calling, methylation LLR threshold changes as described [here](http://simpsonlab.github.io/2020/03/03/nanopolish-v0.12.0/)\n\n* 0.11.1: `nanopolish polya` now supports SQK-RNA-002 kits with 
automatic backwards-compatibility with SQK-RNA-001\n\n* 0.11.0: support for multi-fast5 files. `nanopolish methyltrain` now subsamples input data, improving speed and memory usage\n\n* 0.10.2: added new program `nanopolish polya` to estimate the length of poly-A tails on direct RNA reads (by @paultsw)\n\n* 0.10.1: `nanopolish variants --consensus` now only outputs a VCF file instead of a fasta sequence. The VCF file describes the changes that need to be made to turn the draft sequence into the polished assembly. A new program, `nanopolish vcf2fasta`, is provided to generate the polished genome (this replaces `nanopolish_merge.py`, see usage instructions below). This change is to avoid issues when merging segments that end on repeat boundaries (reported by Michael Wykes and Chris Wright).\n\n## Dependencies\n\nA compiler that supports C++11 is needed to build nanopolish. Development of the code is performed using [gcc-4.8](https://gcc.gnu.org/gcc-4.8/).\n\nBy default, nanopolish will download and compile all of its required dependencies. Some users however may want to use system-wide versions of the libraries. To turn off the automatic installation of dependencies set `HDF5=noinstall`, `EIGEN=noinstall`, `HTS=noinstall` or `MINIMAP2=noinstall` parameters when running `make` as appropriate. 
The current versions and compile options for the dependencies are:\n\n* [libhdf5-1.8.14](http://www.hdfgroup.org/HDF5/release/obtain5.html) compiled with multi-threading support `--enable-threadsafe`\n* [eigen-3.2.5](http://eigen.tuxfamily.org)\n* [htslib-1.15.1](http://github.com/samtools/htslib)\n* [minimap2-fe35e67](http://github.com/lh3/minimap2)\n* [slow5lib-3680e17](https://github.com/hasindu2008/slow5lib)\n\nIn order to use the additional python3 scripts within `/scripts`, install the dependencies via\n\n```\npip install -r scripts/requirements.txt --user\n```\n\n\n## Installation instructions\n\n### Installing the latest code from github (recommended)\n\nYou can download and compile the latest code from github as follows:\n\n```\ngit clone --recursive https://github.com/jts/nanopolish.git\ncd nanopolish\nmake\n```\n\n### Installing a particular release\n\nWhen major features have been added or bugs fixed, we will tag and release a new version of nanopolish. If you wish to use a particular version, you can checkout the tagged version before compiling:\n\n```\ngit clone --recursive https://github.com/jts/nanopolish.git\ncd nanopolish\ngit checkout v0.9.2\nmake\n```\n\n## Nanopolish modules\n\nThe main subprograms of nanopolish are:\n\n```\nnanopolish call-methylation: predict genomic bases that may be methylated\nnanopolish variants: detect SNPs and indels with respect to a reference genome\nnanopolish variants --consensus: calculate an improved consensus sequence for a draft genome assembly\nnanopolish eventalign: align signal-level events to k-mers of a reference genome\n```\n\n## Analysis workflow examples\n\n### Data preprocessing\n\nNanopolish needs access to the signal-level data measured by the nanopore sequencer. The first step of any nanopolish workflow is to prepare the input data by telling nanopolish where to find the signal files. 
If you ran Albacore 2.0 on your data you should run `nanopolish index` on your input reads (`-d` can be specified more than once if using multiple runs):\n\n```\n# Index the output of the basecaller\nnanopolish index -d /path/to/raw_fast5s/ -s sequencing_summary.txt basecalled_output.fastq # for FAST5 input\nnanopolish index basecalled_output.fastq --slow5 signals.blow5 # for SLOW5 input\n```\n\nThe `-s` option tells nanopolish to read the `sequencing_summary.txt` file from Albacore to speed up indexing. Without this option `nanopolish index` is extremely slow as it needs to read every fast5 file individually. If you basecalled your run in parallel, so you have multiple `sequencing_summary.txt` files, you can use the `-f` option to pass in a file containing the paths to the sequencing summary files (one per line). When using SLOW5 files as the input (FAST5 can be converted to SLOW5 using [slow5tools](https://github.com/hasindu2008/slow5tools)), the `-s` option is not required and does not affect indexing performance.\n\n### Computing a new consensus sequence for a draft assembly\n\nThe original purpose of nanopolish was to compute an improved consensus sequence for a draft genome assembly produced by a long-read assembler like [canu](https://github.com/marbl/canu). This section describes how to do this, starting with your draft assembly, which should have megabase-sized contigs. We've also posted a tutorial including example data [here](http://nanopolish.readthedocs.io/en/latest/quickstart_consensus.html).\n\n```\n# Index the draft genome\nbwa index draft.fa\n\n# Align the basecalled reads to the draft sequence\nbwa mem -x ont2d -t 8 draft.fa reads.fa | samtools sort -o reads.sorted.bam -T reads.tmp -\nsamtools index reads.sorted.bam\n```\n\nNow, we use nanopolish to compute the consensus sequence (the genome is polished in 50kb blocks and there will be one output file per block). 
We'll run this in parallel:\n\n```\npython3 nanopolish_makerange.py draft.fa | parallel --results nanopolish.results -P 8 \\\n    nanopolish variants --consensus -o polished.{1}.vcf -w {1} -r reads.fa -b reads.sorted.bam -g draft.fa -t 4 --min-candidate-frequency 0.1\n```\n\nThis command will run the consensus algorithm on eight 50kbp segments of the genome at a time, using 4 threads each. Change the ```-P``` and ```--threads``` options as appropriate for the machines you have available.\n\nAfter all polishing jobs are complete, you can merge the individual 50kb segments together back into the final assembly:\n\n```\nnanopolish vcf2fasta -g draft.fa polished.*.vcf \u003e polished_genome.fa\n```\n\n## Calling Methylation\n\nnanopolish can use the signal-level information measured by the sequencer to detect 5-mC as described [here](https://www.nature.com/articles/nmeth.4184). We've posted a tutorial on how to call methylation [here](http://nanopolish.readthedocs.io/en/latest/quickstart_call_methylation.html).\n\n## To run using docker\n\nFirst build the image from the dockerfile:\n```\ndocker build .\n```\nNote the uuid given upon successful build.\nThen you can run nanopolish from the image:\n```\ndocker run -v /path/to/local/data/data/:/data/ -it :image_id  ./nanopolish eventalign -r /data/reads.fa -b /data/alignments.sorted.bam -g /data/ref.fa\n```\n\n## Credits and Thanks\n\nThe fast table-driven logsum implementation was provided by Sean Eddy as public domain code. This code was originally part of [hmmer3](http://hmmer.janelia.org/). Nanopolish also includes code from Oxford Nanopore's [scrappie](https://github.com/nanoporetech/scrappie) basecaller. 
This code is licensed under the MPL.\n\nThe `scripts/compare_methylation.py` was originally provided in the [example methylation data bundle](warwick.s3.climb.ac.uk/nanopolish_tutorial/methylation_example.tar.gz) which was obtained using:\n```\ncurl -O warwick.s3.climb.ac.uk/nanopolish_tutorial/methylation_example.tar.gz\ntar xvfz methylation_example.tar.gz\nls methylation_example/compare_methylation.py\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjts%2Fnanopolish","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjts%2Fnanopolish","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjts%2Fnanopolish/lists"}