{"id":24264604,"url":"https://github.com/wurmlab/npsearch","last_synced_at":"2025-09-24T01:30:57.163Z","repository":{"id":56885833,"uuid":"13729404","full_name":"wurmlab/NpSearch","owner":"wurmlab","description":"NpSearch: Search for Neuropeptides","archived":false,"fork":false,"pushed_at":"2017-01-26T02:14:35.000Z","size":307,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-08-30T05:49:30.421Z","etag":null,"topics":["cleavage-sites","fasta","neuropeptides","neuropeptides-precursor","neuroscience","ruby","signal-peptides"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wurmlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-10-21T00:10:29.000Z","updated_at":"2025-05-20T13:18:16.000Z","dependencies_parsed_at":"2022-08-20T13:50:51.520Z","dependency_job_id":null,"html_url":"https://github.com/wurmlab/NpSearch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wurmlab/NpSearch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wurmlab%2FNpSearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wurmlab%2FNpSearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wurmlab%2FNpSearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wurmlab%2FNpSearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wurmlab","download_url":"https://codeload.github.com/wurmlab/NpSearch/tar.
gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wurmlab%2FNpSearch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276678844,"owners_count":25684803,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-23T02:00:09.130Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cleavage-sites","fasta","neuropeptides","neuropeptides-precursor","neuroscience","ruby","signal-peptides"],"created_at":"2025-01-15T09:32:30.789Z","updated_at":"2025-09-24T01:30:56.871Z","avatar_url":"https://github.com/wurmlab.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NpSearch (NeuroPeptideSearch)\n[![Build Status](https://travis-ci.org/wurmlab/NpSearch.svg?branch=master)](https://travis-ci.org/wurmlab/NpSearch)\n[![Gem Version](https://badge.fury.io/rb/npsearch.svg)](http://badge.fury.io/rb/npsearch)\n[![Dependency Status](https://gemnasium.com/wurmlab/NpSearch.svg)](https://gemnasium.com/wurmlab/NpSearch)\n\n\u003cstrong\u003ePlease note this is currently in beta. We are currently working on something new that is amazingly fast (i.e. a few seconds to run) and a lot better in every sense (it even has an easy-to-use clicky, pointy interface). So watch this space.\u003c/strong\u003e\n\n## Introduction\nNpSearch is a tool that helps identify novel neuropeptides. 
As such, it does not rely on homology to existing neuropeptides; rather, NpSearch is based on the common characteristics of neuropeptides and their precursors. In other words, it is a feature-based tool.\n\nThe results include the entire secretome, ordered by the likelihood of each sequence encoding a neuropeptide. As such, it is expected that you only need to analyse the top half of the results.\n\nImportantly, NpSearch produces a highly visual HTML file in which the signal peptide and potential cleavage sites are highlighted. Additionally, NpSearch produces a FASTA file of the results (i.e. the ordered secretome) that can easily be used in your own pipelines.\n\nIf you use this program, please cite us:\n\n\u003e Moghul \u003cem\u003eet al.\u003c/em\u003e \u003cem\u003e(in prep)\u003c/em\u003e NpSearch: A Tool to Identify Novel Neuropeptides\n\nNpSearch takes a transcriptomic or predicted proteomic dataset as input. Each sequence is analysed and awarded a relative score reflecting its likelihood of encoding a neuropeptide precursor. When provided with transcriptomic data, NpSearch translates each contig in all six frames and then extracts all potential open reading frames (methionine to stop codon). Each predicted protein sequence is then analysed for the following neuropeptide-related characteristics:\n\n**Signal peptide**: All neuropeptide precursors must have a signal peptide. This is because the final bioactive neuropeptide has to be secreted from the cell in which it is synthesised in order to be functionally active.\n\n**Cleavage sites**: Because it is derived from a precursor, the bioactive neuropeptide has to be cleaved out of that precursor. Prohormone convertase enzymes cleave these bioactive peptides at specific cleavage sites. 
As certain cleavage motifs are more likely to be cleaved than others, NpSearch scores sequences based on the type and number of cleavage sites present.\n\n**C-terminal Glycine**: A significant number of bioactive neuropeptides have a C-terminal glycine that is amidated during post-translational modification. Such sequences are therefore awarded a higher score.\n\n**Repeated peptides**: Numerous neuropeptide precursors are made up of multiple copies of the same neuropeptide. NpSearch attempts to cluster all potentially cleaved neuropeptides and awards a higher score to sequences that produce larger clusters.\n\n**Acidic spacer regions**: Neuropeptide precursors that contain multiple neuropeptide copies tend to have highly acidic regions that separate these copies. If NpSearch detects such regions, the sequence is awarded a higher score.\n\n\nAfter analysing each sequence in the input dataset, NpSearch produces a visual HTML file and a FASTA file, in which sequences that are more likely to encode a neuropeptide precursor are placed at the top. These result files can then be easily inspected and curated by researchers.\n\n\n## Installation\n\n### Installation Requirements\n* Ruby (\u003e= 2.0.0)\n* SignalP 4.1.*z (available from [here](http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?signalp))\n* CD-HIT (available from [here](http://weizhongli-lab.org/cd-hit/) - suggested installation via [Homebrew](http://brew.sh) or [Linuxbrew](http://linuxbrew.sh) - `brew install homebrew/science/cd-hit`)\n* EMBOSS (available from [here](http://emboss.sourceforge.net) - suggested installation via [Homebrew](http://brew.sh) or [Linuxbrew](http://linuxbrew.sh) - `brew install homebrew/science/emboss`)\n\n\n## Installation\n\n\u003cstrong\u003eWhile in beta, it is suggested that you run NpSearch from source (i.e. 
the non-recommended method below)\u003c/strong\u003e\n\nSimply run the following command in the terminal.\n\n```bash\ngem install npsearch\n```\n\nIf that doesn't work, try `sudo gem install npsearch` instead.\n\n##### Running From Source (Not Recommended)\nIt is also possible to run from source. However, this is not recommended.\n\n```bash\n# Clone the repository.\ngit clone https://github.com/wurmlab/npsearch.git\n\n# Move into the NpSearch source directory\n# (git names the directory after the clone URL, hence lowercase).\ncd npsearch\n\n# Install bundler\ngem install bundler\n\n# Use bundler to install dependencies\nbundle install\n\n# Optional: run tests, build documentation and build the gem from source\nbundle exec rake\n\n# Run NpSearch.\nbundle exec npsearch -h\n# note that `bundle exec` executes NpSearch in the context of the bundle\n\n# Alternatively, install NpSearch as a gem\nbundle exec rake install\nnpsearch -h\n```\n\n\n\n\n## Usage\nVerify that NpSearch is installed by running the following command in the terminal:\n\n```bash\nnpsearch\n```\n\nYou should see the following output.\n\n```bash\n* Description: A tool to identify novel neuropeptides.\n\n* Usage: npsearch [Options] [Input File]\n\n* Options\n    -s path_to_signalp,              The full path to the signalp script. This can be downloaded from\n        --signalp_path                CBS. See https://www.github.com/wurmlab/NpSearch for more\n                                      information\n    -t, --temp_dir path_to_temp_dir  The full path to the temp dir. 
NpSearch will create the folder and\n                                      then delete the folder once it has finished using them.\n                                      Default: Hidden folder in the current working directory\n    -n, --num_threads num_of_threads The number of threads to use when analysing the input file\n    -d, --debug                      Run in debug mode\n    -l, --min_orf_length N           The minimum length of a potential neuropeptide precursor.\n                                      Default: 30\n    -m, --max_orf_length N           The maximum length of a potential neuropeptide precursor.\n                                      Default: 600\n    -h, --help                       Display this screen\n    -v, --version                    Shows version\n```\n\n\n### Example Usage Scenario\nThe following runs NpSearch on an input FASTA dataset.\n\n```bash\nnpsearch -s /path/to/signalp -n NUM_THREADS INPUT_FASTA_FILE\n```\n\n## Debugging\nHave an issue? No problem. Follow the steps below to produce a debugging log from NpSearch, then send it to me at ismail.moghul@gmail.com or raise an issue on GitHub.\n\n1. Uninstall and reinstall npsearch.\n\n```bash\ngem uninstall npsearch # Select all when it asks which versions to uninstall\ngem install npsearch\nnpsearch --version # you should see 2.1.4\n```\n\n2. Ensure all dependencies are installed.\n\n```bash\n# Check if cd-hit is installed\ncd-hit # you should see the cd-hit output\n# Check if `getorf` from the EMBOSS package is installed\ngetorf -version\n# you should see: 'EMBOSS: 6.6.0.0'\n```\n\n3. Rerun your analysis with the debug flag (also specify the temporary directory to be on the safe side).\n\n```bash\ncd /path/to/analysis/folder\nmkdir temp\nnpsearch -h # to double check whether npsearch works\nnpsearch -n 10 -s /path/to/signalp/script -d -t /path/to/temp/dir /path/to/Trinity.fasta \u003e debug.log\n```\n\n4. 
Raise an issue on GitHub or send me an email at ismail.moghul@gmail.com. Be sure to attach the debug.log that you have just created and fully explain the issue that you are seeing.\n\n## Note\n\n- With the current version of NpSearch, there is an issue with the number of threads used: it seems to use more threads than specified by the command line argument.\n- NpSearch is expected to produce a high system load (as shown in `top` / `htop`). This is because NpSearch runs SignalP as a separate process for each sequence (to speed things up). As such, the system load (which is the number of processes called per unit time) can be higher than expected. This is normally not a reason for concern. However, we will probably try to find a middle ground between speed and the number of processes called (or maybe someone could rewrite SignalP in C with multicore support)...","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwurmlab%2Fnpsearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwurmlab%2Fnpsearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwurmlab%2Fnpsearch/lists"}