{"id":13724555,"url":"https://github.com/gnames/gnverifier","last_synced_at":"2026-03-16T14:14:19.925Z","repository":{"id":41168418,"uuid":"297323648","full_name":"gnames/gnverifier","owner":"gnames","description":"GNverifier verifies scientific names against more than 100 biodiversity databases","archived":false,"fork":false,"pushed_at":"2024-04-13T12:29:25.000Z","size":3020,"stargazers_count":19,"open_issues_count":9,"forks_count":1,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-10-29T22:31:35.537Z","etag":null,"topics":["biodiversity","bioinformatics","go","golang","reconciliation","resolution","scientific-names","verification"],"latest_commit_sha":null,"homepage":"https://verifier.globalnames.org","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gnames.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-09-21T11:49:47.000Z","updated_at":"2024-09-14T18:08:03.000Z","dependencies_parsed_at":"2023-12-17T20:06:58.932Z","dependency_job_id":"3497d593-ad5e-4d9d-8013-ca56e247cf43","html_url":"https://github.com/gnames/gnverifier","commit_stats":{"total_commits":146,"total_committers":1,"mean_commits":146.0,"dds":0.0,"last_synced_commit":"c6ae072dbe7d891f4acfa35414d189ac1124bc46"},"previous_names":[],"tags_count":53,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fgnverifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fgnverifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/gnames%2Fgnverifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fgnverifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gnames","download_url":"https://codeload.github.com/gnames/gnverifier/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224552317,"owners_count":17330241,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["biodiversity","bioinformatics","go","golang","reconciliation","resolution","scientific-names","verification"],"created_at":"2024-08-03T01:01:59.248Z","updated_at":"2026-02-01T09:15:58.474Z","avatar_url":"https://github.com/gnames.png","language":"Go","funding_links":[],"categories":["Biosphere"],"sub_categories":["Biodiversity Data Cleaning and Standardization"],"readme":"# Global Names Verifier\n\n[![DOI](https://zenodo.org/badge/297323648.svg)](https://doi.org/10.5281/zenodo.5111542)\n\nTry `GNverifier` [online][web-service].\n\n[GNverifier with OpenRefine]\n\n[GNverifier API]\n\n[Feedback]\n\n`GNverifier` validates scientific names by checking them against a variety of biodiversity [Data Sources][data_source_ids]. It accepts individual names or batch lists and returns verification results showing whether names are found, their taxonomic status, and matching records. 
An [advanced search feature](#advanced-search-query-language) enables complex queries with filters for authorship, year, and other criteria.\n\nResults are returned in JSON, CSV, TSV, or HTML (web interface only) formats.\n\n## Understanding Verification Results\n\n**BestResult**: The highest-scoring match found for a name. In most cases, this single result provides the most reliable verification outcome. However, when multiple equally reliable matches exist, see **BestResults**.\n\n**BestResults**: A list containing multiple equally high-scoring matches. This field is populated only when verification finds multiple equally reliable results (e.g., hemihomonyms, which are identical names under different nomenclatural codes, or ambiguous synonyms). This field remains empty when there is only one best match.\n\n**AllResults**: The complete set of all matches found across all data sources. This can be extensive and is only returned when explicitly requested via the `--all_matches` CLI flag or the corresponding web UI setting.\n\n## Vernacular (Common) Names\n\nTo retrieve vernacular names alongside scientific names, use iNaturalist (data source ID `180`) and specify desired languages via the `--vernaculars` CLI option or web UI setting. 
Provide languages as comma-separated ISO 639-3 three-letter codes (e.g., `eng,spa,fra` for English, Spanish, and French).\n\n\n\u003c!-- vim-markdown-toc GFM --\u003e\n\n* [Citing](#citing)\n* [Features](#features)\n* [Installation](#installation)\n  * [Using Homebrew on Mac OS X, Linux, and Linux on Windows ([WSL2])](#using-homebrew-on-mac-os-x-linux-and-linux-on-windows-wsl2)\n  * [MS Windows](#ms-windows)\n  * [Linux and Mac (without Homebrew)](#linux-and-mac-without-homebrew)\n  * [Compile from source](#compile-from-source)\n* [Usage](#usage)\n  * [As a web service](#as-a-web-service)\n  * [As a RESTful API](#as-a-restful-api)\n  * [One name-string](#one-name-string)\n  * [Many name-strings in a file](#many-name-strings-in-a-file)\n  * [Advanced search](#advanced-search)\n  * [Options and flags](#options-and-flags)\n    * [help](#help)\n    * [version](#version)\n    * [port](#port)\n    * [all_matches](#all_matches)\n    * [capitalize](#capitalize)\n    * [species group](#species-group)\n    * [relaxed fuzzy-match](#relaxed-fuzzy-match)\n    * [fuzzy-match of uninomial names](#fuzzy-match-of-uninomial-names)\n    * [vernaculars](#vernaculars)\n    * [format](#format)\n    * [jobs](#jobs)\n    * [quiet](#quiet)\n    * [sources](#sources)\n  * [Configuration file](#configuration-file)\n  * [Advanced Search Query Language](#advanced-search-query-language)\n    * [Examples of searches](#examples-of-searches)\n* [Copyright](#copyright)\n\n\u003c!-- vim-markdown-toc --\u003e\n\n## Citing\n\nIf you want to cite GNverifier, use [DOI generated by Zenodo][zenodo doi]:\n\n## Features\n\n- Small and fast app to verify scientific names against many biodiversity\n  databases. 
The app is a client to a [verifier API].\n- It provides different match levels:\n  - **Exact**: complete match with a canonical form or a full name-string\n    from a data source.\n  - **Fuzzy**: if exact match did not happen, it tries to match name-strings\n    assuming spelling errors.\n  - **FuzzyRelaxed**: if exact match did not happen, it tries to match\n    name-strings using 'relaxed' fuzzy-matching rules.\n  - **Partial**: strips middle or last epithets from bi- or multi-nomial names\n    and tries to match what is left.\n  - **PartialFuzzy**: the same as Partial but assuming spelling mistakes.\n  - **PartialFuzzyRelaxed**: the same as PartialFuzzy but with relaxed\n    fuzzy-matching rules.\n  - **Virus**: verification of virus names.\n  - **FacetedSearch**: marks [advanced-search](#advanced-search) queries.\n- Fuzzy matching that tries to balance the number of false positives and false\n  negatives (more information in [fuzzy-matching]).\n- Taxonomic resolution. If a database contains taxonomic information, it\n  returns the currently accepted name for the provided name-string.\n- Best match is returned according to the match score. Data sources with some\n  manual curation have priority over auto-curated and uncurated datasets. For\n  example [Catalogue of Life] or [WoRMS] are considered curated,\n  [GBIF] auto-curated, [uBio] not curated.\n- Fine-tuning the match score by matching authors, years, ranks etc.\n- It is possible to map any name-strings checklist to any of the registered\n  Data Sources.\n- If a Data Source provides a classification for a name, it will be returned in\n  the output.\n- The app works for checking just one name-string, or multiple ones written in\n  a file.\n- [Advanced search](#advanced-search) uses a simple but powerful\n  [query language](#advanced-search-query-language)\n  to find abbreviated names, search by author, year etc.\n- Supports feeding data via pipes of an operating system. 
This feature allows\n  chaining the program together with other tools.\n- [GNverifier] includes a web-based graphical user interface identical to its\n  \"official\" [web-service].\n\n## Installation\n\n### Using Homebrew on Mac OS X, Linux, and Linux on Windows ([WSL2])\n\nHomebrew is a popular package manager for Open Source software originally\ndeveloped for Mac OS X. Now it is also available on Linux, and can easily\nbe used on Windows 10 or 11, if Windows Subsystem for Linux (WSL) is\n[installed][wsl install].\n\nTo use [GNverifier] with Homebrew:\n\n1. Install [Homebrew].\n\n2. Open a terminal and run the following commands:\n\n```bash\nbrew tap gnames/gn\nbrew install gnverifier\n```\n\n### MS Windows\n\nDownload the [latest release] from GitHub and unzip it.\n\nOne possible way would be to create a default folder for executables and place\n`GNverifier` there.\n\nUse the `Windows+R` key\ncombination and type \"`cmd`\". In the terminal window that appears, type:\n\n```cmd\nmkdir C:\\Users\\your_username\\bin\ncopy path_to\\gnverifier.exe C:\\Users\\your_username\\bin\n```\n\n[Add the `C:\\Users\\your_username\\bin` directory to your `PATH`][winpath] `user`\nand/or `system` environment variable.\n\nAnother, simpler way would be to use the `cd C:\\Users\\your_username\\bin` command\nin the `cmd` terminal window. 
The [GNverifier] program will then be found\nautomatically by the Windows operating system when you run its commands from that\ndirectory.\n\nYou can also read a more detailed guide for Windows users in\n[a PDF document][win-pdf].\n\n### Linux and Mac (without Homebrew)\n\nIf [Homebrew] is not installed, download the [latest release] from GitHub,\nuntar it, and install the binary somewhere in your path.\n\n```bash\ntar xvf gnverifier-linux-0.1.0.tar.xz\n# or tar xvf gnverifier-mac-0.1.0.tar.gz\nsudo mv gnverifier /usr/local/bin\n```\n\n### Compile from source\n\nInstall Go according to the [installation instructions][go-install].\n\n```bash\ngo get github.com/gnames/gnverifier/gnverifier\n```\n\n## Usage\n\n[GNverifier] takes one name-string or a text file with one name-string per\nline as an argument, sends a query with these data to a [remote GNames\nserver][gnames] to match the name-strings against many biodiversity\ndatabases and returns results to STDOUT either in JSON, CSV or TSV format.\n\nThe app can also take a query string like\n`g:M. sp:galloprovincialis au:Olivier` to perform advanced searching,\nif the full scientific name is undetermined.\n\n### As a web service\n\n```bash\ngnverifier -p 8080\n```\n\nAfter running this command, you should be able to access the web-based user\ninterface via a browser at `http://localhost:8080`.\n\n### As a RESTful API\n\nRefer to the [RESTful API docs][gnames] to learn how to use the same\nfunctionality via scripts.\n\n### One name-string\n\n```bash\ngnverifier \"Monohamus galloprovincialis\"\n```\n\n### Many name-strings in a file\n\n```bash\ngnverifier /path/to/names.txt\n```\n\nThe app assumes that a file contains a simple list of names, one per line.\n\nIt is also possible to feed data via STDIN:\n\n```bash\ncat /path/to/names.txt | gnverifier\n```\n\n### Advanced search\n\nAdvanced search uses a simple but powerful query language to find names\nby abbreviated genus, a year or a range of years. 
See the detailed description\nin the [Advanced Search Query Language](#advanced-search-query-language) section.\n\n```bash\ngnverifier \"g:B. sp:bubo au:Linn. y:1700-\"\n```\n\n### Options and flags\n\nFlags and options can be given either before or\nafter the name-string or file name.\n\n#### help\n\n```bash\ngnverifier -h\n# or\ngnverifier --help\n# or\ngnverifier\n```\n\n#### version\n\n```bash\ngnverifier -V\n# or\ngnverifier --version\n```\n\n#### port\n\nStarts GNverifier as a web service using the given port.\n\n```bash\ngnverifier -p 8080\n```\n\nThis command will run a user interface accessible by a browser\nat `http://localhost:8080`.\n\n#### all_matches\n\nTo see all matches instead of only the best one, use the `--all_matches` flag.\n\nWARNING: for some names the result will be excessively large.\n\n```bash\ngnverifier -s '1,12' -M file.txt\ngnverifier --all_matches \"Pardosa moesta\"\n```\n\nThis flag is ignored by advanced search.\n\n#### capitalize\n\nIf your names do not have uninomials or genera capitalized according to\nthe rules of nomenclature, you can still verify them using this option. If\nthe `capitalize` flag is set, the first character of every name-string will be\ncapitalized (when appropriate). This flag is ignored by advanced search.\n\n```bash\ngnverifier -c \"bubo bubo\"\n# or\ngnverifier --capitalize \"bubo bubo\"\n```\n\n#### species group\n\nIf the `species_group` flag is on, a search of `Aus bus` would also search for\n`Aus bus bus` and vice versa. This flag expands the search to the species group of\na name if applicable. That is, the search also includes botanical autonyms and\ncoordinated names in zoology.\n\n```bash\ngnverifier -G \"Bubo bubo\"\ngnverifier --species_group \"Bubo bubo\"\n```\n\n#### relaxed fuzzy-match\n\nRelaxes fuzzy-matching rules, allowing fuzzy match for words of any size, and\nincreasing maximum edit distance (for stems) to two. This creates many more\nfalse positives, but increases recall. 
It is recommended to check results by\nhand if this feature is enabled. The maximum number of names allowed when this\noption is enabled is 50.\n\n```bash\ngnverifier -R \"Bbo bbo\"\ngnverifier --fuzzy_relaxed \"Bbo bbo\"\n```\n\n#### fuzzy-match of uninomial names\n\nWhen the `fuzzy_uninomial` flag is on, uninomials are allowed to go through\nfuzzy matching, if needed. Normally this flag is off because fuzzy-matched\nuninomials create a significant number of false positives.\n\n```bash\ngnverifier -U \"Pomatmus\"\ngnverifier --fuzzy_uninomial \"Pomatmus\"\n```\n\n#### vernaculars\n\nSets languages for augmenting search results with vernacular names from the\nrequested data sources. Try it with iNaturalist (id 180).\n\nThe languages have to be given as 3-letter ISO 639-3 codes, separated by\ncommas (e.g., eng,deu,rus,fra). If 'all' is given instead, vernacular\nnames from all languages will be returned.\n\nIf this option is enabled, the input will be limited to 50 scientific names.\n\n```bash\ngnverifier -r eng,fra -s 180 \"Bubo bubo\"\ngnverifier --vernaculars=all -s 180 \"Bubo bubo\"\n```\n\n#### format\n\nSets the output format. Supported formats are:\n\n- compact: one-liner JSON.\n- pretty: prettified JSON with new lines and tabs for easier reading.\n- tsv: returns tab-separated values representation.\n- csv: (DEFAULT) returns comma-separated values representation.\n\n```bash\n# short form for compact JSON format\ngnverifier -f compact file.txt\n# or long form for \"pretty\" JSON format\ngnverifier --format=\"pretty\" file.csv\n# tsv format\ngnverifier -f tsv file.csv\n```\n\nNote that a separate JSON \"document\" is returned for each separate record,\ninstead of returning one big JSON document for all records. For large lists it\nsignificantly speeds up parsing of the JSON on the user side.\n\n#### jobs\n\nIf the list of names is very large, it is possible to tell [GNverifier] to\nrun requests in parallel. 
In this example GNverifier will run 8 processes\nsimultaneously. The order of returned names will be somewhat randomized.\n\n```bash\ngnverifier -j 8 file.txt\n# or\ngnverifier --jobs=8 file.tsv\n```\n\nSometimes it is important to return names in exactly the same order. For such\ncases set the `jobs` flag to 1.\n\n```bash\ngnverifier -j 1 file.txt\n```\n\nThis option is ignored by advanced search.\n\n#### quiet\n\nRemoves log messages from the output. Note that results of verification go\nto STDOUT, while log messages go to STDERR. So instead of using the `-q` flag,\nSTDERR can be redirected to `/dev/null`:\n\n```bash\ngnverifier \"Puma concolor\" -q \u003everif-results.csv\n\n# or\n\ngnverifier \"Puma concolor\" 2\u003e/dev/null \u003everif-results.csv\n```\n\n#### sources\n\nBy default [GNverifier] returns only one \"best\" result of a match. If a user\nhas a particular interest in a data set, they can set it with this option, and\nall matches that exist for this source will be returned as well. You need to\nprovide a data source id for a dataset. Ids can be found at the following\n[URL][data_source_ids]. Some of them are provided in the GNverifier help\noutput as well.\n\nData from such sources will be returned in the `preferred_results` section of the JSON\noutput, or in CSV/TSV rows that start with the \"PreferredMatch\" string.\n\n```bash\ngnverifier file.csv -s \"1,11,172\"\n# or\ngnverifier file.tsv --sources=\"12\"\n# or\ncat file.txt | gnverifier -s '1,12'\n```\n\nIf all matched sources need to be returned, set the flag to \"0\".\n\nWARNING: the result might be excessively large.\n\n```bash\ngnverifier \"Bubo bubo\" -s 0\n# potentially even more results get returned by adding --all_matches flag\ngnverifier \"Bubo bubo\" -s 0 -M\n```\n\nThe `sources` option overrides `ds:` settings in case of advanced search.\n\n### Configuration file\n\nIf you find yourself using the same flags over and over again, it makes sense\nto edit the configuration file instead. 
It is located at\n`$HOME/.config/gnverifier.yaml`. After that you do not need to use command line\noptions and flags. The configuration file is self-documented; the [default\ngnverifier.yaml] is located on GitHub.\n\n```bash\ngnverifier file.txt\n```\n\nIf [GNverifier] runs as a web-based user interface, it is also\npossible to use environment variables for configuration.\n\n| Env. Var.               | Configuration      |\n| :---------------------- | :----------------- |\n| GNV_FORMAT              | Format             |\n| GNV_DATA_SOURCES        | DataSources        |\n| GNV_WITH_ALL_MATCHES    | WithAllMatches     |\n| GNV_WITH_CAPITALIZATION | WithCapitalization |\n| GNV_VERIFIER_URL        | VerifierURL        |\n| GNV_JOBS                | Jobs               |\n\n### Advanced Search Query Language\n\nExample: `g:M. sp:gallop. au:Oliv. y:1750-1799` or `n:M. gallop. Oliv. 1750-1799`\n\nThe query language allows searching for scientific names using name components\nlike genus name, specific epithet, infraspecific epithet, author, year.\nIt includes the following operators:\n\n`g:`\n: Genus name, can be abbreviated (for example `g:Bubo`, `g:B.`).\n\n`sp:`\n: Specific epithet, can be abbreviated (for example `sp:galloprovincialis`,\n`sp:gallop.`).\n\n`isp:`\n: Infraspecific epithet, can be abbreviated (for example `isp:auspicalis`,\n`isp:ausp.`).\n\n`asp:`\n: Either specific, or infraspecific epithet (for example `asp:bubo`).\n\n`au:`\n: One of the authors of a name, can be abbreviated (for example `au:Linn.`,\n`au:Linnaeus`).\n\n`y:`\n: Year. Can be one year, or a year range (for example `y:1888`, `y:1800-1802`,\n`y:1756-`, `y:-1880`).\n\n`ds:`\n: Limit results to one or more data-sources. Note that the command line `sources`\noption, if given, will override this setting (`ds:1,2,172`).\n\n`tx:`\n: Parent taxon. Limit results to names that contain a particular higher taxon\nin their classification. 
If `ds:` is given, uses the classification of the\nfirst data-source in the setting. If `ds:` is not given, uses the managerial\nclassification of the Catalogue of Life (`tx:Hemiptera`, `tx:Animalia`,\n`tx:Magnoliopsida`).\n\n`all:`\n: If true, [GNverifier] will show all results, not only the best ones.\nThe setting can be `true` or `false` (`all:t`, `all:f`). This setting\nwill also become true if the `sources` command line option is set to `0`.\n\n`n:`\n: A \"name\" setting. It combines several query components\nfor convenience. Note that it is not a 'real' scientific name, but a shortcut\nto enter several settings at once, loosely following the rules of nomenclature\n(`n:B. bubo Linn. 1758`). For example, in contrast with GNparser results, it\nis possible to have abbreviated specific epithets or a range of\nyears: `n:Mono. gall. Oliv. 1750-1800`.\n\nErrors in the gender of species epithets are common. Because of that, the search\nwill try to detect names in any gender that correspond to the epithet.\n\nThe search requires either an `sp:`, `isp:`, or `asp:` setting,\nor their analogs in the `n:` setting.\n\n#### Examples of searches\n\n```text\ngnverifier \"n:Pom. saltator tx:Animalia y:1750-\"\n\ngnverifier \"g:Plantago asp:major au:Linn.\"\n\ngnverifier \"g:Cara. isp:daurica ds:1,12\"\n```\n\n## Copyright\n\nAuthors: [Dmitry Mozzherin][dimus]\n\nCopyright © 2020-2024 Dmitry Mozzherin. 
See [LICENSE] for further\ndetails.\n\n[Feedback]: https://github.com/gnames/gnverifier/issues\n[GNverifier API]: https://apidoc.globalnames.org/gnames\n[GNverifier with OpenRefine]: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme\n[catalogue of life]: https://catalogueoflife.org/\n[data_source_ids]: https://verifier.globalnames.org/data_sources\n[default gnverifier.yaml]: https://github.com/gnames/gnverifier/blob/master/gnverifier/cmd/gnverifier.yaml\n[dimus]: https://github.com/dimus\n[fuzzy-matching]: https://github.com/gnames/gnverifier/blob/master/fuzzy-matching.md\n[gbif]: https://www.gbif.org/\n[gnames]: https://apidoc.globalnames.org/gnames\n[gnverifier]: https://github.com/gnames/gnverifier\n[go-install]: https://golang.org/doc/install\n[homebrew]: https://brew.sh/\n[latest release]: https://github.com/gnames/gnverifier/releases/latest\n[license]: https://github.com/gnames/gnverifier/blob/master/LICENSE\n[test directory]: https://github.com/gnames/gnverifier/tree/master/testdata\n[ubio]: https://ubio.org/\n[verifier api]: https://apidoc.globalnames.org/gnames\n[web-service]: https://verifier.globalnames.org\n[win-pdf]: https://github.com/gnames/gnverifier/blob/master/use-gnverifier-windows.pdf\n[winpath]: https://www.computerhope.com/issues/ch000549.htm\n[worms]: https://marinespecies.org/\n[wsl install]: https://docs.microsoft.com/en-us/windows/wsl/install-win10\n[wsl2]: https://docs.microsoft.com/en-us/windows/wsl/install\n[zenodo doi]: https://zenodo.org/badge/latestdoi/297323648\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnames%2Fgnverifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgnames%2Fgnverifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnames%2Fgnverifier/lists"}