{"id":43223286,"url":"https://github.com/gnames/ds-ruhoff-mollusca","last_synced_at":"2026-02-01T09:15:55.391Z","repository":{"id":175946839,"uuid":"654533994","full_name":"gnames/ds-ruhoff-mollusca","owner":"gnames","description":"Data-Source out of publication \"Index to the species of Mollusca introduced from 1850 to 1870\" Ruhoff, Florence A. 1980","archived":false,"fork":false,"pushed_at":"2024-12-02T21:35:18.000Z","size":15643,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-09-05T00:09:04.934Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gnames.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-16T10:40:33.000Z","updated_at":"2024-12-02T21:35:23.000Z","dependencies_parsed_at":"2024-12-02T22:34:25.798Z","dependency_job_id":null,"html_url":"https://github.com/gnames/ds-ruhoff-mollusca","commit_stats":null,"previous_names":["gnames/ds-ruhoff-mollusca"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/gnames/ds-ruhoff-mollusca","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fds-ruhoff-mollusca","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fds-ruhoff-mollusca/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fds-ruhoff-mollusca/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/gnames%2Fds-ruhoff-mollusca/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gnames","download_url":"https://codeload.github.com/gnames/ds-ruhoff-mollusca/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gnames%2Fds-ruhoff-mollusca/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28974537,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T08:16:14.655Z","status":"ssl_error","status_checked_at":"2026-02-01T08:06:51.373Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-01T09:15:54.723Z","updated_at":"2026-02-01T09:15:55.384Z","avatar_url":"https://github.com/gnames.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scientific names extracted from Ruhoff 1980\n\n[![DOI](https://zenodo.org/badge/654533994.svg)](https://doi.org/10.5281/zenodo.14262688)\n\nThe goal of this project is to extract scientific names from\n[Ruhoff 1980](https://doi.org/10.5479/si.00810282.294)\n\nCleaned up data [file](data/08-reconcile.csv)\n\n## Process\n\n- [x] Make [OCR](data/01-ocr.txt)\n\n- [x] [Concatenate lines](data/03-concat.txt)\n\n- [x] [Fix spaces](data/04-sortfix.txt) in species names\n\n- [x] [Fix commas](data/04-sortfix.txt) 
which were recognized as periods.\n\n- [x] [Fix years](data/05-year.txt)\n\n- [x] [Extract name part](data/06-names.csv) (the first column of 06-names.csv is the place to fix errors)\n\n- [x] [Reformat name part](data/07-fmt-names.csv)\n\n- [x] [Fix spellings in names](data/07-fmt-names.csv)\n\n- [x] [Run reconciliation using GNverifier with OpenRefine](data/08-reconcile.csv)\n\n## Stats\n\n| Names                   | Number | Percentage |\n| ----------------------- | ------ | ---------- |\n| Total                   | 35487  | 100%       |\n| All Matches             | 26799  | 75.5%      |\n| No Match                | 8688   | 24.5%      |\n| Canonical + Auth. Match | 22311  | 62.9%      |\n| Canonical Match         | 3448   | 9.7%       |\n| Fuzzy Canonical Match   | 1040   | 2.9%       |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnames%2Fds-ruhoff-mollusca","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgnames%2Fds-ruhoff-mollusca","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgnames%2Fds-ruhoff-mollusca/lists"}