{"id":50379269,"url":"https://github.com/pisong314/snoextract","last_synced_at":"2026-05-30T11:01:21.871Z","repository":{"id":358425191,"uuid":"1241325982","full_name":"pisong314/snoextract","owner":"pisong314","description":"Offline  clinical free text --\u003e structured SNOMED engine","archived":false,"fork":false,"pushed_at":"2026-05-28T21:02:22.000Z","size":61,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T23:07:49.698Z","etag":null,"topics":["entity-linking","snomed","snomed-ct","snomed-ct-au"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pisong314.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-17T08:37:59.000Z","updated_at":"2026-05-28T21:02:26.000Z","dependencies_parsed_at":"2026-05-28T23:03:33.815Z","dependency_job_id":null,"html_url":"https://github.com/pisong314/snoextract","commit_stats":null,"previous_names":["pisong314/snoextract"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/pisong314/snoextract","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pisong314%2Fsnoextract","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pisong314%2Fsnoextract/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pisong314%2Fsnoextract/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/pisong314%2Fsnoextract/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pisong314","download_url":"https://codeload.github.com/pisong314/snoextract/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pisong314%2Fsnoextract/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33689564,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entity-linking","snomed","snomed-ct","snomed-ct-au"],"created_at":"2026-05-30T11:01:21.224Z","updated_at":"2026-05-30T11:01:21.858Z","avatar_url":"https://github.com/pisong314.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# SNOExtract\n\nLightweight offline SNOMED CT clinical concept extraction.\n\n- Deterministic\n- No outbound network calls — patient notes never leave the host\n- CPU-only, 128 MB memory requirement \n- REST / gRPC / CLI / Python\n- Runs on-premise or at the point of care (clinician in the loop review encouraged)\n\nExtracts SNOMED concepts (CUIs, semantic types, negation/uncertainty/historicity context) from clinical free-text. 
Ships as a self-contained binary with data files — no Python install, no database, no internet calls at runtime.\n\n![SNOExtract turning a short clinical note into structured SNOMED concepts](images/extraction1.png)\n\n## Try it live (web demo)\n\n**[snomed-ner-demo-874953055038.australia-southeast1.run.app](https://snomed-ner-demo-874953055038.australia-southeast1.run.app/)** — paste a clinical note, see entities in your browser.\n\nDemo runs in the cloud for convenience; production binaries are fully offline. Don't paste real patient data into the demo.\n\n## SNOMED CT licensing — required before installing\n\nSNOExtract embeds SNOMED CT-AU concept data, so using the binaries requires a current SNOMED CT licence. This is a SNOMED International obligation, not a SNOExtract one.\n\n- **In Australia** — licences are issued at no charge to healthcare organisations and approved researchers via the [National Clinical Terminology Service](https://www.healthterminologies.gov.au/access-clinical-terminology/access-snomed-ct-au/snomed-ct-au-releases/) (NCTS), administered by the Australian Digital Health Agency.\n- **Outside Australia** — apply through your country's National Release Centre, or directly via [SNOMED International](https://www.snomed.org/) for affiliate licensing.\n\nThe dist includes `SNOMED_CT_NOTICE.txt` covering the attribution and end-user obligations that apply to your usage.\n\n## Download\n\nLatest builds for Linux and Windows (x86_64):\n\n**[github.com/pisong314/snoextract/releases/latest](https://github.com/pisong314/snoextract/releases/latest)**\n\n| File | Platform |\n|---|---|\n| `snoextract-\u003cversion\u003e-linux-x86_64.tar.gz` | Linux glibc 2.28+ (RHEL 8 / Ubuntu 20.04+) |\n| `snoextract-\u003cversion\u003e-windows-x86_64.zip`  | Windows 10/11, Server 2019+ |\n\nEach build runs for 90 days; after that, download a fresh build with the latest SNOMED CT-AU data. 
The exact date is printed in the bundled `README.txt`.\n\n## Quickstart\n\nUnzip what you downloaded first, then pick the interface that matches how you'll use it.\n\n### 1. Single-call CLI — `\u003c100 ms` per call\n\n`snoextract-json` reads JSON on stdin, writes JSON on stdout. Fresh process per call, ~70–100 ms load-dominated. Best for **ad-hoc use and low-volume integrations** — for bulk work, use [server mode](docs/rest.md) (6× faster per note).\n\nRun from the unzipped dist directory so `./data` is auto-discovered, or point at it explicitly with `export SNOEXTRACT_DATA_DIR=/path/to/dist/data` (or `--data-dir`).\n\n| | Linux / macOS | Windows (cmd) |\n|---|---|---|\n| **Run** | `echo '{\"text\":\"Pt on Metformin 1g BD for diabetes mellitus.\"}' \\| ./snoextract-json` | `echo {\"text\":\"Pt on Metformin 1g BD for diabetes mellitus.\"} \\| snoextract-json.exe` |\n| **Or from file** | `./snoextract-json --input-file in.json --output-file out.json` | `snoextract-json.exe --input-file in.json --output-file out.json` |\n\nOutput (truncated):\n\n```json\n{\n  \"version\": \"0.34.7\",\n  \"entities\": [\n    { \"text\": \"Metformin\", \"start\": 6, \"end\": 15, \"cui\": \"372567009\",\n      \"name\": \"Metformin (substance)\", \"semantic_type\": \"substance\", ... },\n    { \"text\": \"diabetes mellitus\", \"start\": 26, \"end\": 43, \"cui\": \"73211009\",\n      \"name\": \"Diabetes mellitus (disorder)\", \"semantic_type\": \"disorder\", ... }\n  ]\n}\n```\n\nFull input/output schema is in the bundled `README.txt`.\n\n### 2. Python (in-process, zero overhead)\n\nPre-built wheel ships under `wheels/`. 
Requires Python 3.10+.\n\n```bash\npython3 -m pip install wheels/snoextract-0.34.7-cp310-abi3-*.whl\nexport SNOEXTRACT_DATA_DIR=/path/to/dist/data\n```\n\n```python\nfrom snoextract import Pipeline\n\npipeline = Pipeline.load()    # reads SNOEXTRACT_DATA_DIR\nresult = pipeline.process(\"Patient has chest pain and diabetes mellitus.\")\nfor e in result.entities:\n    print(e.text, e.cui, e.name, e.semantic_type)\n```\n\nOne wheel works across CPython 3.10/3.11/3.12/3.13 (abi3).\n\n## More\n\n- **[docs/benchmarks.md](docs/benchmarks.md)** — perf and accuracy numbers\n- **[docs/rest.md](docs/rest.md)** — REST interface: HTTP+JSON extract endpoint\n- **[docs/grpc.md](docs/grpc.md)** — gRPC interface (`.proto` contract, client codegen for Python / Go / Node / C#)\n- **[docs/python.md](docs/python.md)** — Python API: entity attributes, context flags (negation / uncertainty / historicity)\n\n## Reporting issues or terminology gaps\n\nBugs and feature requests → **[Issues](https://github.com/pisong314/snoextract/issues)**.\n\nPlease include the dist version, your OS, and a minimal repro. The issue templates prompt for these.\n\n**Terminology curation is ongoing.** Coverage prioritises Australian clinical contexts (GP notes, discharge summaries, common diagnoses and medications). If a concept you'd expect isn't matched — or is matched to the wrong CUI — open an Issue with the input snippet and the expected CUI. Your reports help improve coverage over time.\n\nMaintained by **Dr Pi Songsiritat MBBS FRACGP** — [piyawoot.song@gmail.com](mailto:piyawoot.song@gmail.com). Questions and feedback welcome.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpisong314%2Fsnoextract","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpisong314%2Fsnoextract","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpisong314%2Fsnoextract/lists"}