{"id":46922441,"url":"https://github.com/berntpopp/phentrieve","last_synced_at":"2026-06-13T02:01:07.601Z","repository":{"id":292991132,"uuid":"972087828","full_name":"berntpopp/phentrieve","owner":"berntpopp","description":"AI-powered system for mapping clinical text to Human Phenotype Ontology (HPO) terms using Retrieval-Augmented Generation (RAG). Features Python CLI/library, FastAPI backend, and Vue.js frontend for interactive phenotype extraction from medical texts.","archived":false,"fork":false,"pushed_at":"2026-06-12T22:56:30.000Z","size":16401,"stargazers_count":6,"open_issues_count":11,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-12T23:10:06.100Z","etag":null,"topics":["artificial-intelligence","biomedical-informatics","clinical-text","fastapi","healthcare","hpo","machine-learning","medical-nlp","phenotype-ontology","python","rag","retrieval-augmented-generation","semantic-search","text-mining","vuejs"],"latest_commit_sha":null,"homepage":"https://phentrieve.kidney-genetics.org/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/berntpopp.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-04-24T14:13:09.000Z","updated_at":"2026-06-12T22:56:34.000Z","dependencies_parsed_at":"2025-05-27T07:36:22.230Z","dependency_job_id":null,"html_url":"https://github.com/berntpopp/phentrieve","commit_stats":null,"previous_names":["berntpopp/phentrieve"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/berntpopp/phentrieve","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berntpopp%2Fphentrieve","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berntpopp%2Fphentrieve/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berntpopp%2Fphentrieve/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berntpopp%2Fphentrieve/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/berntpopp","download_url":"https://codeload.github.com/berntpopp/phentrieve/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/berntpopp%2Fphentrieve/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34269364,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-13T02:00:06.617Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","biomedical-informatics","clinical-text","fastapi","healthcare","hpo","machine-learning","medical-nlp","phenotype-ontology","python","rag","retrieval-augmented-generation","semantic-search","text-mining","vuejs"],"created_at":"2026-03-11T03:02:22.067Z","updated_at":"2026-06-13T02:01:07.588Z","avatar_url":"https://github.com/berntpopp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Phentrieve\n\n![Phentrieve Logo](docs/assets/phentrieve-logo.svg)\n\nPhentrieve is an advanced AI-powered research system for mapping phenotype descriptions to Human Phenotype Ontology (HPO) terms using a Retrieval-Augmented Generation (RAG) approach. It supports multiple languages and offers robust tools for benchmarking, text processing, and HPO term retrieval.\n\n**Research use only:** Phentrieve is not a medical device and must not be used for diagnosis, treatment selection, patient triage, or other clinical decision-making. See the [Research Use Only guide](docs/compliance/research-use.md) and [Privacy and LLM Processing](docs/compliance/privacy-and-llm-processing.md).\n\n**For comprehensive documentation, please visit the [Phentrieve Documentation Site](https://berntpopp.github.io/phentrieve/).**\n\n## Key Features\n\n* Multilingual HPO term mapping using state-of-the-art embedding models\n* Advanced text processing pipeline including semantic chunking and assertion detection\n* Optional adaptive re-chunking improves recall on multi-concept clinical sentences (`--adaptive-rechunking`). See [docs/user-guide/adaptive-rechunking.md](docs/user-guide/adaptive-rechunking.md).\n* Extensive benchmarking framework for model evaluation and comparison\n* User-friendly interfaces: CLI, FastAPI backend, and Vue.js frontend\n\n## Benchmark Results\n\nPerformance on 570 German clinical terms (BioLORD-2023-M model):\n\n| Retrieval Mode | MRR | Hit@1 | Hit@10 | Ont Sim@1 |\n|----------------|-----|-------|--------|-----------|\n| Single-vector | 0.695 | 55.8% | 94.0% | 79.9% |\n| Multi-vector (all_max) | **0.892** | **84.0%** | **97.4%** | **91.9%** |\n\n**+28% MRR improvement** with multi-vector retrieval using label, synonym, and definition embeddings.\n\n## Quick Start\n\nInstall Phentrieve using pip:\n\n```bash\npip install phentrieve\n```\n\nFor detailed setup and usage instructions, including Docker deployment, please see our [Getting Started Guide](https://berntpopp.github.io/phentrieve/getting-started/installation/).\n\n## Basic Usage\n\n```bash\n# Launch interactive query mode\nphentrieve query --interactive\n\n# Process research text to extract HPO terms\nphentrieve text process \"The research note mentions microcephaly and frequent seizures.\"\n```\n\nDiscover more commands and options in the [User Guide](https://berntpopp.github.io/phentrieve/user-guide/).\n\n## Configuration\n\n### Configuration profiles\n\nDefine named profiles in `phentrieve.yaml` to preset CLI options:\n\n```yaml\nprofiles:\n  fast_query:\n    command: query\n    num_results: 5\n    similarity_threshold: 0.5\n```\n\nThen `phentrieve query \"TEXT\" --profile fast_query`.\n\nSee [docs/user-guide/configuration-profiles.md](docs/user-guide/configuration-profiles.md) for the full guide.\n\n## Docker Deployment\n\nDeploy Phentrieve using Docker Compose for self-hosted research environments:\n\n```bash\n# Linux: Setup volume permissions (required)\nsudo ./scripts/setup-docker-volumes.sh\n\n# macOS/Windows: No setup needed, skip to next step\n\n# Start services\ndocker-compose up -d\n\n# Access the application\n# - API: http://localhost:8000\n# - Frontend: http://localhost:8080\n```\n\nFor detailed deployment instructions, security best practices, and troubleshooting, see the [Docker Deployment Guide](docs/DOCKER-DEPLOYMENT.md).\n\n---\n\n[Full Documentation](https://berntpopp.github.io/phentrieve/) | [Contributing Guide](https://berntpopp.github.io/phentrieve/development/) | [License](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fberntpopp%2Fphentrieve","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fberntpopp%2Fphentrieve","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fberntpopp%2Fphentrieve/lists"}