https://github.com/berntpopp/phentrieve
AI-powered system for mapping clinical text to Human Phenotype Ontology (HPO) terms using Retrieval-Augmented Generation (RAG). Features Python CLI/library, FastAPI backend, and Vue.js frontend for interactive phenotype extraction from medical texts.
artificial-intelligence biomedical-informatics clinical-text fastapi healthcare hpo machine-learning medical-nlp phenotype-ontology python rag retrieval-augmented-generation semantic-search text-mining vuejs
- Host: GitHub
- URL: https://github.com/berntpopp/phentrieve
- Owner: berntpopp
- License: MIT
- Created: 2025-04-24T14:13:09.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-06-12T22:56:30.000Z (10 days ago)
- Last Synced: 2026-06-12T23:10:06.100Z (10 days ago)
- Topics: artificial-intelligence, biomedical-informatics, clinical-text, fastapi, healthcare, hpo, machine-learning, medical-nlp, phenotype-ontology, python, rag, retrieval-augmented-generation, semantic-search, text-mining, vuejs
- Language: Python
- Homepage: https://phentrieve.kidney-genetics.org/
- Size: 15.6 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Security: SECURITY.md
- Agents: AGENTS.md
# Phentrieve

Phentrieve is an AI-powered research system that maps phenotype descriptions to Human Phenotype Ontology (HPO) terms using a Retrieval-Augmented Generation (RAG) approach. It supports multiple languages and provides tools for benchmarking, text processing, and HPO term retrieval.

**Research use only:** Phentrieve is not a medical device and must not be used for diagnosis, treatment selection, patient triage, or other clinical decision-making. See the [Research Use Only guide](docs/compliance/research-use.md) and [Privacy and LLM Processing](docs/compliance/privacy-and-llm-processing.md).

**For comprehensive documentation, please visit the [Phentrieve Documentation Site](https://berntpopp.github.io/phentrieve/).**
## Key Features
* Multilingual HPO term mapping using state-of-the-art embedding models
* Advanced text processing pipeline including semantic chunking and assertion detection
* Optional adaptive re-chunking improves recall on multi-concept clinical sentences (`--adaptive-rechunking`). See [docs/user-guide/adaptive-rechunking.md](docs/user-guide/adaptive-rechunking.md).
* Extensive benchmarking framework for model evaluation and comparison
* User-friendly interfaces: CLI, FastAPI backend, and Vue.js frontend
## Benchmark Results
Performance on 570 German clinical terms (BioLORD-2023-M model):
| Retrieval Mode | MRR | Hit@1 | Hit@10 | Ont Sim@1 |
|----------------|-----|-------|--------|-----------|
| Single-vector | 0.695 | 55.8% | 94.0% | 79.9% |
| Multi-vector (all_max) | **0.892** | **84.0%** | **97.4%** | **91.9%** |

Multi-vector retrieval, which combines label, synonym, and definition embeddings, raises MRR from 0.695 to 0.892, a **+28% relative improvement** over single-vector retrieval.
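For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how MRR and Hit@k are conventionally computed, and of the arithmetic behind the relative improvement. The function names and the toy rankings are illustrative assumptions; this is not Phentrieve's benchmarking code, and the HPO IDs are used only as opaque identifiers.

```python
from typing import List, Sequence, Tuple

# One run = (ranked list of retrieved term IDs, the expected gold term ID).
Run = Tuple[Sequence[str], str]


def reciprocal_rank(ranked_ids: Sequence[str], gold_id: str) -> float:
    """Return 1/rank of the gold term, or 0.0 if it was not retrieved."""
    for rank, term_id in enumerate(ranked_ids, start=1):
        if term_id == gold_id:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Run]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


def hit_at_k(runs: List[Run], k: int) -> float:
    """Fraction of queries whose gold term appears in the top k results."""
    return sum(gold in list(ranked)[:k] for ranked, gold in runs) / len(runs)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


# Toy rankings invented for illustration (not benchmark data).
runs: List[Run] = [
    (["HP:0000252", "HP:0001250"], "HP:0000252"),  # gold at rank 1
    (["HP:0001250", "HP:0000252"], "HP:0000252"),  # gold at rank 2
    (["HP:0001250", "HP:0004322"], "HP:0000252"),  # gold not retrieved
]

print(mean_reciprocal_rank(runs))  # (1 + 0.5 + 0) / 3 = 0.5
print(hit_at_k(runs, 2))           # 2 of 3 queries hit in the top 2
# MRR values taken from the table above: 0.695 -> 0.892
print(f"{relative_gain(0.695, 0.892):.1%}")  # 28.3%
```

The last line shows that the reported gain is relative to the single-vector baseline; the absolute difference in MRR is 0.197.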
## Quick Start
Install Phentrieve using pip:
```bash
pip install phentrieve
```
For detailed setup and usage instructions, including Docker deployment, please see our [Getting Started Guide](https://berntpopp.github.io/phentrieve/getting-started/installation/).
## Basic Usage
```bash
# Launch interactive query mode
phentrieve query --interactive
# Process research text to extract HPO terms
phentrieve text process "The research note mentions microcephaly and frequent seizures."
```
Discover more commands and options in the [User Guide](https://berntpopp.github.io/phentrieve/user-guide/).
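If you want to drive the CLI from a script, one option is to call it as a subprocess. The sketch below assumes only the `phentrieve text process` command shown above; the helper names are hypothetical, and because the CLI's output format is not described here, the raw standard output is returned without parsing.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def build_text_process_command(text: str) -> List[str]:
    """Build the argument list for `phentrieve text process "<text>"`."""
    return ["phentrieve", "text", "process", text]


def run_text_process(text: str) -> Optional[str]:
    """Run the CLI and return its raw stdout, or None if it is not installed."""
    if shutil.which("phentrieve") is None:
        return None  # phentrieve is not on PATH
    result = subprocess.run(
        build_text_process_command(text),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


note = "The research note mentions microcephaly and frequent seizures."
# Print the shell-quoted command without executing it.
print(shlex.join(build_text_process_command(note)))
```

Passing the arguments as a list avoids shell quoting problems when the input text contains quotes or other special characters.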
## Configuration
### Configuration profiles
Define named profiles in `phentrieve.yaml` to preset CLI options:
```yaml
profiles:
  fast_query:
    command: query
    num_results: 5
    similarity_threshold: 0.5
```
Then select the profile on the command line with `--profile`: `phentrieve query "TEXT" --profile fast_query`.
See [docs/user-guide/configuration-profiles.md](docs/user-guide/configuration-profiles.md) for the full guide.
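To make the idea of a profile concrete, here is a minimal sketch of how preset options could be combined with flags given on the command line. The `resolve_options` function, the rule that explicit flags override profile values, and the check that a profile matches the invoked command are all assumptions for illustration; the linked guide describes Phentrieve's actual behavior.

```python
from typing import Any, Dict, Optional

# Parsed form of the phentrieve.yaml example above, as a YAML loader would return it.
CONFIG: Dict[str, Any] = {
    "profiles": {
        "fast_query": {
            "command": "query",
            "num_results": 5,
            "similarity_threshold": 0.5,
        }
    }
}


def resolve_options(
    config: Dict[str, Any],
    profile_name: str,
    command: str,
    cli_overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge a named profile with explicit CLI options (hypothetical rules)."""
    profiles = config.get("profiles", {})
    if profile_name not in profiles:
        raise KeyError(f"unknown profile: {profile_name}")

    options = dict(profiles[profile_name])  # copy so the config stays untouched
    profile_command = options.pop("command", None)
    if profile_command is not None and profile_command != command:
        raise ValueError(
            f"profile {profile_name!r} targets {profile_command!r}, not {command!r}"
        )

    # Assumption: options given explicitly on the command line win over the profile.
    for key, value in (cli_overrides or {}).items():
        if value is not None:
            options[key] = value
    return options


print(resolve_options(CONFIG, "fast_query", "query"))
print(resolve_options(CONFIG, "fast_query", "query", {"num_results": 10}))
```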
## Docker Deployment
Deploy Phentrieve using Docker Compose for self-hosted research environments:
```bash
# Linux: Setup volume permissions (required)
sudo ./scripts/setup-docker-volumes.sh
# macOS/Windows: No setup needed, skip to next step
# Start services
docker-compose up -d
# Access the application
# - API: http://localhost:8000
# - Frontend: http://localhost:8080
```
For detailed deployment instructions, security best practices, and troubleshooting, see the [Docker Deployment Guide](docs/DOCKER-DEPLOYMENT.md).
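After starting the services, you may want a quick check that both ports are accepting connections. The sketch below uses only the Python standard library and the ports listed above (8000 for the API, 8080 for the frontend); it tests TCP reachability only and makes no assumption about any health endpoint, and the helper names are hypothetical.

```python
import socket
from typing import Dict

# Ports taken from the Docker Compose example above.
SERVICES: Dict[str, int] = {"API": 8000, "Frontend": 8080}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Probe every known service port on the given host."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        status = "reachable" if reachable else "not reachable"
        print(f"{name} (port {SERVICES[name]}): {status}")
```

A port that is open only confirms that a container is listening; it does not confirm that models have finished loading or that the application is ready to serve requests.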
---
[Full Documentation](https://berntpopp.github.io/phentrieve/) | [Contributing Guide](https://berntpopp.github.io/phentrieve/development/) | [License](LICENSE)