https://github.com/scai-bio/adhteb

ADHTEB - Alzheimers Disease Harmonization Text Embedding Benchmark

alzheimers-disease benchmark data-harmonization llms

Host: GitHub
URL: https://github.com/scai-bio/adhteb
Owner: SCAI-BIO
License: apache-2.0
Created: 2025-05-21T09:02:45.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-07-28T14:40:29.000Z (7 months ago)
Last Synced: 2025-07-28T16:28:59.555Z (7 months ago)
Topics: alzheimers-disease, benchmark, data-harmonization, llms
Language: Python
Homepage: https://adhteb.scai.fraunhofer.de
Size: 26 MB
Stars: 1
Watchers: 3
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Alzheimer's Disease Harmonization Text Embedding Benchmark

[![DOI](https://zenodo.org/badge/987565118.svg)](https://doi.org/10.5281/zenodo.16027340)

![tests](https://github.com/SCAI-BIO/ADHTEB/actions/workflows/tests.yaml/badge.svg)

# Installation

```bash
pip install adhteb
```

# Usage

## Import a model

Models published on Hugging Face can be imported directly using the `HuggingFaceVectorizer` class.

```python
from adhteb import HuggingFaceVectorizer

vectorizer = HuggingFaceVectorizer(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
)
```

Alternatively, you can provide your own vectorizer by subclassing the `Vectorizer` base class and implementing its `get_embedding` method.

```python
from adhteb import Vectorizer

class MyVectorizer(Vectorizer):
    def get_embedding(self, text: str) -> list[float]:
        # Implement your embedding logic here
        my_vector = []
        return my_vector
```
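As a rough illustration of what the embedding logic could look like, the sketch below builds a hashed bag-of-words vector using only the standard library. The function name `hashed_bow_embedding` and the `dim` parameter are hypothetical and not part of adhteb; this toy embedding is meant to show the expected input and output shape (a string in, a fixed-length list of floats out), not to score well on the benchmark.

```python
import hashlib
import math


def hashed_bow_embedding(text: str, dim: int = 64) -> list[float]:
    """Toy embedding (hypothetical helper): hash each token into one of `dim` buckets."""
    vector = [0.0] * dim
    for token in text.lower().split():
        # A stable hash (unlike the built-in hash()) gives the same bucket on every run.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vector[bucket] += 1.0
    # L2-normalise so that a dot product between two vectors equals their cosine similarity.
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector
```

Inside a `Vectorizer` subclass, `get_embedding` could then simply return `hashed_bow_embedding(text)`.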

## Running the benchmark

You can run the benchmark and display the results using only a few lines of code.

```python
from adhteb import Benchmark

# `vectorizer` is the instance created in the "Import a model" section
benchmark = Benchmark(vectorizer=vectorizer)
benchmark.run()
print(benchmark.results_summary())
```

```commandline
+------------------+-------+--------------------+
|                  | AUPRC | Zero-shot Accuracy |
+------------------+-------+--------------------+
|      GERAS       | 0.35  |        0.65        |
| PREVENT Dementia | 0.19  |        0.48        |
|    PREVENT AD    | 0.22  |        0.39        |
|       EMIF       | 0.29  |        0.54        |
+------------------+-------+--------------------+
Aggregate Score: 0.39
```
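This README does not state how the aggregate score is derived. As an observation only, the value shown in the example output is consistent with an unweighted mean over all eight per-cohort entries; the short check below reproduces that arithmetic. Treat the formula as an assumption, not as a description of adhteb's implementation.

```python
# Per-cohort (AUPRC, zero-shot accuracy) pairs copied from the example output above.
scores = {
    "GERAS": (0.35, 0.65),
    "PREVENT Dementia": (0.19, 0.48),
    "PREVENT AD": (0.22, 0.39),
    "EMIF": (0.29, 0.54),
}

# Assumption: every metric of every cohort contributes with equal weight.
values = [value for pair in scores.values() for value in pair]
unweighted_mean = sum(values) / len(values)

print(f"{unweighted_mean:.2f}")  # prints 0.39
```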

## Publishing your results

You can check how your results compare to other models on the public leaderboard here:

[https://adhteb.scai.fraunhofer.de](https://adhteb.scai.fraunhofer.de)

You can also publish your benchmark results, together with metadata on your tested model:

```python
from adhteb import Benchmark, ModelMetadata

model_name = "my-model-name"
url = "https://huggingface.co/my-model-name"

model_metadata = ModelMetadata(model_name=model_name, url=url)

# `benchmark` is the Benchmark instance that was run in the previous section
benchmark.publish(model_metadata=model_metadata)
```
