https://github.com/jianlins/vbal4sampling

Enhancing Active Learning for Annotation Sampling over Large-Scale Corpus Via Vector-Based Indexing. A placeholder repository for the AMIA 2024 submission. The code will be released upon acceptance.

# VBAL4Sampling
Enhancing Active Learning for Annotation Sampling over Large-Scale Corpus Via Vector-Based Indexing. Our poster manuscript has been accepted at [AMIA 2024](https://amia.org/education-events/amia-2024-annual-symposium).

Inside the `notebooks` folder:
* [gpu01_CleanupBratAnnotation.ipynb](notebooks%2Fgpu01_CleanupBratAnnotation.ipynb) cleans the original annotations and builds the vector indexes.
* [dev19_stats_sentence_sampling.ipynb](notebooks%2Fdev19_stats_sentence_sampling.ipynb) compares statistics across the outputs of the different approaches, which are pickled in the `data` folder.
* The other notebooks implement the different approaches, each applying the bootstrap method to generate its outputs.
* [dev19_stats_sentence_sampling.html](notebooks%2Farchive%2Fdev19_stats_sentence_sampling.html) is the exported HTML version of [dev19_stats_sentence_sampling.ipynb](notebooks%2Fdev19_stats_sentence_sampling.ipynb).
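
For readers unfamiliar with the bootstrap method mentioned above, the following is a minimal, generic sketch of a percentile bootstrap for the mean of a list of per-run scores. It is not taken from the notebooks: the function name `bootstrap_mean_ci`, its parameters, and the example scores are all hypothetical, and the approaches in this repository may resample and score differently.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap.

    Hypothetical helper for illustration only; not the repository's code.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Percentile indexes, clamped to the valid range.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Made-up scores standing in for one approach's outputs.
example_scores = [0.61, 0.58, 0.66, 0.70, 0.63, 0.59, 0.68, 0.64]
mean, lower, upper = bootstrap_mean_ci(example_scores)
print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Running the same helper on each approach's scores and comparing the resulting intervals is one simple way to contrast approaches; the fixed `seed` is an assumption made here so that repeated runs give the same interval.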