https://github.com/stefan-it/germeval-ner-t5

Evaluating German T5 Models on GermEval 2014 (NER)

Host: GitHub
URL: https://github.com/stefan-it/germeval-ner-t5
Owner: stefan-it
Created: 2023-01-28T12:20:52.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-03-14T15:44:19.000Z (about 2 years ago)
Last Synced: 2025-02-08T09:46:54.707Z (4 months ago)
Language: Python
Size: 7.81 KB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Evaluating German T5 Models on GermEval 2014 (NER)

This repository presents an **ongoing** evaluation of [German T5 models](https://huggingface.co/GermanT5) on the [GermEval 2014](https://sites.google.com/site/germeval2014ner/data) NER downstream task.

# Changelog

* 03.02.2023: Initial version.

# Fine-Tuning T5

A few approaches exist for fine-tuning T5 models for token classification tasks:

* ["Structured Prediction as Translation between Augmented Natural Languages"](https://arxiv.org/abs/2101.05779)

* ["Autoregressive Structured Prediction with Language Models"](https://arxiv.org/abs/2210.14698)

These approaches tackle the token classification task as a sequence-to-sequence task.
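
To illustrate the sequence-to-sequence framing, the sketch below turns a BIO-tagged sentence into a bracketed target string, loosely in the spirit of the augmented natural language format of the first paper. The helper name `to_augmented_target`, the exact bracket syntax and the example sentence are illustrative assumptions and are not taken from this repository.

```python
from typing import List, Optional


def to_augmented_target(tokens: List[str], tags: List[str]) -> str:
    """Wrap each BIO-tagged entity span as "[ span | LABEL ]" (hypothetical format)."""
    out: List[str] = []
    span: List[str] = []
    label: Optional[str] = None

    def flush() -> None:
        # Emit the currently open entity span, if there is one.
        nonlocal span, label
        if span:
            out.append(f"[ {' '.join(span)} | {label} ]")
            span, label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            span, label = [token], tag[2:]
        elif tag.startswith("I-") and span and tag[2:] == label:
            span.append(token)
        else:
            flush()
            if tag == "O":
                out.append(token)
            else:
                # Stray I- tag without a matching B- tag: open a new span.
                span, label = [token], tag[2:]
    flush()
    return " ".join(out)


tokens = ["Angela", "Merkel", "besucht", "Berlin", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
target = to_augmented_target(tokens, tags)
print(target)  # [ Angela Merkel | PER ] besucht [ Berlin | LOC ] .
```

A sequence-to-sequence model would be trained to generate such a target string from the plain input sentence, and the entity spans are then recovered by parsing the generated text.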

However, it is also possible to use only the encoder of a T5 model for downstream tasks, as presented in:

* ["EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models"](https://arxiv.org/abs/2110.08426)

The proposed "EncT5" architecture was not evaluated on token classification tasks.

This repository uses the [Flair library](https://github.com/flairNLP/flair) to perform encoder-only fine-tuning on the GermEval 2014 NER dataset. The recently released [T5 models for German](https://huggingface.co/GermanT5) are used as LM backbones.
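
With Flair's public API, such a setup might look roughly like the following. This is a minimal sketch and not the repository's `flair-fine-tuner.py`: the function name `fine_tune_encoder_only`, the output path, the embedding options and the linear tagging head (no CRF, no RNN) are assumptions. The model identifier and the values 16 / 10 / 0.00011 come from the results table, on the assumption that the label `bs16-e10-lr0.00011` stands for batch size, epochs and learning rate. The sketch also assumes that the Flair version in use loads only the T5 encoder inside `TransformerWordEmbeddings`.

```python
# Assumed hyper-parameters, read off the configuration label `bs16-e10-lr0.00011`.
MODEL_NAME = "GermanT5/t5-efficient-gc4-all-german-small-el32"
BATCH_SIZE = 16
MAX_EPOCHS = 10
LEARNING_RATE = 0.00011


def fine_tune_encoder_only(seed: int, output_dir: str = "resources/taggers/germeval-t5") -> None:
    """Fine-tune a T5 backbone for NER with Flair (sketch; requires Flair to be installed)."""
    # Imports are local so that this module can be loaded without Flair installed.
    import flair
    from flair.datasets import NER_GERMAN_GERMEVAL
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    flair.set_seed(seed)

    # Depending on the Flair version, the dataset may need to be placed manually.
    corpus = NER_GERMAN_GERMEVAL()
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model=MODEL_NAME,
        layers="-1",               # use the last layer only
        subtoken_pooling="first",  # represent a token by its first subtoken
        fine_tune=True,            # update the backbone weights during training
        use_context=False,
    )

    # Plain linear classifier on top of the transformer embeddings (assumed head).
    tagger = SequenceTagger(
        hidden_size=256,  # not used when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        f"{output_dir}-seed{seed}",
        learning_rate=LEARNING_RATE,
        mini_batch_size=BATCH_SIZE,
        max_epochs=MAX_EPOCHS,
    )
```

Calling `fine_tune_encoder_only(seed)` once per seed would correspond to one run each; the concrete seed values used for the experiments are not stated here.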

# Results

We perform a basic hyper-parameter search and report the micro F1-score, averaged over 5 runs (with different seeds).

Scores in brackets indicate results on the development split.
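
For reference, the micro-averaged F1-score pools true positives, false positives and false negatives over all entity types before computing precision and recall. The sketch below shows this on exact-match entity spans. The span representation `(sentence_id, start, end, label)` and the toy data are assumptions for illustration; the scores reported in this repository were not produced with this code.

```python
from typing import Set, Tuple

Span = Tuple[int, int, int, str]  # (sentence_id, start_token, end_token, label)


def micro_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Micro-averaged F1 over exact-match entity spans, pooled across all labels."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example: two of three gold spans are predicted exactly, plus two wrong spans.
gold = {(0, 0, 2, "PER"), (0, 3, 4, "LOC"), (1, 5, 6, "ORG")}
predicted = {(0, 0, 2, "PER"), (0, 3, 4, "ORG"), (1, 5, 6, "ORG"), (1, 8, 9, "OTH")}
score = micro_f1(gold, predicted)
print(f"micro F1: {100 * score:.2f}")
```

A span with correct boundaries but the wrong label, such as `(0, 3, 4, "ORG")` above, counts as both a false positive and a false negative under exact matching.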

| Model Size                                                                      | Configuration        | Run 1           | Run 2           | Run 3           | Run 4           | Run 5           | Avg.            |
| ------------------------------------------------------------------------------- | -------------------- | --------------- | --------------- | --------------- | --------------- | --------------- | --------------- |
| [Small](https://huggingface.co/GermanT5/t5-efficient-gc4-all-german-small-el32) | `bs16-e10-lr0.00011` | (87.24) / 85.53 | (86.40) / 85.63 | (86.50) / 85.47 | (86.32) / 85.57 | (86.77) / 85.38 | (86.65) / 85.52 |
| [Large](https://huggingface.co/GermanT5/t5-efficient-gc4-all-german-large-nl36) | `bs16-e10-lr0.00011` | (87.16) / 86.46 | (87.07) / 85.76 | (87.46) / 85.57 | (87.05) / 86.91 | (87.15) / 86.11 | (87.18) / 86.16 |
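
The "Avg." column can be recomputed from the per-run scores. The snippet below does this with the numbers copied from the table; the variable names are illustrative.

```python
from statistics import mean

# Test-set micro F1 per run, copied from the table.
test_scores = {
    "Small": [85.53, 85.63, 85.47, 85.57, 85.38],
    "Large": [86.46, 85.76, 85.57, 86.91, 86.11],
}
# Development-set micro F1 per run (the values in brackets).
dev_scores = {
    "Small": [87.24, 86.40, 86.50, 86.32, 86.77],
    "Large": [87.16, 87.07, 87.46, 87.05, 87.15],
}

# Average over the five runs and round to two decimals, as in the table.
averages = {
    size: (round(mean(dev_scores[size]), 2), round(mean(test_scores[size]), 2))
    for size in test_scores
}
for size, (dev_avg, test_avg) in averages.items():
    print(f"{size}: ({dev_avg:.2f}) / {test_avg:.2f}")
```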

For the hyper-parameter search, the script `flair-fine-tuner.py` is used in combination with a configuration file (passed as an argument). All configuration files used for the experiments here are located under `./configs`.
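
The configuration labels in the results table appear to encode the main hyper-parameters. Assuming that `bs`, `e` and `lr` stand for batch size, number of epochs and learning rate (an interpretation, not something stated here), a label can be unpacked as sketched below. The function name `parse_config_label` is hypothetical, and the sketch says nothing about the format of the files under `./configs`.

```python
import re
from typing import Dict, Union

# Assumed label layout: bs<batch size>-e<epochs>-lr<learning rate>
_LABEL_PATTERN = re.compile(r"^bs(?P<bs>\d+)-e(?P<e>\d+)-lr(?P<lr>\d*\.?\d+(?:e-?\d+)?)$")


def parse_config_label(label: str) -> Dict[str, Union[int, float]]:
    """Split a label such as 'bs16-e10-lr0.00011' into its assumed components."""
    match = _LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognized configuration label: {label!r}")
    return {
        "batch_size": int(match["bs"]),
        "epochs": int(match["e"]),
        "learning_rate": float(match["lr"]),
    }


config = parse_config_label("bs16-e10-lr0.00011")
print(config)
```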

## Baselines:

* The [fine-tuned DistilBERT](https://huggingface.co/dbmdz/flair-distilbert-ner-germeval14) model reports (86.84) / 85.62.

* [GELECTRA Base](https://aclanthology.org/2020.coling-main.598/) reports 86.02 on test set.

* Current SOTA is [GELECTRA Large](https://aclanthology.org/2020.coling-main.598/) with 88.95 on test set.

# Hardware/Requirements

The latest Flair version at the time of the experiments (commit [6da65a4](https://github.com/flairNLP/flair/tree/6da65a4f665bb333f639067bc5f868046880e63b)) is used.

All models are fine-tuned on A10 (24GB) instances from [Lambda Cloud](https://lambdalabs.com/service/gpu-cloud).
