https://github.com/zwhe99/llm-mt-eval
{DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is}
- Host: GitHub
- URL: https://github.com/zwhe99/llm-mt-eval
- Owner: zwhe99
- Created: 2023-06-13T10:39:07.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-18T11:01:11.000Z (almost 2 years ago)
- Last Synced: 2025-01-30T06:42:26.743Z (4 months ago)
- Language: Smalltalk
- Homepage:
- Size: 45.1 MB
- Stars: 14
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LLM-MT-Eval
This repo evaluates
* DeepL
* Google Translate
* WMT22 Best
* text-davinci-003
* gpt-3.5-turbo-0301
* gpt-4-0314

with the following automatic metrics (a minimal scoring sketch follows the task lists below):
* COMET
* BLEURT
* BLEU
* chrF
* chrF++

on WMT22 general translation tasks:
* English<->German
* English<->Czech
* English<->Russian
* English<->Chinese
* German<->French
* English<->Japanese
* Ukrainian<->English
* Ukrainian<->Czech
* English->Croatian

and WMT21 news translation tasks:
* English<->Hausa
* English<->Icelandic
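BLEU, chrF, and chrF++ are string-based metrics that are typically computed with the `sacrebleu` package. The snippet below is a minimal sketch of corpus-level scoring, not code taken from this repo; the hypothesis and reference sentences are placeholders.

```python
# Minimal sketch: corpus-level BLEU / chrF / chrF++ with sacrebleu.
# Hypothesis and reference strings are illustrative placeholders.
from sacrebleu.metrics import BLEU, CHRF

hyps = ["Das ist ein kleiner Test.", "Hallo Welt!"]
refs = [["Dies ist ein kleiner Test.", "Hallo, Welt!"]]  # one reference stream

bleu    = BLEU().corpus_score(hyps, refs)
chrf    = CHRF().corpus_score(hyps, refs)                # chrF  (character n-grams only)
chrf_pp = CHRF(word_order=2).corpus_score(hyps, refs)    # chrF++ (adds word bigrams)

print(f"BLEU   {bleu.score:.2f}")
print(f"chrF   {chrf.score:.2f}")
print(f"chrF++ {chrf_pp.score:.2f}")
```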
### Results

#### System outputs
```
output/
|-- deepl
|-- google-cloud
|-- gpt-3.5-turbo-0301
|-- gpt-4-0314
|-- text-davinci-003
`-- wmt-winner
```
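A quick way to see what is available is to walk the tree above. This is only a sketch: it assumes one subdirectory per system, as shown, and makes no assumption about how hypothesis files are named inside each system directory.

```python
# Sketch: list the systems under output/ and count their hypothesis files.
# Assumes one subdirectory per system, as in the tree above; the file layout
# inside each system directory is not documented here.
from pathlib import Path

output_dir = Path("output")
for system_dir in sorted(p for p in output_dir.iterdir() if p.is_dir()):
    n_files = sum(1 for f in system_dir.rglob("*") if f.is_file())
    print(f"{system_dir.name}: {n_files} files")
```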
#### Full results
#### **Average performance**
**All language pairs** (except for those not supported by DeepL)
**High resource**
* En<->De, En<->Cs, En<->Ru, En<->Zh
**Medium resource**
* De<->Fr, En<->Uk, En<->Ja
**Low resource**
* Uk<->Cs, En<->Hr, En<->Ha, En<->Is
### Evaluation
```sh
# download and unpack the BLEURT-20 checkpoint
wget https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
unzip BLEURT-20.zip
# run the evaluation script against the unpacked checkpoint
python3 evaluation/eval.py --bleurt-ckpt BLEURT-20
```
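The commands above fetch the BLEURT-20 checkpoint that the evaluation script is pointed at. For context, the snippet below is a rough sketch of how BLEURT-20 and a COMET model are commonly scored from Python with the `bleurt` and `unbabel-comet` packages; it is not taken from `evaluation/eval.py`, and the example sentences are placeholders.

```python
# Sketch of neural-metric scoring with BLEURT-20 and COMET (wmt22-comet-da).
# Not the repo's eval script; sentences below are illustrative placeholders.
from bleurt import score as bleurt_score
from comet import download_model, load_from_checkpoint

srcs = ["This is a small test."]
hyps = ["Das ist ein kleiner Test."]
refs = ["Dies ist ein kleiner Test."]

# BLEURT: reference-based, returns one score per segment.
bleurt_scorer = bleurt_score.BleurtScorer("BLEURT-20")
bleurt_scores = bleurt_scorer.score(references=refs, candidates=hyps)

# COMET: uses source + hypothesis + reference, returns segment and system scores.
comet_model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
comet_out = comet_model.predict(
    [{"src": s, "mt": h, "ref": r} for s, h, r in zip(srcs, hyps, refs)],
    batch_size=8,
    gpus=0,  # set to 1 if a GPU is available
)

print("BLEURT:", sum(bleurt_scores) / len(bleurt_scores))
print("COMET: ", comet_out.system_score)
```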