https://github.com/dcrebbin/canto-frierenbench

LLM benchmark for the first episode of Cantonese dubbed Frieren. Spoken subtitles by CantoCaptions.com.



Host: GitHub
URL: https://github.com/dcrebbin/canto-frierenbench
Owner: dcrebbin
Created: 2025-02-21T21:02:45.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-21T21:29:57.000Z (over 1 year ago)
Last Synced: 2025-02-21T22:19:50.400Z (over 1 year ago)
Size: 132 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


README

# Canto Frierenbench



An LLM benchmark based on the first episode of the Cantonese dub of Frieren. It tests how well a model generates SRT files in the [langpal subtitle](https://langpal.com.hk/subtitles) format, which carries dual Cantonese and English subtitles.

Expected Format:

```
1
00:00:00,672 --> 00:00:04,960
(yue)粵文字幕:
www.cantocaptions.com
(en) Cantonese subtitles:
www.cantocaptions.com

2
00:01:30,406 --> 00:01:33,659
(yue)（遙遠嘅大陸北方盡頭）
(en) (The northern end of the distant continent)
```

The extension uses the language prefixes (`(yue)` and `(en)`) to correctly display dual subtitles on a variety of platforms.
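
As a rough illustration of the expected format above, the following Python sketch parses SRT text into cues and reports which language prefixes are missing from each one. It is not part of the benchmark or of the extension: the names `parse_cues` and `missing_prefixes` are hypothetical, and the rule that every cue must contain both a `(yue)` and an `(en)` line is an assumption inferred from the sample, not a documented requirement.

```python
import re

# Assumed cue timing layout: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")
# Language prefixes taken from the expected format in this README.
PREFIXES = ("(yue)", "(en)")


def parse_cues(srt_text):
    """Split SRT text into cues; blank lines are ignored."""
    lines = [ln.strip() for ln in srt_text.splitlines() if ln.strip()]
    cues = []
    i = 0
    while i < len(lines):
        # A cue starts with a numeric index followed by a timing line.
        starts_cue = (
            lines[i].isdecimal()
            and i + 1 < len(lines)
            and TIMING.match(lines[i + 1])
        )
        if starts_cue:
            cues.append({"index": int(lines[i]), "timing": lines[i + 1], "text": []})
            i += 2
        else:
            # Any other line is subtitle text belonging to the current cue.
            if cues:
                cues[-1]["text"].append(lines[i])
            i += 1
    return cues


def missing_prefixes(cue):
    """Return the prefixes that no text line in the cue starts with."""
    return [p for p in PREFIXES if not any(t.startswith(p) for t in cue["text"])]


sample = """1
00:00:00,672 --> 00:00:04,960
(yue)粵文字幕:
www.cantocaptions.com
(en) Cantonese subtitles:
www.cantocaptions.com

2
00:01:30,406 --> 00:01:33,659
(yue)（遙遠嘅大陸北方盡頭）
(en) (The northern end of the distant continent)
"""

cues = parse_cues(sample)
for cue in cues:
    print(cue["index"], cue["timing"], missing_prefixes(cue))
```

A check like this could help fill in the "Extra Formatting Need?" column, since a model output that drops a prefix or a timing line would show up as a missing prefix or a cue that fails to parse.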

The original spoken Cantonese subtitles are provided by [www.cantocaptions.com](https://cantocaptions.com).

The prompt to give to any LLM is available at `./frieren-prompt.txt`.

## Breakdown

| Model                         | Time taken | Extra formatting needed? | One shot? | Accuracy | Notes  | Cost |
| ----------------------------- | ---------- | ------------------------ | --------- | -------- | ------ | ---- |
| Anthropic Sonnet 3.5          |            | YES                      | NO        |          |        |      |
| Anthropic Sonnet 3.7          |            | YES                      | NO        |          |        |      |
| Anthropic Sonnet 3.7 Thinking |            | NO                       | YES       |          |        |      |
| Google Flash Thinking (exp)   |            | NO                       | YES       |          |        |      |
| OpenAI o1                     |            | NO                       | YES       |          |        |      |
| OpenAI o3-mini                |            | NO                       | YES       |          |        |      |
| OpenAI o3-mini-high           |            | NO                       | YES       |          |        |      |
| xAI Grok 3                    |            | NO                       | YES       |          |        |      |
| DeepSeek R1                   |            | NO                       | NO        |          |        |      |
| Alibaba Qwen 2.5 Max          | -          | -                        | -         | -        | Failed |      |
