{"id":13695767,"url":"https://github.com/stefan-it/turkish-bert","last_synced_at":"2025-04-08T06:37:10.133Z","repository":{"id":37545649,"uuid":"237817454","full_name":"stefan-it/turkish-bert","owner":"stefan-it","description":"Turkish BERT/DistilBERT, ELECTRA, ConvBERT and T5 models","archived":false,"fork":false,"pushed_at":"2025-03-04T00:52:38.000Z","size":1836,"stargazers_count":521,"open_issues_count":16,"forks_count":43,"subscribers_count":36,"default_branch":"master","last_synced_at":"2025-04-01T04:53:29.193Z","etag":null,"topics":["bert","convbert","distilbert","electra","t5","turkish"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stefan-it.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-02T18:37:15.000Z","updated_at":"2025-03-28T20:04:47.000Z","dependencies_parsed_at":"2023-02-15T18:01:32.067Z","dependency_job_id":"7d2ca297-8342-4328-a387-ca69cbd80757","html_url":"https://github.com/stefan-it/turkish-bert","commit_stats":{"total_commits":73,"total_committers":3,"mean_commits":"24.333333333333332","dds":0.09589041095890416,"last_synced_commit":"03ff254552c146c1b5f66b54d0b5f725dfc70c76"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stefan-it%2Fturkish-bert","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stefan-it%2Fturkish-bert/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stefan-it%2Fturkish-ber
t/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stefan-it%2Fturkish-bert/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stefan-it","download_url":"https://codeload.github.com/stefan-it/turkish-bert/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247792900,"owners_count":20996892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","convbert","distilbert","electra","t5","turkish"],"created_at":"2024-08-02T18:00:33.238Z","updated_at":"2025-04-08T06:37:10.083Z","avatar_url":"https://github.com/stefan-it.png","language":"Python","funding_links":[],"categories":["[🎓 research](https://github.com/stars/ketsapiwiq/lists/research)","Models","Python"],"sub_categories":["PHP"],"readme":"# 🇹🇷 BERTurk\n\n\u003cp align=\"center\"\u003e\n  \u003cimg alt=\"Logo provided by Merve Noyan\" title=\"Awesome logo from Merve Noyan\" src=\"https://raw.githubusercontent.com/stefan-it/turkish-bert/master/merve_logo.png\"\u003e\n\u003c/p\u003e\n\n[![DOI](https://zenodo.org/badge/237817454.svg)](https://zenodo.org/badge/latestdoi/237817454)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14963493.svg)](https://doi.org/10.5281/zenodo.14963493)\n\nWe present community-driven BERT, DistilBERT, ELECTRA, ConvBERT and T5 models for Turkish 🎉\n\nSome datasets used for pretraining and evaluation are contributed from the\nawesome Turkish NLP community, as well as the decision for the BERT model name: BERTurk.\n\nLogo is provided by [Merve 
Noyan](https://twitter.com/mervenoyann).\n\n# Changelog\n\n* 03.03.2025: Release of [BERT5urk](https://huggingface.co/stefan-it/bert5urk) - pretrained on Turkish part of [FineWeb2 corpus](https://huggingface.co/datasets/HuggingFaceFW/fineweb-2).\n* 21.12.2024: New evaluations with Flair are added.\n* 23.09.2021: Release of uncased ELECTRA and ConvBERT models and cased ELECTRA model, all trained on mC4 corpus.\n* 24.06.2021: Release of new ELECTRA model, trained on Turkish part of mC4 dataset. Repository got new awesome logo from Merve Noyan.\n* 16.03.2021: Release of *ConvBERTurk* model and more evaluations on different downstream tasks.\n* 12.05.2020: Release of ELEC**TR**A ([small](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)\n              and [base](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)) models, see [here](electra/README.md).\n* 25.03.2020: Release of *BERTurk* uncased model and *BERTurk* models with larger vocab size (128k, cased and uncased).\n* 11.03.2020: Release of the cased distilled *BERTurk* model: *DistilBERTurk*.\n              Available on the [Hugging Face model hub](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)\n* 17.02.2020: Release of the cased *BERTurk* model.\n              Available on the [Hugging Face model hub](https://huggingface.co/dbmdz/bert-base-turkish-cased)\n* 10.02.2020: Training corpus update, new TensorBoard links, new results for cased model.\n* 02.02.2020: Initial version of this repo.\n\n# Pretraining Corpora Stats\n\nThe current version of the model is trained on a filtered and sentence\nsegmented version of the Turkish [OSCAR corpus](https://traces1.inria.fr/oscar/),\na recent Wikipedia dump, various [OPUS corpora](http://opus.nlpl.eu/) and a\nspecial corpus provided by [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/).\n\nThe final training corpus has a size of 35GB and 4,404,976,662 tokens.\n\nThanks to Google's TensorFlow Research Cloud (TFRC) we 
can train both cased and\nuncased models on a TPU v3-8. You can find the TensorBoard outputs for\nthe training here:\n\n* [TensorBoard cased model](https://tensorboard.dev/experiment/ZgFk8LclQOKdW0pYWviLMg/)\n* [TensorBoard uncased model](https://tensorboard.dev/experiment/5LlD11cWRwexyqKSEPPXGA/)\n\nWe also provide cased and uncased models that use a larger vocab size (128k instead of 32k).\n\nA detailed cheatsheet of how the models were trained can be found [here](CHEATSHEET.md).\n\n## C4 Multilingual dataset (mC4)\n\nWe've also trained an ELECTRA (cased) model on the recently released Turkish part of the\n[multilingual C4 (mC4) corpus](https://github.com/allenai/allennlp/discussions/5265) from the AI2 team.\n\nAfter filtering documents with a broken encoding, the training corpus has a size of 242GB, resulting\nin 31,240,963,926 tokens.\n\nWe used the original 32k vocab (instead of creating a new one).\n\n# Turkish Model Zoo\n\nHere's an overview of all available models, incl. their training corpus size:\n\n| Model name                 | Model hub link                                                                      | Pre-training corpus size |\n|----------------------------|-------------------------------------------------------------------------------------|--------------------------|\n| ELECTRA Small (cased)      | [here](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)      | 35GB                     |\n| ELECTRA Base (cased)       | [here](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)       | 35GB                     |\n| ELECTRA Base mC4 (cased)   | [here](https://huggingface.co/dbmdz/electra-base-turkish-mc4-cased-discriminator)   | 242GB                    |\n| ELECTRA Base mC4 (uncased) | [here](https://huggingface.co/dbmdz/electra-base-turkish-mc4-uncased-discriminator) | 242GB                    |\n| BERTurk (cased, 32k)       | [here](https://huggingface.co/dbmdz/bert-base-turkish-cased)         
               | 35GB                     |\n| BERTurk (uncased, 32k)     | [here](https://huggingface.co/dbmdz/bert-base-turkish-uncased)                      | 35GB                     |\n| BERTurk (cased, 128k)      | [here](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased)                   | 35GB                     |\n| BERTurk (uncased, 128k)    | [here](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased)                 | 35GB                     |\n| DistilBERTurk (cased)      | [here](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)                  | 35GB                     |\n| ConvBERTurk (cased)        | [here](https://huggingface.co/dbmdz/convbert-base-turkish-cased)                    | 35GB                     |\n| ConvBERTurk mC4 (cased)    | [here](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-cased)                | 242GB                    |\n| ConvBERTurk mC4 (uncased)  | [here](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-uncased)              | 242GB                    |\n| BERT5urk                   | [here](https://huggingface.co/stefan-it/bert5urk)                                   | 262GB                    |\n\n# *DistilBERTurk*\n\nThe distilled version of a cased model, so-called *DistilBERTurk*, was trained\non 7GB of the original training data, using the cased version of *BERTurk*\nas the teacher model.\n\n*DistilBERTurk* was trained with the official Hugging Face implementation from\n[here](https://github.com/huggingface/transformers/tree/master/examples/distillation).\n\nThe cased model was trained for 5 days on 4 RTX 2080 Ti GPUs.\n\nMore details about distillation can be found in the\n[\"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter\"](https://arxiv.org/abs/1910.01108)\npaper by Sanh et al. (2019).\n\n# ELECTRA\n\nIn addition to the *BERTurk* models, we also trained ELEC**TR**A small and base models. 
A detailed overview can be found\nin the [ELECTRA section](electra/README.md).\n\n# ConvBERTurk\n\nIn addition to the BERT and ELECTRA based models, we also trained a ConvBERT model. The ConvBERT architecture is presented\nin the [\"ConvBERT: Improving BERT with Span-based Dynamic Convolution\"](https://arxiv.org/abs/2008.02496) paper.\n\nWe follow a different training procedure: instead of using a two-phase approach, that pre-trains the model for 90% with 128\nsequence length and 10% with 512 sequence length, we pre-train the model with 512 sequence length for 1M steps on a v3-32 TPU.\n\nMore details about the pre-training can be found [here](convbert/README.md).\n\n# mC4 ELECTRA\n\nIn addition to the ELEC**TR**A base model, we also trained an ELECTRA model on the Turkish part of the mC4 corpus. We use a\nsequence length of 512 over the full training time and train the model for 1M steps on a v3-32 TPU.\n\n# BERT5urk\n\nBERT5urk is a new 1.42B encoder-decoder model based on the [efficient](https://arxiv.org/abs/2109.10686) [T5 architecture](https://arxiv.org/abs/1910.10683) and\npretrained with the [UL2 objective](https://arxiv.org/abs/2205.05131).\n\nThe model was pretrained with the awesome [T5X](https://github.com/google-research/t5x) library for 2M steps with a\nbatch size of 128 and an input and output sequence length of 512 for 16.56 days on a v3-32 TPU Pod.\n\nThe Turkish part of the amazing [FineWeb2](https://huggingface.co/datasets/HuggingFaceFW/fineweb-2) is used as\npretraining corpus. Basic filtering with a minimum language score of 0.99 is performed resulting in a 262GB corpus.\n\n# Evaluation\n\nIn 2024 we ran new evaluations on PoS tagging, NER and sentiment classification datasets. 
Prior evaluation results can be found [here](OLD_EVALUATIONS.md).\n\nAll evaluations are performed with the awesome Flair library, and the evaluation code and configs can be found in the\n[`experiments`](experiments) folder of this repository.\n\n## PoS Tagging\n\nThe Model Zoo is evaluated on the concatenation of the following PoS Tagging datasets from Universal Dependencies:\n\n* [Atis](https://github.com/UniversalDependencies/UD_Turkish-Atis)\n* [BOUN](https://github.com/UniversalDependencies/UD_Turkish-BOUN)\n* [FrameNet](https://github.com/UniversalDependencies/UD_Turkish-FrameNet)\n* [IMST](https://github.com/UniversalDependencies/UD_Turkish-IMST)\n* [Tourism](https://github.com/UniversalDependencies/UD_Turkish-Tourism)\n\nWe perform a hyper-parameter search over the following configurations:\n\n| Parameter     | Values         |\n|---------------|----------------|\n| Batch Size    | `[16, 8]`      |\n| Learning Rate | `[3e-5, 5e-5]` |\n| Epoch         | `[3]`          |\n\nWe report accuracy averaged over 5 runs (with different seeds):\n\n| Model Name                                                                                                | Best Configuration | Best Development Score | Best Test Score |\n|-----------------------------------------------------------------------------------------------------------|--------------------|-----------------------:|----------------:|\n| [BERTurk (cased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased)                        | `bs16-e3-lr5e-05`  |           93.93 ± 0.04 |    94.50 ± 0.07 |\n| [BERTurk (uncased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased)                    | `bs8-e3-lr5e-05`   |           93.84 ± 0.04 |    94.41 ± 0.13 |\n| [BERTurk (cased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-cased)                              | `bs16-e3-lr5e-05`  |           93.95 ± 0.05 |    94.57 ± 0.04 |\n| [BERTurk (uncased, 
32k)](https://huggingface.co/dbmdz/bert-base-turkish-uncased)                          | `bs16-e3-lr5e-05`  |           93.84 ± 0.04 |    94.38 ± 0.03 |\n| [ConvBERTurk (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-cased)                           | `bs8-e3-lr5e-05`   |           94.03 ± 0.07 |    94.58 ± 0.06 |\n| [ConvBERTurk mC4 (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-cased)                   | `bs8-e3-lr5e-05`   |       **94.04** ± 0.05 |    94.59 ± 0.06 |\n| [ConvBERTurk mC4 (uncased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-uncased)               | `bs8-e3-lr5e-05`   |           93.90 ± 0.08 |    94.52 ± 0.04 |\n| [DistilBERTurk (cased)](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)                       | `bs8-e3-lr5e-05`   |           93.52 ± 0.03 |    94.19 ± 0.04 |\n| [ELECTRA Base (cased)](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)             | `bs16-e3-lr5e-05`  |           93.89 ± 0.05 |    94.45 ± 0.05 |\n| [ELECTRA Base mC4 (cased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-cased-discriminator)     | `bs16-e3-lr5e-05`  |           93.88 ± 0.05 |    94.53 ± 0.11 |\n| [ELECTRA Base mC4 (uncased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-uncased-discriminator) | `bs8-e3-lr5e-05`   |           93.80 ± 0.09 |    94.41 ± 0.04 |\n| [ELECTRA Small (cased)](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)           | `bs8-e3-lr5e-05`   |           93.15 ± 0.04 |    93.88 ± 0.06 |\n| [BERT5urk](https://huggingface.co/stefan-it/bert5urk)                                                     | `bs8-e3-lr5e-05`   |           93.75 ± 0.04 |    94.33 ± 0.06 |\n\n## Named Entity Recognition\n\nThe Model Zoo is evaluated on the Turkish split of the WikiANN dataset, using the following hyper-parameter search:\n\n| Parameter     | Values         |\n|---------------|----------------|\n| Batch Size    | `[16, 8]`      |\n| Learning 
Rate | `[3e-5, 5e-5]` |\n| Epoch         | `[10]`         |\n\nAveraged F1-Score over 5 runs (with different seeds):\n\n| Model Name                                                                                                | Best Configuration | Best Development Score | Best Test Score |\n|-----------------------------------------------------------------------------------------------------------|--------------------|-----------------------:|----------------:|\n| [BERTurk (cased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased)                        | `bs8-e10-lr3e-05`  |           93.92 ± 0.07 |    93.92 ± 0.16 |\n| [BERTurk (uncased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased)                    | `bs16-e10-lr3e-05` |           93.59 ± 0.05 |    93.29 ± 0.11 |\n| [BERTurk (cased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-cased)                              | `bs8-e10-lr3e-05`  |           93.36 ± 0.04 |    93.26 ± 0.14 |\n| [BERTurk (uncased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-uncased)                          | `bs8-e10-lr3e-05`  |           93.13 ± 0.19 |    92.96 ± 0.06 |\n| [ConvBERTurk (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-cased)                           | `bs8-e10-lr3e-05`  |       **93.93** ± 0.07 |    93.93 ± 0.05 |\n| [ConvBERTurk mC4 (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-cased)                   | `bs8-e10-lr3e-05`  |           93.89 ± 0.07 |    93.57 ± 0.06 |\n| [ConvBERTurk mC4 (uncased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-uncased)               | `bs8-e10-lr3e-05`  |           93.68 ± 0.13 |    93.58 ± 0.15 |\n| [DistilBERTurk (cased)](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)                       | `bs8-e10-lr5e-05`  |           91.80 ± 0.05 |    91.17 ± 0.03 |\n| [ELECTRA Base (cased)](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)             | `bs8-e10-lr3e-05`  
|           93.58 ± 0.12 |    93.60 ± 0.09 |\n| [ELECTRA Base mC4 (cased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-cased-discriminator)     | `bs16-e10-lr3e-05` |           93.51 ± 0.09 |    93.42 ± 0.11 |\n| [ELECTRA Base mC4 (uncased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-uncased-discriminator) | `bs16-e10-lr5e-05` |           93.01 ± 0.12 |    92.94 ± 0.13 |\n| [ELECTRA Small (cased)](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)           | `bs8-e10-lr5e-05`  |           91.42 ± 0.09 |    91.07 ± 0.09 |\n| [BERT5urk](https://huggingface.co/stefan-it/bert5urk)                                                     | `bs8-e10-lr5e-05`  |       **93.93** ± 0.10 |    93.66 ± 0.10 |\n\n## Sentiment Classification\n\nThe Model Zoo is additionally evaluated on the [OffensEval-TR 2020](stefan-it/offenseval2020_tr) dataset for sentiment\nclassification.\n\nThe following parameters are used for a hyper-parameter search:\n\n| Parameter     | Values         |\n|---------------|----------------|\n| Batch Size    | `[16, 8]`      |\n| Learning Rate | `[3e-5, 5e-5]` |\n| Epoch         | `[3]`          |\n\nAveraged Macro F1-Score over 5 runs (with different seeds) is reported:\n\n| Model Name                                                                                                | Best Configuration | Best Development Score | Best Test Score |\n|-----------------------------------------------------------------------------------------------------------|--------------------|-----------------------:|----------------:|\n| [BERTurk (cased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased)                        | `bs16-e3-lr3e-05`  |           81.30 ± 0.61 |    81.72 ± 0.47 |\n| [BERTurk (uncased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased)                    | `bs16-e3-lr3e-05`  |           80.31 ± 0.54 |    82.16 ± 0.27 |\n| [BERTurk (cased, 
32k)](https://huggingface.co/dbmdz/bert-base-turkish-cased)                              | `bs16-e3-lr5e-05`  |           79.64 ± 0.50 |    80.65 ± 0.40 |\n| [BERTurk (uncased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-uncased)                          | `bs16-e3-lr3e-05`  |           80.87 ± 0.22 |    81.68 ± 0.37 |\n| [ConvBERTurk (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-cased)                           | `bs16-e3-lr3e-05`  |       **82.22** ± 0.41 |    82.29 ± 0.34 |\n| [ConvBERTurk mC4 (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-cased)                   | `bs16-e3-lr3e-05`  |           82.16 ± 0.46 |    82.10 ± 0.30 |\n| [ConvBERTurk mC4 (uncased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-uncased)               | `bs16-e3-lr3e-05`  |           81.69 ± 0.29 |    81.81 ± 0.37 |\n| [DistilBERTurk (cased)](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)                       | `bs16-e3-lr3e-05`  |           78.54 ± 0.55 |    79.12 ± 0.17 |\n| [ELECTRA Base (cased)](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)             | `bs16-e3-lr3e-05`  |           79.76 ± 0.24 |    81.69 ± 0.38 |\n| [ELECTRA Base mC4 (cased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-cased-discriminator)     | `bs8-e3-lr3e-05`   |           80.34 ± 0.67 |    82.14 ± 0.27 |\n| [ELECTRA Base mC4 (uncased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-uncased-discriminator) | `bs16-e3-lr5e-05`  |           80.46 ± 0.80 |    81.52 ± 0.56 |\n| [ELECTRA Small (cased)](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)           | `bs16-e3-lr5e-05`  |           77.25 ± 0.47 |    79.89 ± 0.28 |\n| [BERT5urk](https://huggingface.co/stefan-it/bert5urk)                                                     | `bs8-e3-lr0.00015` |           82.20 ± 0.88 |    82.78 ± 0.44 |\n\n## Overall\n\nThe following table shows the performance of all models over all 
datasets:\n\n| Model Name                                                                                                | Overall Development | Overall Test |\n|-----------------------------------------------------------------------------------------------------------|--------------------:|-------------:|\n| [BERTurk (cased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-cased)                        |               89.72 |        90.05 |\n| [BERTurk (uncased, 128k)](https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased)                    |               89.25 |        89.95 |\n| [BERTurk (cased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-cased)                              |               88.98 |        89.49 |\n| [BERTurk (uncased, 32k)](https://huggingface.co/dbmdz/bert-base-turkish-uncased)                          |               89.28 |        89.67 |\n| [ConvBERTurk (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-cased)                           |           **90.06** |        90.27 |\n| [ConvBERTurk mC4 (cased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-cased)                   |               90.03 |        90.09 |\n| [ConvBERTurk mC4 (uncased)](https://huggingface.co/dbmdz/convbert-base-turkish-mc4-uncased)               |               89.76 |        89.97 |\n| [DistilBERTurk (cased)](https://huggingface.co/dbmdz/distilbert-base-turkish-cased)                       |               87.95 |        88.16 |\n| [ELECTRA Base (cased)](https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator)             |               89.08 |        89.91 |\n| [ELECTRA Base mC4 (cased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-cased-discriminator)     |               89.24 |        90.03 |\n| [ELECTRA Base mC4 (uncased)](https://huggingface.co/dbmdz/electra-base-turkish-mc4-uncased-discriminator) |               89.09 |        89.62 |\n| [ELECTRA Small 
(cased)](https://huggingface.co/dbmdz/electra-small-turkish-cased-discriminator)           |               87.27 |        88.28 |\n| [BERT5urk](https://huggingface.co/stefan-it/bert5urk)                                                     |               89.96 |        90.26 |\n\n# Model usage\n\nAll trained models can be used from the [DBMDZ](https://github.com/dbmdz) Hugging Face [model hub page](https://huggingface.co/dbmdz)\nusing their model name.\n\nExample usage with 🤗/Transformers:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n```\n\nThis loads the *BERTurk* cased model. The recently introduced ELEC**TR**A base model can be loaded with:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/electra-base-turkish-cased-discriminator\")\n\n# AutoModelWithLMHead is deprecated in recent Transformers releases; AutoModel loads the bare encoder.\nmodel = AutoModel.from_pretrained(\"dbmdz/electra-base-turkish-cased-discriminator\")\n```\n\n# Citation\n\nYou can use the following BibTeX entry for citation:\n\n```bibtex\n@software{stefan_schweter_2020_3770924,\n  author       = {Stefan Schweter},\n  title        = {BERTurk - BERT models for Turkish},\n  month        = apr,\n  year         = 2020,\n  publisher    = {Zenodo},\n  version      = {1.0.0},\n  doi          = {10.5281/zenodo.3770924},\n  url          = {https://doi.org/10.5281/zenodo.3770924}\n}\n```\n\nIf you are using newer models - such as BERT5urk - please use this BibTeX entry for citation:\n\n```bibtex\n@software{stefan_schweter_2025_14963493,\n  author       = {Stefan Schweter},\n  title        = {BERTurk v2},\n  month        = mar,\n  year         = 2025,\n  publisher    = {Zenodo},\n  version      = {2.0.0},\n  doi          = {10.5281/zenodo.14963493},\n  url          = {https://doi.org/10.5281/zenodo.14963493},\n  swhid        = {swh:1:dir:575750c0320a0fd3f1cc74cfd036fc1459f994cc\n                   ;origin=https://doi.org/10.5281/zenodo.3770923;vis\n                 
  it=swh:1:snp:d0c6e71f3a152fb42e82b118c2873552d63a7\n                   e96;anchor=swh:1:rel:cdd678b8b0efb7ced863bd1a0cc50\n                   5fdcd4cf34a;path=stefan-it-turkish-bert-2cd933b\n                  },\n}\n```\n\n# Acknowledgments\n\nThanks to [Kemal Oflazer](http://www.andrew.cmu.edu/user/ko/) for providing us\nadditional large corpora for Turkish. Many thanks to Reyyan Yeniterzi for providing\nus the Turkish NER dataset for evaluation.\n\nWe would like to thank [Merve Noyan](https://twitter.com/mervenoyann) for the\nawesome logo!\n\nResearch supported with Cloud TPUs from the awesome [TRC program](https://sites.research.google/trc/about/).\n\nMany thanks for providing access to the TPUs over a lot of years ❤️\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstefan-it%2Fturkish-bert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstefan-it%2Fturkish-bert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstefan-it%2Fturkish-bert/lists"}