{"id":44623564,"url":"https://github.com/dbmdz/berts","last_synced_at":"2026-02-14T14:37:12.674Z","repository":{"id":39375082,"uuid":"209994578","full_name":"dbmdz/berts","owner":"dbmdz","description":"DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models","archived":false,"fork":false,"pushed_at":"2022-12-06T21:01:32.000Z","size":44,"stargazers_count":149,"open_issues_count":18,"forks_count":10,"subscribers_count":14,"default_branch":"main","last_synced_at":"2023-11-07T18:17:23.080Z","etag":null,"topics":["bert","bert-model","electra","german","transformers"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dbmdz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-21T13:58:28.000Z","updated_at":"2023-10-30T12:26:17.000Z","dependencies_parsed_at":"2023-01-23T15:16:38.809Z","dependency_job_id":null,"html_url":"https://github.com/dbmdz/berts","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/dbmdz/berts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dbmdz%2Fberts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dbmdz%2Fberts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dbmdz%2Fberts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dbmdz%2Fberts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dbmdz","download_url":"https://codeload.github.com/dbmdz/berts/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/dbmdz%2Fberts/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29447410,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-14T14:10:32.461Z","status":"ssl_error","status_checked_at":"2026-02-14T14:09:49.945Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","bert-model","electra","german","transformers"],"created_at":"2026-02-14T14:37:10.785Z","updated_at":"2026-02-14T14:37:12.667Z","avatar_url":"https://github.com/dbmdz.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤗 + 📚 dbmdz BERT models\n\nIn this repository the MDZ Digital Library team (dbmdz) at the Bavarian State\nLibrary open sources further BERT models 🎉\n\n# Changelog\n\n* 13.12.2021: Public release of Historic Language Model for Dutch.\n* 06.12.2021: Public release of smaller multilingual Historic Language Models.\n* 18.11.2021: Public release of multilingual and monolingual Historic Language Models.\n* 24.09.2021: Public release of cased/uncased Turkish ELECTRA and ConvBERT models, trained on mC4 corpus.\n* 17.08.2021: Public release of re-trained German GPT-2 model.\n* 24.06.2021: Public release of Turkish ELECTRA model, trained on Turkish part of multilingual C4 corpus.\n* 16.03.2021: Public release of ConvBERT model for Turkish: *ConvBERTurk*.\n* 06.02.2021: Public release of German Europeana DistilBERT and 
ConvBERT models.\n* 16.11.2020: Public release of French Europeana BERT and ELECTRA models.\n* 15.11.2020: Public release of a German GPT-2 model.\n* 11.11.2020: Public release of Ukrainian ELECTRA model.\n* 02.11.2020: Public release of Italian XXL ELECTRA model.\n* 26.10.2020: In collaboration with [Branden Chan](https://github.com/brandenchan) and [Timo Möller](https://github.com/Timoeller) from [deepset](https://deepset.ai/) we've trained larger language models for German. See our [paper](https://arxiv.org/abs/2010.10906) for more information!\n* 12.05.2020: Public release of small and base ELECTRA models for Turkish\n* 25.03.2020: Public release of *BERTurk* uncased model and *BERTurk* models with larger vocab size (128k, cased and uncased)\n* 11.03.2020: Public release of cased distilled BERT model for Turkish: *DistilBERTurk*\n* 17.02.2020: Public release of cased BERT model for Turkish: *BERTurk*\n* 10.02.2020: Public release of cased and uncased BERT models for Historic German: German Europeana BERT\n* 20.01.2020: Public release of cased and uncased XXL BERT models for Italian. 
They can be downloaded from\n              the [Huggingface model hub](https://huggingface.co/dbmdz).\n* 30.12.2019: Public release of cased and uncased BERT models for Italian.\n* 08.12.2019: If you consider using our model for the upcoming GermEval 2020 shared task,\n              please read at least this [blog post](https://medium.com/@emilymenonbender/is-there-research-that-shouldnt-be-done-is-there-research-that-shouldn-t-be-encouraged-b1bf7d321bb6)\n              by Emily Bender on ethical issues!\n* 10.10.2019: Public release\n* 24.09.2019: Initial version\n\n# German BERT\n\n## Stats\n\nIn addition to the recently released [German BERT](https://deepset.ai/german-bert)\nmodel by [deepset](https://deepset.ai/) we provide another German-language model.\n\nThe source data for the model consists of a recent Wikipedia dump, EU Bookshop corpus,\nOpen Subtitles, CommonCrawl, ParaCrawl and News Crawl. This results in a dataset with\na size of 16GB and 2,350,234,427 tokens.\n\nFor sentence splitting, we use [spacy](https://spacy.io/). Our preprocessing steps\n(sentence piece model for vocab generation) follow those used for training\n[SciBERT](https://github.com/allenai/scibert). The model was trained with an initial\nsequence length of 512 subwords for 1.5M steps.\n\nThis release includes both cased and uncased models.\n\n## Model weights\n\nCurrently only PyTorch-[Transformers](https://github.com/huggingface/transformers)\ncompatible weights are available. 
If you need access to TensorFlow checkpoints,\nplease raise an issue!\n\n| Model                            | Downloads\n| -------------------------------- | ---------------------------------------------------------------------------------------------------------------\n| `bert-base-german-dbmdz-cased`   | [`config.json`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-config.json) • [`pytorch_model.bin`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin) • [`vocab.txt`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-vocab.txt)\n| `bert-base-german-dbmdz-uncased` | [`config.json`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-config.json) • [`pytorch_model.bin`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin) • [`vocab.txt`](https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-vocab.txt)\n\n## Usage\n\nWith Transformers \u003e= 2.3 our German BERT models can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-german-cased\")\n```\n\n## Results\n\nFor results on downstream tasks like NER or PoS tagging, please refer to\n[this repository](https://github.com/stefan-it/fine-tuned-berts-seq).\n\n# Italian BERT\n\nThe source data for the Italian BERT model consists of a recent Wikipedia dump and\nvarious texts from the [OPUS corpora](http://opus.nlpl.eu/) collection. 
The final\ntraining corpus has a size of 13GB and 2,050,057,573 tokens.\n\nFor sentence splitting, we use NLTK (faster than spacy).\nOur cased and uncased models are trained with an initial sequence length of 512\nsubwords for ~2-3M steps.\n\nFor the XXL Italian models, we use the same training data from OPUS and extend\nit with data from the Italian part of the [OSCAR corpus](https://traces1.inria.fr/oscar/).\nThus, the final training corpus has a size of 81GB and 13,138,379,147 tokens.\n\nNote: Unfortunately, a wrong vocab size was used when training the XXL models.\nThis explains the mismatch between the \"real\" vocab size of 31102 and the\nvocab size specified in `config.json`. However, the model is working and all\nevaluations were done under those circumstances.\nSee [this issue](https://github.com/dbmdz/berts/issues/7) for more information.\n\nThe Italian ELECTRA model was trained on the \"XXL\" corpus for 1M steps in total using a batch\nsize of 128. We largely follow the ELECTRA training procedure used for\n[BERTurk](https://github.com/stefan-it/turkish-bert/tree/master/electra).\n\n## Model weights\n\nCurrently only PyTorch-[Transformers](https://github.com/huggingface/transformers)\ncompatible weights are available. 
If you need access to TensorFlow checkpoints,\nplease raise an issue!\n\n| Model                                                | Downloads\n| ---------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------\n| `dbmdz/bert-base-italian-cased`                      | [`config.json`](https://cdn.huggingface.co/dbmdz/bert-base-italian-cased/config.json)                                               • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/bert-base-italian-cased/pytorch_model.bin)                      • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/bert-base-italian-cased/vocab.txt)\n| `dbmdz/bert-base-italian-uncased`                    | [`config.json`](https://cdn.huggingface.co/dbmdz/bert-base-italian-uncased/config.json)                                             • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/bert-base-italian-uncased/pytorch_model.bin)                    • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/bert-base-italian-uncased/vocab.txt)\n| `dbmdz/bert-base-italian-xxl-cased`                  | [`config.json`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-cased/config.json)                                           • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-cased/pytorch_model.bin)                  • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-cased/vocab.txt)\n| `dbmdz/bert-base-italian-xxl-uncased`                | [`config.json`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-uncased/config.json)                                         • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-uncased/pytorch_model.bin)                • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/bert-base-italian-xxl-uncased/vocab.txt)\n| `dbmdz/electra-base-italian-xxl-cased-discriminator` | 
[`config.json`](https://s3.amazonaws.com/models.huggingface.co/bert/dbmdz/electra-base-italian-xxl-cased-discriminator/config.json) • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/electra-base-italian-xxl-cased-discriminator/pytorch_model.bin) • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/electra-base-italian-xxl-cased-discriminator/vocab.txt)\n| `dbmdz/electra-base-italian-xxl-cased-generator`     | [`config.json`](https://s3.amazonaws.com/models.huggingface.co/bert/dbmdz/electra-base-italian-xxl-cased-generator/config.json)     • [`pytorch_model.bin`](https://cdn.huggingface.co/dbmdz/electra-base-italian-xxl-cased-generator/pytorch_model.bin)     • [`vocab.txt`](https://cdn.huggingface.co/dbmdz/electra-base-italian-xxl-cased-generator/vocab.txt)\n\n## Results\n\nFor results on downstream tasks like NER or PoS tagging, please refer to\n[this repository](https://github.com/stefan-it/italian-bertelectra).\n\n## Usage\n\nWith Transformers \u003e= 2.3 our Italian BERT models can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-italian-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-cased\")\n```\n\nTo load the (recommended) Italian XXL BERT models, just use:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-italian-xxl-cased\")\n```\n\n# German Europeana BERT, DistilBERT, ELECTRA and ConvBERT\n\nWe use the open source [Europeana newspapers](http://www.europeana-newspapers.eu/)\nthat were provided by *The European Library*. 
The final\ntraining corpus has a size of 51GB and consists of 8,035,986,369 tokens.\n\nDetailed information about the data and pretraining steps can be found in\n[this repository](https://github.com/stefan-it/europeana-bert).\n\n## Model weights\n\nThe following models are available from the Hugging Face model hub:\n\n| Model                                                     | Downloads\n| --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------\n| `dbmdz/bert-base-german-europeana-cased`                  | See [model hub](https://huggingface.co/dbmdz/bert-base-german-europeana-cased)\n| `dbmdz/bert-base-german-europeana-uncased`                | See [model hub](https://huggingface.co/dbmdz/bert-base-german-europeana-uncased)\n| `dbmdz/electra-base-german-europeana-cased-discriminator` | See [model hub](https://huggingface.co/dbmdz/electra-base-german-europeana-cased-discriminator)\n| `dbmdz/electra-base-german-europeana-cased-generator`     | See [model hub](https://huggingface.co/dbmdz/electra-base-german-europeana-cased-generator)\n| `dbmdz/convbert-base-german-europeana-cased`              | See [model hub](https://huggingface.co/dbmdz/convbert-base-german-europeana-cased)\n| `dbmdz/distilbert-base-german-europeana-cased`            | See [model hub](https://huggingface.co/dbmdz/distilbert-base-german-europeana-cased)\n\n## Results\n\nFor results on Historic NER, please refer to [this repository](https://github.com/stefan-it/europeana-bert).\n\n## Usage\n\nWith Transformers \u003e= 2.3 our German Europeana BERT models can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-europeana-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-german-europeana-cased\")\n```\n\nThe German Europeana BERT uncased model can be loaded 
like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/bert-base-german-europeana-uncased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-german-europeana-uncased\")\n```\n\n# French Europeana BERT and ELECTRA\n\nWe use the open source [Europeana newspapers](http://www.europeana-newspapers.eu/)\nthat were provided by *The European Library*. The final\ntraining corpus has a size of 63GB and consists of 11,052,528,456 tokens.\n\nDetailed information about the data and pretraining steps can be found in\n[this repository](https://github.com/stefan-it/europeana-bert).\n\n## Model weights\n\n| Model                                                     | Downloads\n| --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------\n| `dbmdz/bert-base-french-europeana-cased`                  | See [model hub](https://huggingface.co/dbmdz/bert-base-french-europeana-cased)\n| `dbmdz/electra-base-french-europeana-cased-discriminator` | See [model hub](https://huggingface.co/dbmdz/electra-base-french-europeana-cased-discriminator)\n| `dbmdz/electra-base-french-europeana-cased-generator`     | See [model hub](https://huggingface.co/dbmdz/electra-base-french-europeana-cased-generator)\n\n## Usage\n\nWith Transformers \u003e= 2.3 our French Europeana BERT and ELECTRA models can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\nmodel_name = \"dbmdz/bert-base-french-europeana-cased\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModel.from_pretrained(model_name)\n```\n\nThe ELECTRA (discriminator) model can be used with:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\nmodel_name = \"dbmdz/electra-base-french-europeana-cased-discriminator\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = 
AutoModel.from_pretrained(model_name)\n```\n\n# Turkish BERT: BERTurk, DistilBERTurk, ELECTRA and ConvBERTurk\n\nBERTurk are community-driven cased models for Turkish.\n\nSome datasets used for pretraining and evaluation were contributed by the\nawesome Turkish NLP community, which also decided on the BERT model name: BERTurk.\n\nThe final training corpus has a size of 35GB and 4,404,976,662 tokens.\n\nDetailed information about the data and pretraining steps can be found in\n[this repository](https://github.com/stefan-it/turkish-bert).\n\nAdditionally, we trained a distilled version of BERTurk, *DistilBERTurk*, which\nuses knowledge distillation from BERTurk (teacher model). More information on\ndistillation can be found in the excellent [\"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter\"](https://arxiv.org/abs/1910.01108)\npaper by Sanh et al. (2019).\n\nFurthermore, we provide cased and uncased models trained with a larger vocab size (128k instead of 32k).\n\nWe also trained small and base ELECTRA models. ELECTRA is a new method for self-supervised language\nrepresentation learning. More details about ELECTRA can be found in the\n[ICLR paper](https://openreview.net/forum?id=r1xMH1BtvB).\n\nIn addition to the BERT and ELECTRA based models, we also trained a ConvBERT model. 
The ConvBERT architecture is presented\nin the [\"ConvBERT: Improving BERT with Span-based Dynamic Convolution\"](https://arxiv.org/abs/2008.02496) paper.\n\nEvaluation of our models can be found in\n[this repository](https://github.com/stefan-it/turkish-bert/electra).\n\nWe've also trained an ELECTRA (cased) model on the recently released Turkish part of the\n[multilingual C4 (mC4) corpus](https://github.com/allenai/allennlp/discussions/5265) from the AI2 team.\n\nAfter filtering documents with a broken encoding, the training corpus has a size of 242GB resulting\nin 31,240,963,926 tokens.\n\n## Model weights\n\nAll trained models can be used from the [DBMDZ](https://github.com/dbmdz) Hugging Face [model hub page](https://huggingface.co/dbmdz)\nusing their model name. The following models are available:\n\n* *BERTurk* models with 32k vocabulary: `dbmdz/bert-base-turkish-cased` and `dbmdz/bert-base-turkish-uncased`\n* *BERTurk* models with 128k vocabulary: `dbmdz/bert-base-turkish-128k-cased` and `dbmdz/bert-base-turkish-128k-uncased`\n* *ELECTRA* small and base cased models (discriminator): `dbmdz/electra-small-turkish-cased-discriminator` and `dbmdz/electra-base-turkish-cased-discriminator`\n* *ELECTRA* base cased and uncased models, trained on Turkish part of mC4 corpus (discriminator): `dbmdz/electra-base-turkish-mc4-cased-discriminator` and `dbmdz/electra-base-turkish-mc4-uncased-discriminator`\n* *ConvBERTurk* model with 32k vocabulary: `dbmdz/convbert-base-turkish-cased`\n* *ConvBERTurk* base cased and uncased models, trained on Turkish part of mC4 corpus: `dbmdz/convbert-base-turkish-mc4-cased` and `dbmdz/convbert-base-turkish-mc4-uncased`\n\n## Results\n\nFor results on PoS tagging or NER tasks, please refer to [this repository](https://github.com/stefan-it/turkish-bert).\n\n## Usage\n\nWith Transformers \u003e= 2.3 our BERTurk cased model can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = 
AutoTokenizer.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/bert-base-turkish-cased\")\n```\n\nThe DistilBERTurk model can be loaded with:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/distilbert-base-turkish-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/distilbert-base-turkish-cased\")\n```\n\nOur ELECTRA models can be used with Transformers \u003e= 2.8 and can be loaded with:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\n# The discriminator is loaded as a plain encoder (no LM head)\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/electra-base-turkish-cased-discriminator\")\nmodel = AutoModel.from_pretrained(\"dbmdz/electra-base-turkish-cased-discriminator\")\n```\n\nand\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\n# The discriminator is loaded as a plain encoder (no LM head)\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/electra-base-turkish-mc4-cased-discriminator\")\nmodel = AutoModel.from_pretrained(\"dbmdz/electra-base-turkish-mc4-cased-discriminator\")\n```\n\nOur ConvBERT model can be used with Transformers \u003e= 4.3 and can be loaded with:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"dbmdz/convbert-base-turkish-cased\")\nmodel = AutoModel.from_pretrained(\"dbmdz/convbert-base-turkish-cased\")\n```\n\n# Ukrainian ELECTRA\n\nThe source data for the Ukrainian ELECTRA model consists of two corpora:\n\n* Recent Wikipedia dump\n* Deduplicated Ukrainian part from the [OSCAR](https://oscar-corpus.com/) corpus\n\nThe final training corpus has a size of 30GB and consists of exactly 2,402,761,324 tokens.\n\nDetailed information about the data and pretraining steps can be found in\n[this repository](https://github.com/stefan-it/ukrainian-electra).\n\n## Model weights\n\nCurrently only PyTorch-[Transformers](https://github.com/huggingface/transformers)\ncompatible weights are available. 
If you need access to TensorFlow checkpoints,\nplease raise an issue!\n\n| Model                                              | Downloads\n| -------------------------------------------------- | --------------------------------------------------------------------------------------------------\n| `dbmdz/electra-base-ukrainian-cased-discriminator` | See [model hub](https://huggingface.co/dbmdz/electra-base-ukrainian-cased-discriminator/tree/main)\n| `dbmdz/electra-base-ukrainian-cased-generator`     | See [model hub](https://huggingface.co/dbmdz/electra-base-ukrainian-cased-generator/tree/main)\n\n## Results\n\nFor results on PoS tagging and NER downstream tasks, please refer to [this repository](https://github.com/stefan-it/ukrainian-electra).\n\n## Usage\n\nWith Transformers \u003e= 2.3 our Ukrainian ELECTRA model can be loaded like:\n\n```python\nfrom transformers import AutoModel, AutoTokenizer\n\nmodel_name = \"dbmdz/electra-base-ukrainian-cased-discriminator\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_name)\n\n# Use the imported AutoModel class to load the discriminator\nmodel = AutoModel.from_pretrained(model_name)\n```\n\n# German GPT-2 model\n\nThe German GPT-2 model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or \"dangerous\"\nas the English GPT-3 model.\n\nFor training we use largely the same corpora as for training the DBMDZ BERT model. 
We created a 50K byte-level BPE vocab based\non the training corpora.\n\nThe model was trained on one v3-8 TPU over the whole training corpus for 20 epochs.\n\nDetailed information can be found in [this repository](https://github.com/stefan-it/german-gpt2).\n\n**Note**: we have released a re-trained version of this model with better results!\n\n## Model weights\n\nIn addition to the German GPT-2 model, we release a GPT-2 model that was fine-tuned on a normalized version of Faust I and II.\n\n| Model                                 | Downloads\n| ------------------------------------- | --------------------\n| `dbmdz/german-gpt2`                   | See [model hub](https://huggingface.co/dbmdz/german-gpt2/tree/main)\n| `dbmdz/german-gpt2-faust` (old model) | See [model hub](https://huggingface.co/dbmdz/german-gpt2-faust/tree/main)\n\n## Usage\n\nWith Transformers \u003e= 2.3 our German GPT-2 model can be used for text generation:\n\n```python\nfrom transformers import pipeline\n\npipe = pipeline(\"text-generation\", model=\"dbmdz/german-gpt2\",\n                tokenizer=\"dbmdz/german-gpt2\")\n\n# Generation settings such as max_length are passed when calling the pipeline\ntext = pipe(\"Der Sinn des Lebens ist es\", max_length=800)[0][\"generated_text\"]\n\nprint(text)\n```\n\n# Historic Language Models\n\nWe release several BERT-based language models, incl. a multilingual Historic language model that covers\nGerman, French, English, Finnish and Swedish, as well as monolingual Historic language models for English,\nFinnish and Swedish. 
The multilingual Historic language model was trained on 130GB of texts, extracted\nfrom the Europeana Newspapers and British Library corpora.\n\nMore details about our Historic Language Models can be found in\n[this repository](https://github.com/stefan-it/clef-hipe/blob/main/hlms.md).\n\n## Model weights\n\nAll models are available on the Hugging Face model hub:\n\n| Model identifier                              | Model Hub link\n| --------------------------------------------- | --------------------------------------------------------------------------\n| `dbmdz/bert-base-historic-multilingual-cased` | [here](https://huggingface.co/dbmdz/bert-base-historic-multilingual-cased)\n| `dbmdz/bert-base-historic-english-cased`      | [here](https://huggingface.co/dbmdz/bert-base-historic-english-cased)\n| `dbmdz/bert-base-finnish-europeana-cased`     | [here](https://huggingface.co/dbmdz/bert-base-finnish-europeana-cased)\n| `dbmdz/bert-base-swedish-europeana-cased`     | [here](https://huggingface.co/dbmdz/bert-base-swedish-europeana-cased)\n\nWe also released smaller Historic Language Models:\n\n| Model identifier                                | Model Hub link\n| ----------------------------------------------- | ---------------------------------------------------------------------------\n| `dbmdz/bert-tiny-historic-multilingual-cased`   | [here](https://huggingface.co/dbmdz/bert-tiny-historic-multilingual-cased)\n| `dbmdz/bert-mini-historic-multilingual-cased`   | [here](https://huggingface.co/dbmdz/bert-mini-historic-multilingual-cased)\n| `dbmdz/bert-small-historic-multilingual-cased`  | [here](https://huggingface.co/dbmdz/bert-small-historic-multilingual-cased)\n| `dbmdz/bert-medium-historic-multilingual-cased` | [here](https://huggingface.co/dbmdz/bert-medium-historic-multilingual-cased)\n\n# Historic Dutch\n\nWe train a language model on the\n[Delpher Corpus](https://www.delpher.nl/over-delpher/delpher-open-krantenarchief/download-teksten-kranten-1618-1879),\nthat 
includes digitized texts from Dutch newspapers, ranging from 1618 to 1879.\n\nThe total training corpus consists of 427,181,269 sentences and 3,509,581,683 tokens (counted via `wc`),\nresulting in a total corpus size of 21GB.\n\nMore details about the Historic Dutch language model can be found in\n[this repository](https://github.com/stefan-it/delpher-lm).\n\n## Model weights\n\nThe following models for Historic Dutch are available on the Hugging Face Model Hub:\n\n| Model identifier                       | Model Hub link\n| -------------------------------------- | -------------------------------------------------------------------\n| `dbmdz/bert-base-historic-dutch-cased` | [here](https://huggingface.co/dbmdz/bert-base-historic-dutch-cased)\n\n# License\n\nAll models are licensed under [MIT](LICENSE).\n\n# Huggingface model hub\n\nAll models are available on the [Huggingface model hub](https://huggingface.co/dbmdz).\n\n# Papers\n\n[Here you can find a list of papers](papers.md) that used one of our trained models.\nFeel free to open a PR/issue if you want your paper to be included!\n\n# Contact (Bugs, Feedback, Contribution and more)\n\nFor questions about our BERT models just open an issue\n[here](https://github.com/dbmdz/berts/issues/new) 🤗\n\n# Acknowledgments\n\nResearch supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).\nThanks for providing access to the TFRC ❤️\n\nThanks to the generous support from the [Hugging Face](https://huggingface.co/) team,\nit is possible to download both cased and uncased models from their S3 storage 🤗\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdbmdz%2Fberts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdbmdz%2Fberts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdbmdz%2Fberts/lists"}