Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/megagonlabs/t5-japanese
Code to pre-train Japanese T5 models
natural-language-processing nlp t5 transformer
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/megagonlabs/t5-japanese
- Owner: megagonlabs
- License: apache-2.0
- Created: 2021-08-24T04:18:30.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2021-09-07T05:11:08.000Z (about 3 years ago)
- Last Synced: 2024-08-02T13:23:55.579Z (3 months ago)
- Topics: natural-language-processing, nlp, t5, transformer
- Language: Python
- Homepage: https://huggingface.co/megagonlabs/t5-base-japanese-web
- Size: 65.4 KB
- Stars: 39
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-japanese-llm - Megagon Labs T5 (base-japanese-web). Training data: Japanese mC4 (87,425,304 pages, 782 GB) and Japanese wiki40b (828,236 articles, 2 GB). Developer: Megagon Labs (Recruit). License: Apache 2.0. Category: models mainly used for text generation / models trained from scratch.
README
# t5-japanese
[![CircleCI](https://circleci.com/gh/megagonlabs/t5-japanese/tree/master.svg?style=svg)](https://circleci.com/gh/megagonlabs/t5-japanese/tree/master)
[![Typos](https://github.com/megagonlabs/t5-japanese/actions/workflows/typos.yml/badge.svg)](https://github.com/megagonlabs/t5-japanese/actions/workflows/typos.yml)
[![Shellcheck](https://github.com/megagonlabs/t5-japanese/actions/workflows/shellcheck.yml/badge.svg)](https://github.com/megagonlabs/t5-japanese/actions/workflows/shellcheck.yml)

Code to pre-train T5 (Text-to-Text Transfer Transformer) models on Japanese web texts.
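
As background, T5 is pre-trained with a span-corruption objective: spans of the input are replaced by sentinel tokens, and the model learns to emit the dropped spans. The toy sketch below only illustrates that objective and is not the code in this repository. The function name `span_corrupt`, the hand-picked spans, and the whitespace-split English tokens are assumptions made for illustration; a real pipeline samples spans at random over subword tokens, and the `<extra_id_N>` sentinel spelling follows the Hugging Face convention.

```python
from typing import List, Sequence, Tuple


def span_corrupt(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Mask each [start, end) span with a sentinel and build the target.

    `spans` must be sorted and non-overlapping, and each span must
    contain at least one token.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        if not (cursor <= start < end <= len(tokens)):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # mask the span in the input
        targets.append(sentinel)             # the target names the span...
        targets.extend(tokens[start:end])    # ...then spells out its tokens
        cursor = end
    inputs.extend(tokens[cursor:])              # keep the tail after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


tokens = ["Thank", "you", "for", "inviting", "me", "to",
          "your", "party", "last", "week", "."]
inputs, targets = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))
# Thank you <extra_id_0> me to your party <extra_id_1> week .
print(" ".join(targets))
# <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

Passing the spans explicitly keeps the example deterministic; replacing them with randomly sampled spans would bring it closer to an actual pre-training setup.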

We have published the following models:

- [megagonlabs/t5-base-japanese-web (32k)](https://huggingface.co/megagonlabs/t5-base-japanese-web)
- [megagonlabs/t5-base-japanese-web-8k (8k)](https://huggingface.co/megagonlabs/t5-base-japanese-web-8k)

## Documents
- [Pre-training T5 with TPU](docs/mC4_wiki40b.md)
## Links
- Repositories
    - [T5](https://github.com/google-research/text-to-text-transfer-transformer)
    - [mT5](https://github.com/google-research/multilingual-t5)
- Related models
    - [Japanese T5 pre-trained model (sonoisa/t5-base-japanese)](https://huggingface.co/sonoisa/t5-base-japanese)
    - [Japanese T5 pre-trained model (sonoisa/t5-base-japanese-mC4-Wikipedia)](https://huggingface.co/sonoisa/t5-base-japanese-mC4-Wikipedia)
- Articles (in Japanese)
    - [Part 7: Evaluating text generation with T5 (February 26, 2020)](https://www.ogis-ri.co.jp/otc/hiroba/technical/similar-document-search/part7.html)
    - [Part 8: Evaluating text generation with T5, continued (April 23, 2020)](https://www.ogis-ri.co.jp/otc/hiroba/technical/similar-document-search/part8.html)
    - [Part 14: Trying out T5 with Hugging Face Transformers (April 22, 2021)](https://www.ogis-ri.co.jp/otc/hiroba/technical/similar-document-search/part14.html)

## License
Apache License 2.0