Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

awesome-japanese-llm

日本語LLMまとめ - Overview of Japanese LLMs
https://github.com/llm-jp/awesome-japanese-llm

Last synced: 4 days ago
JSON representation

テキスト生成に主に使うモデル
- フルスクラッチ事前学習モデル
 - 日本語ニュースBART - base-japanese-news)) | 日本語ビジネスニュース記事（約2,100万記事 (2.9億文)） | ストックマーク | MIT |
 - Tanuki-8B - llm-team/Tanuki-8B), [**8B**-Before-Context-Length-Extension](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Before-Context-Length-Extension), [**8B**-Instruct-without-DPO](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO), [**8B**-Instruct](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct)) | 8,192 （8B-Before-Context-Length-Extension のみ 2,048） | 事前学習: 日本語 CommonCrawl, 様々な日本語コーパス, 英語論文, 英語 Wikipedia, コードデータ （計 **280B** トークン） Instruction Tuning: MinnadeChat, LLMChat, 合成データ [^15] | 松尾研LLM開発プロジェクト Team たぬき | Apache 2.0 |
- 海外モデルに日本語で継続事前学習を行ったモデル
 - lightblue/japanese-mpt-7b
 - NovelAI/genji-jp - J (**6b**) | NovelAI | ？ |
 - KARAKURI LM - v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-v0.1), [70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1)) | Llama 2 (**70b**) | 事前学習: mC4, CC100, OSCAR, RedPajama, 独自のデータセット (計 **16B** トークン) SteerLM: OASST2, 独自のデータセット | カラクリ | Llama 2 Community License[^13] |
 - KARAKURI LM 8x7B Instruct v0.1 - instruct-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-instruct-v0.1)) | Mixtral-8x7B-Instruct-v0.1 (**46.7b**) | Swallow-MX 8x7B に対して以下のデータセットで学習: Dolly Dataset, OASST2, HelpSteer, glaive-code-assistant-v3, glaive-function-calling-v2, synthetic_text_to_sql, MetaMathQA, orca-math-word-problems-200k, rag-dataset-12000, rag-hallucination-dataset-1000, 独自のデータセット | カラクリ | Apache 2.0 (?)[^12] |
 - ABEJA-Mixtral-8x7B-japanese - v0.1-japanese](https://huggingface.co/abeja/Mixtral-8x7B-v0.1-japanese), [8x7B-Instruct-v0.1-japanese](https://huggingface.co/abeja/Mixtral-8x7B-Instruct-v0.1-japanese), [8x7B-Instruct-v0.1-japanese-alpha](https://huggingface.co/abeja/Mixtral-8x7B-Instruct-v0.1-japanese-alpha), [8x7B-Instruct-v0.1-japanese-alpha-merged](https://huggingface.co/abeja/Mixtral-8x7B-Instruct-v0.1-japanese-alpha-merged)) | Mixtral-8x7B-Instruct-v0.1 (**46.7b**) \*Instructが名前に付いていないモデルのみ Mixtral-8x7B-v0.1 がベース | 事前学習: Japanese CC, Redpajama, 独自 （計 **450B** トークン） | ABEJA | Apache 2.0 |
 - Llama 3 neoAI 8B Chat v0.1 - Chat-v0.1](https://huggingface.co/neoai-inc/Llama-3-neoAI-8B-Chat-v0.1)) | Llama 3 (**8b**) | 不明 | neoAI | Llama 3 Community License |
 - houou-7b - 7b-v1](https://huggingface.co/moneyforward/houou-instruction-7b-v1), [instruction-7b-v2](https://huggingface.co/moneyforward/houou-instruction-7b-v2), [instruction-7b-v3](https://huggingface.co/moneyforward/houou-instruction-7b-v3)) | Llama 2 (**7b**) | Youri 7B (base) に対して Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) | マネーフォワード | Llama 2 Community License |
 - cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
 - Llama3-Preferred-MedSwallow-70B - Preferred-MedSwallow-70B)) | 医療 | Llama 3 (**70b**) | Preferred Networks | Llama 3 Community License |
 - ELYZA-japanese-Llama-2-7b - japanese-Llama-2-7b), [7b-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-instruct), [7b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast), [7b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast-instruct)) | Llama 2 (**7b**) | 事前学習: 日本語 Wikipedia, Japanese OSCAR, その他クロールデータなど (計 **18B** トークン) Instruction Tuning: 独自のデータセット | ELYZA | Llama 2 Community License |
 - ChatNTQ JA 7B - v1.0](https://huggingface.co/NTQAI/chatntq-ja-7b-v1.0)) | Mistral-7B-v0.1 (**7b**) | Japanese Stable LM Gamma 7B (base) に対して独自のデータセットで Instruction Tuning | NTQ Solution | Apache 2.0 |
 - Shisa Gamma 7B - v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)) | Mistral-7B-v0.1 (**7b**) | Japanese Stable LM Gamma 7B (base) に対して ultra-orca-boros-en-ja で Instruction Tuning | AUGMXNT | Apache 2.0 (?)[^12] |
 - AIBunCho/japanese-novel-gpt-j-6b - J (**6b**) | 個人 ([大曽根宏幸](https://scholar.google.co.jp/citations?user=6ID5K3oAAAAJ)) | CreativeML OpenRAIL-M License |
 - AIgroup-CVM-utokyohospital/MedSwallow-70b - NC-SA 4.0 |
 - nekomata-14b-pfn-qfin - 14b-pfn-qfin), [qfin-inst-merge](https://huggingface.co/pfnet/nekomata-14b-pfn-qfin-inst-merge)) | 金融 | Qwen (**14b**) | Preferred Networks | Tongyi Qianwen LICENSE |
 - LEIA-Swallow-7B - llm/Leia-Swallow-7b)) | Llama 2 (**7b**) | Swallow 7B に対して LEIA で追加学習 | 個人 ([山田育矢](https://scholar.google.com/citations?user=M7YivToAAAAJ), [李凌寒](https://scholar.google.co.jp/citations?user=z9is5FAAAAAJ)) | Llama 2 Community License |
 - turing-motors/Llama-3-heron-brain-70B-v0.3
 - cyberagent/Mistral-Nemo-Japanese-Instruct-2408
 - turing-motors/Llama-3-heron-brain-8B-v0.3
 - Shisa 7B - 7b-v1](https://huggingface.co/augmxnt/shisa-base-7b-v1), [7b-v1](https://huggingface.co/augmxnt/shisa-7b-v1)) | Mistral-7B-v0.1 (**7b**) | 事前学習: shisa-pretrain-en-ja-v1 (**8B** トークン) Instruction Tuning & DPO: ultra-orca-boros-en-ja, shisa-en-ja-dpo-v1 | AUGMXNT | Apache 2.0 (?)[^12] |
 - karasu-1.1B
 - SambaLingo-Japanese - Japanese-Base), [Chat](https://huggingface.co/sambanovasystems/SambaLingo-Japanese-Chat)) | Llama 2 (**7b**) | 事前学習: CulturaX Instruction Tuning: ultrachat_200k DPO: ultrafeedback, cai-conversation-harmless | SambaNova Systems | Llama 2 Community License (?)[^12] |
 - Swallow 7B - hf](https://huggingface.co/tokyotech-llm/Swallow-7b-hf), [7b-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-hf), [7b-instruct-v0.1](https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-v0.1), [7b-NVE-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-NVE-hf), [7b-NVE-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-NVE-instruct-hf), [7b-plus-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-plus-hf)) | Llama 2 (**7b**) | 事前学習: 日本語 Wikipedia, RefinedWeb, Swallow Corpus, The Pile Instruction Tuning: Dolly Dataset, HH RLHF, OASST1 *v0.1モデルでは OASST1, OASST2 を使用 | Swallowプロジェクト | Llama 2 Community License |
 - ELYZA-japanese-Llama-2-13b - japanese-Llama-2-13b), [13b-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-13b-instruct), [13b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-13b-fast), [13b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-13b-fast-instruct)) | Llama 2 (**13b**) | 事前学習: 日本語 Wikipedia, Japanese OSCAR, その他クロールデータなど (計 **18B** トークン) Instruction Tuning: 独自のデータセット | ELYZA | Llama 2 Community License |
 - Youri 7B - 7b), [7b-instruction](https://huggingface.co/rinna/youri-7b-instruction), [7b-chat](https://huggingface.co/rinna/youri-7b-chat), [7b-gptq](https://huggingface.co/rinna/youri-7b-gptq), [7b-instruction-gptq](https://huggingface.co/rinna/youri-7b-instruction-gptq), [7b-chat-gptq](https://huggingface.co/rinna/youri-7b-chat-gptq)) | Llama 2 (**7b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット (計 **40B** トークン) Instruction Tuning: Dolly Dataset, FLAN, llm-japanese-datasetの一部 | rinna | Llama 2 Community License |
 - blue-lizard - lizard](https://huggingface.co/Deepreneur/blue-lizard)) | Llama 2 (**7b**) | 不明 | Deepreneur | Llama 2 Community License |
 - kotomamba-2.8B-CL - 2.8b-slimpj (**2.8b**) | 日本語 Wikipedia, Swallow Corpus, SlimPajama | Kotoba Technologies | Apache 2.0 |
 - Japanese Stable LM 2 1.6B - stablelm-2-base-1_6b), [instruct](https://huggingface.co/stabilityai/japanese-stablelm-2-instruct-1_6b)) | Stable LM 2 1.6B (**1.6b**) | 事前学習: Wikipedia, CulturaX Instruction Tuning: jaster, [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), alpaca-gpt4-japanese, ultra-orca-boros-en-ja-v1 | Stability AI | STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE |
 - ELYZA-japanese-CodeLlama-7b - japanese-CodeLlama-7b), [7b-instruct](https://huggingface.co/elyza/ELYZA-japanese-CodeLlama-7b-instruct)) | コーディング | Code Llama (**7b**) | ELYZA | Llama 2 Community License |
 - Swallow-MS 7B - v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1), [7b-instruct-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-instruct-v0.1)) | Mistral-7B-v0.1 (**7b**) | 事前学習: Algebraic Stack, Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile Instruction Tuning: Dolly Dataset, OASST1 | Swallowプロジェクト | Apache 2.0 |
 - Gemma 2 Baku 2B - 2-baku-2b), [2b-it](https://huggingface.co/rinna/gemma-2-baku-2b-it)) | Gemma 2 (**2b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット (計 **80B** トークン) OPRO: 独自のデータセット [^20] | rinna | Gemma Terms of Use |
 - Watashiha-Llama-2-13B-Ogiri-sft - Llama-2-13B-Ogiri-sft), [sft-neuron](https://huggingface.co/watashiha/Watashiha-Llama-2-13B-Ogiri-sft-neuron)) | 大喜利 | Llama 2 (**13b**) | わたしは | Llama 2 Community License |
 - Llama 3 tedllm - electron-device-ai/llama3-tedllm-8b-v0)) | Llama 3 (**8b**) | 事前学習: 日本語の一般コーパス | 東京エレクトロンデバイス | Llama 3 Community License |
 - Llama 3 ELYZA JP 8B - 3-ELYZA-JP-8B), [8B-GGUF](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-GGUF), [8B-AWQ](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B-AWQ)) | Llama 3 (**8b**) | 不明 | ELYZA | Llama 3 Community License |
 - Japanese Stable LM Beta 7B - beta-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-beta-7b), [base-ja_vocab-beta-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-ja_vocab-beta-7b), [instruct-beta-7b](https://huggingface.co/stabilityai/japanese-stablelm-instruct-beta-7b), [instruct-ja_vocab-beta-7b](https://huggingface.co/stabilityai/japanese-stablelm-instruct-ja_vocab-beta-7b)) | Llama 2 (**7b**) | 事前学習: Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, SlimPajama(Books3を除外) (計 **100B** トークン) Instruction Tuning: Dolly Dataset, HH RLHF, OASST1 | Stability AI | Llama 2 Community License |
 - Nekomata 7B - 7b), [7b-instruction](https://huggingface.co/rinna/nekomata-7b-instruction), [7b-gguf](https://huggingface.co/rinna/nekomata-7b-gguf), [7b-instruction-gguf](https://huggingface.co/rinna/nekomata-7b-instruction-gguf)) | Qwen (**7b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット (計 **66B** トークン) Instruction Tuning: Dolly Dataset, FLAN, llm-japanese-datasetの一部 | rinna | Tongyi Qianwen LICENSE |
 - KARAKURI LM 8x7B Chat v0.1 - chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-chat-v0.1)) | Mixtral-8x7B-Instruct-v0.1 (**46.7b**) | Swallow-MX 8x7B に対して SteerLM: OASST2, HelpSteer, 独自のデータセット | カラクリ | Apache 2.0 |
 - Llama 3 Youko 8B - 3-youko-8b), [8b-instruct](https://huggingface.co/rinna/llama-3-youko-8b-instruct), [8b-gptq](https://huggingface.co/rinna/llama-3-youko-8b-gptq), [8b-instruct-gptq](https://huggingface.co/rinna/llama-3-youko-8b-instruct-gptq)) | Llama 3 (**8b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット (計 **22B** トークン) Instruction Tuning[^11]: Aya Dataset (Japanese subset), FLAN, Dolly Dataset, HH RLHF, OASST1, OASST2, MetaMathQA, CodeAlpaca Dataset, 独自のデータセット DPO: HelpSteer, HelpSteer2, 独自のデータセット | rinna | Llama 3 Community License |
 - Llama 3 Swallow 8B - v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-v0.1), [8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1)) | Llama 3 (**8b**) | 事前学習: Algebraic Stack, Wikipedia, RefinedWeb, Swallow Corpus, Cosmopedia, Laboro ParaCorpus, OpenWebMath Instruction Tuning: OASST1 [^17] | Swallowプロジェクト | Llama 3 Community License |
 - Llama 3.1 Swallow 8B - v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [8B-v0.2](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.2), [8B-Instruct-v0.2](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.2), [8B-Instruct-v0.3](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3)) | Llama 3.1 (**8b**) | 事前学習: The Stack v2, Wikipedia, DCLM-baseline-1.0, Swallow Corpus Version 2, Cosmopedia, Laboro ParaCorpus Instruction Tuning: lmsys-chat-1m-synth-ja-wo-pii-and-template-instructions, lmsys-chat-1m-synth-en-wo-pii-and-template-instructions, filtered-magpie-ultra-ja, filtered-magpie-ultra-en, gemma-magpie | Swallowプロジェクト | Llama 3.1 Community License (Instructモデルは Gemma Terms of Use も適用) |
 - Japanese Stable LM 3B-4E1T - 4e1t-base](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-base), [3b-4e1t-instruct](https://huggingface.co/stabilityai/japanese-stablelm-3b-4e1t-instruct)) | StableLM-3B-4E1T (**3b**) | 事前学習: Wikipedia, Japanese mC4, Japanese CC-100, Japanese OSCAR, SlimPajama(Books3を除外) (計 **100B** トークン) Instruction Tuning: Dolly Dataset, HH RLHF, llm-japanese-datasetのwikinews subset | Stability AI | Apache 2.0 |
- 海外モデルに日本語で事後学習のみ行ったモデル
 - AIgroup-CVM-utokyohospital/Llama-2-70b-chat-4bit-japanese
 - NTQAI/chatntq-7b-jpntuned - 4 World (**7b**) || NTQ Solution | ？ |
 - AXCXEPT/Llama-3.1-70B-EZO-1.1-it
 - AXCXEPT/EZO-Common-9B-gemma-2-it
 - AXCXEPT/EZO-Humanities-9B-gemma-2-it
 - AXCXEPT/Llama-3.1-8B-EZO-1.1-it
 - AXCXEPT/Llama-3-EZO-8b-Common-it
 - AXCXEPT/EZO-Common-T2-2B-gemma-2-it
 - doshisha-mil/llama-2-70b-chat-4bit-japanese-v1
 - Sparticle/llama-2-13b-chat-japanese-lora
 - izumi-lab/llama-13b-japanese-lora-v0-1ep
 - Llama 3 Suzume 8B - japanese](https://huggingface.co/lightblue/suzume-llama-3-8B-japanese), [8B-japanese-gguf](https://huggingface.co/lightblue/suzume-llama-3-8B-japanese-gguf)) | Llama 3 (**8b**) | megagonlabs/instruction_ja, ShareGPT, 独自のデータセット | Lightblue | Llama 3 Community License (?)[^12] |
 - ganchengguang/Yoko-7B-Japanese-v1
 - Sparticle/llama-2-7b-chat-japanese-lora
 - izumi-lab/llama-7b-japanese-lora-v0-5ep
 - lightblue/jod - 7B-SlimOrca (**7b**) || Lightblue | Apache 2.0 |
 - ao-Karasu - karasu-72B)) | Qwen1.5 (**72b**) | ultra-orca-boros-en-ja-v1, OASST1, ShareGPT, 日本語の公開技術ブログ, ニュース記事, QAサイトの回答, 独自のデータセット | Lightblue | Tongyi Qianwen LICENSE (?)[^12] |
 - Borea - Phi-3.5-mini-Instruct-Jp), [Common](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Common), [Coding](https://huggingface.co/AXCXEPT/Borea-Phi-3.5-mini-Instruct-Coding)) | Phi-3.5 (**3.8b**) | | Axcxept | MIT |
 - 日本語版 Gemma 2 2B - jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it)) | Gemma 2 (**2b**) || Google | Gemma Terms of Use |
 - JMedLoRA - jmedlora-6.89ep](https://huggingface.co/AIgroup-CVM-utokyohospital/llama2-jmedlora-6.89ep)) | 医療 | Llama 2 (**70b**) | 東京大学医学部附属病院循環器内科 AIグループ | CC BY-NC 4.0 |
 - AXCXEPT/EZO-Qwen2.5-72B-Instruct - AutoCoTRAG-Qwen2.5-72B-Instruct_q4](https://huggingface.co/AXCXEPT/EZO-AutoCoTRAG-Qwen2.5-72B-Instruct_q4) | Qwen2.5 (**72b**) || Axcxept | Qwen License |
 - AXCXEPT/EZO-Qwen2.5-32B-Instruct - AutoCoTRAG-Qwen2.5-32B-Instruct](https://huggingface.co/AXCXEPT/EZO-AutoCoTRAG-Qwen2.5-32B-Instruct) | Qwen2.5 (**32b**) || Axcxept | Apache 2.0 |
 - AXCXEPT/EZO-Llama-3.2-3B-Instruct-dpoE
 - AXCXEPT/EZO-gemma-2-2b-jpn-it
 - Llama 3 shisa-v1-llama3-70b - ai/shisa-v1-llama3-70b)) | Llama 3 (**70b**) | ultra-orca-boros-en-ja-v1 | Shisa.AI | Llama 3 Community License (?)[^12] |
 - Llama 3 shisa-v1-llama3-8b - ai/shisa-v1-llama3-8b)) | Llama 3 (**8b**) | ultra-orca-boros-en-ja-v1 | Shisa.AI | Llama 3 Community License (?)[^12] |
 - Qarasu - chat-plus-unleashed](https://huggingface.co/lightblue/qarasu-14B-chat-plus-unleashed)) | Qwen (**14b**) | ultra-orca-boros-en-ja-v1, OASST1, ShareGPT, 独自のデータセット | Lightblue | Tongyi Qianwen LICENSE (?)[^12] |
- フルスクラッチ学習モデル
 - LLM-jp-13B - v1.0](https://huggingface.co/llm-jp/llm-jp-1.3b-v1.0), [**13b**-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-v1.0), [**13b**-instruct-full-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-v1.0), [**13b**-instruct-full-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0), [**13b**-instruct-full-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly-oasst-v1.0), [**13b**-instruct-lora-jaster-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-v1.0), [**13b**-instruct-lora-jaster-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0), [**13b**-instruct-lora-dolly-oasst-v1.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-dolly-oasst-v1.0)) | 2,048 | 事前学習: [llm-jp-corpus](https://github.com/llm-jp/llm-jp-corpus) (Wikipedia, Japanese mC4, The Pile, Stack) (計 **300B** トークン) Instruction Tuning (Full-parameter FT or LoRA): jaster, Dolly Dataset, OASST1 | LLM-jp | Apache 2.0 |
 - Weblab-10B - NeoX ([**10b**](https://huggingface.co/matsuo-lab/weblab-10b), [**10b**-instruction-sft](https://huggingface.co/matsuo-lab/weblab-10b-instruction-sft)) | 2,048 | Japanese mC4 + The Pile（計 **600B** トークン） \*instruction-sft モデルは Alpaca Dataset, FLAN でファインチューニング | 東大松尾研 | CC BY-NC 4.0 |
 - OpenCALM - NeoX ([small](https://huggingface.co/cyberagent/open-calm-small), [medium](https://huggingface.co/cyberagent/open-calm-medium), [large](https://huggingface.co/cyberagent/open-calm-large), [**1b(1.4b)**](https://huggingface.co/cyberagent/open-calm-1b), [**3b(2.7b)**](https://huggingface.co/cyberagent/open-calm-3b), [**7b(6.8b)**](https://huggingface.co/cyberagent/open-calm-7b)) | 2,048 | 日本語 Wikipedia + Jpanese mC4 + Japanese CC-100 | サイバーエージェント | CC BY-SA 4.0 |
 - rinna GPT (英語やコードも含めて学習されたモデル) - NeoX ([**4b(3.8b)**](https://huggingface.co/rinna/bilingual-gpt-neox-4b), [**4b(3.8b)**-8k](https://huggingface.co/rinna/bilingual-gpt-neox-4b-8k), [**4b(3.8b)**-instruction-sft](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-sft), [**4b(3.8b)**-instruction-ppo](https://huggingface.co/rinna/bilingual-gpt-neox-4b-instruction-ppo)) | 8kモデル: 8,192 他: 2,048 | Wikipedia, Japanese CC-100, Japanese C4, RedPajama, The Pile (計 **524B** トークン) \*8k モデルでは 4,000トークンを超える長いトークン列でファインチューニング \*instruction-sft モデルでは HH RLHF、FLAN でファインチューニング \*instruction-ppo モデルでは HH RLHF で PPO ベースの強化学習 | rinna | MIT |
 - rinna GPT (日本語のみで学習されたモデル) - NeoX ([xsmall](https://huggingface.co/rinna/japanese-gpt2-xsmall), [small](https://huggingface.co/rinna/japanese-gpt2-small), [medium](https://huggingface.co/rinna/japanese-gpt2-medium), [**1b**](https://huggingface.co/rinna/japanese-gpt-1b), [neox-small](https://huggingface.co/rinna/japanese-gpt-neox-small), [neox-**3.6b**](https://huggingface.co/rinna/japanese-gpt-neox-3.6b), [neox-**3.6b**-instruction-sft](https://huggingface.co/rinna/japanese-gpt-neox-3.6b-instruction-sft), [neox-**3.6b**-instruction-sft-v2](https://huggingface.co/rinna/japanese-gpt-neox-3.6b-instruction-sft-v2), [neox-**3.6b**-instruction-ppo](https://huggingface.co/rinna/japanese-gpt-neox-3.6b-instruction-ppo)) | ≤ 2,048 | 日本語 Wikipedia + Japanese CC-100 (1b 以降のモデルでは さらに Japanese mC4 を追加) \*instruction-sft, sft-v2 モデルでは HH RLHF、FLAN、SHP データセットでさらにファインチューニング \*instruction-ppo モデルでは HH RLHF でさらに PPO ベースの強化学習 | rinna | MIT |
 - 早大GPT - waseda/gpt2-small-japanese), [**xl(1.5b)**](https://huggingface.co/nlp-waseda/gpt2-xl-japanese)) | | 日本語 Wikipedia + Japanese CC-100 | 早大河原研 | CC BY-SA 4.0 |
 - 京大GPT - nlp/gpt2-small-japanese-char), [medium (文字レベル)](https://huggingface.co/ku-nlp/gpt2-medium-japanese-char), [large (文字レベル)](https://huggingface.co/ku-nlp/gpt2-large-japanese-char)) | | 日本語 Wikipedia (約2,700万文 (3.2GB)) + Japanese CC-100 (約6億1,900万文 (85GB)) + Japanese OSCAR (約3億2,600万文 (54GB)) | 京大言語メディア研究室 | CC BY-SA 4.0 |
 - 日本語BART - nlp/bart-base-japanese), [large](https://huggingface.co/ku-nlp/bart-large-japanese)) | | 日本語 Wikipedia (約1,800万文) | 京大言語メディア研究室 | CC BY-SA 4.0 |
 - Megagon Labs T5 - base-japanese-web)) | | Japanese mC4 (87,425,304 ページ (782 GB)) + Japanese wiki40b (828,236 記事 (2 GB)) | Megagon Labs (リクルート) | Apache 2.0 |
 - AcademicBART
 - Tanuki-8B - GENIAC/Tanuki-8B-dpo-v1.0), [v1.0-AWQ](https://huggingface.co/team-hatakeyama-phase2/Tanuki-8B-dpo-v1.0-AWQ), [v1.0-GPTQ-4bit](https://huggingface.co/team-hatakeyama-phase2/Tanuki-8B-dpo-v1.0-GPTQ-4bit), [v1.0-GPTQ-8bit](https://huggingface.co/team-hatakeyama-phase2/Tanuki-8B-dpo-v1.0-GPTQ-8bit), [v1.0-GGUF](https://huggingface.co/team-hatakeyama-phase2/Tanuki-8B-dpo-v1.0-GGUF)) | 4,096 | 事前学習: 様々な Web 上のデータ, 合成データ（計 **1.3T** トークン） SFT, DPO: 様々な合成データ [^19] | 松尾研LLM開発プロジェクト | Apache 2.0 |
 - Spiral-RetNet-3b-base - AI/Spiral-RetNet-3b-base)) | 2,048 | Wikipedia, Japanese CC-100, CulturaX | Spiral.AI | MIT |
 - kotomamba-2.8B - v1.0](https://huggingface.co/kotoba-tech/kotomamba-2.8B-v1.0)) | 2,048 | 日本語 Wikipedia, Swallow Corpus, SlimPajama | Kotoba Technologies | Apache 2.0 |
 - Stormy - NeoX ([**7b(6.8b)**](https://huggingface.co/izumi-lab/stormy-7b-10ep)) | 2,048 | OpenCALM (6.8b) に対して llm-japanese-dataset v0 のうち翻訳タスクを除いたデータで LoRAチューニング | 東大和泉研 | CC BY-SA 4.0 |
 - japanese-large-lm - NeoX ([**1.7b**](https://huggingface.co/line-corporation/japanese-large-lm-1.7b), [**3.6b**](https://huggingface.co/line-corporation/japanese-large-lm-3.6b), [**1.7b**-instruction-sft](https://huggingface.co/line-corporation/japanese-large-lm-1.7b-instruction-sft), [**3.6b**-instruction-sft](https://huggingface.co/line-corporation/japanese-large-lm-3.6b-instruction-sft)) | 2,048 | 日本語 Wikipedia, Japanese CC-100, Japanese C4, Japanese OSCAR や独自データなど (計 **650GB**) \*instruction-sft モデルでは OASST1 でファインチューニング | LINE | Apache 2.0 |
 - ABEJA GPT - NeoX ([large](https://huggingface.co/abeja/gpt2-large-japanese), [neox-**2.7b**](https://huggingface.co/abeja/gpt-neox-japanese-2.7b)) | | 日本語 Wikipedia + Japanese CC-100 + Japanese OSCAR | ABEJA | MIT |
 - ストックマークGPT - NeoX ([**1.4b**](https://huggingface.co/stockmark/gpt-neox-japanese-1.4b)) | | 日本語 Wikipedia (0.88B トークン) + Japanese CC-100 (10.5B トークン) + 独自のWebデータ (8.6B トークン) | ストックマーク | MIT |
 - イエローバックGPT - NeoX ([**1.3b**](https://huggingface.co/yellowback/gpt-neo-japanese-1.3B)) | | 日本語 Wikipedia + Japanese CC-100 + Japanese OSCAR | イエローバック | Apache 2.0 |
 - PLaMo-100B-Pretrained - 100b)) | 4,096 | 事前学習: Japanese CommonCrawl, RefinedWeb, 独自のデータセット (計: **2.0T** トークン) | Preferred Elements (Preferred Networks) | PLaMo Non-Commercial License |
 - PLaMo-13B - 13b), [**13b**-instruct](https://huggingface.co/pfnet/plamo-13b-instruct), [**13b**-instruct-nc](https://huggingface.co/pfnet/plamo-13b-instruct-nc)) | base: 4,096 instruct, instruct-nc: 8,192 | 事前学習: C4, Project Gutenberg, RedPajama, 日本語 Wikipedia, Japanese mC4 (計 **1.5T** トークン) Instruction Tuning: Dolly Dataset, HH RLHF, OASST1, llm-japanese-datasetのwikinews subset (NCモデルでは商用利用不可の Alpaca Dataset も含めて学習) | Preferred Networks | Apache 2.0 (NC モデルは CC BY-NC 4.0) |
 - Sarashina2-8x70B - 8x70b)) | 8,192 | Sarashina2 (70B) に対して Sparse Upcycling で学習 | SB Intuitions | Sarashina Model NonCommercial License |
 - Sarashina1 - NeoX ([**7b**](https://huggingface.co/sbintuitions/sarashina1-7b), [**13b**](https://huggingface.co/sbintuitions/sarashina1-13b), [**65b**](https://huggingface.co/sbintuitions/sarashina1-65b)) | 2,048 | 事前学習: Japanese Common Crawl (計 **1T** トークン) | SB Intuitions | MIT |
 - Fugaku-LLM - LLM/Fugaku-LLM-13B), [**13B**-instruct](https://huggingface.co/Fugaku-LLM/Fugaku-LLM-13B-instruct), [**13B**-instruct-gguf](https://huggingface.co/Fugaku-LLM/Fugaku-LLM-13B-instruct-gguf)) | 2,048 | 事前学習: 独自 Instruction Tuning: OASST1, Dolly Dataset, GSM8K | 東工大, 東北大, 富士通, 理研, 名大, サイバーエージェント, Kotoba Technologies | Fugaku-LLM Terms of Use |
 - LLM-jp-13B v2.0 - v2.0](https://huggingface.co/llm-jp/llm-jp-13b-v2.0), [**13b**-instruct-full-dolly-ichikara_004_001_single-oasst-oasst2-v2.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly-ichikara_004_001_single-oasst-oasst2-v2.0), [**13b**-instruct-full-ac_001-dolly-ichikara_004_001_single-oasst-oasst2-v2.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-ac_001-dolly-ichikara_004_001_single-oasst-oasst2-v2.0), [**13b**-instruct-full-ac_001_16x-dolly-ichikara_004_001_single-oasst-oasst2-v2.0](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-ac_001_16x-dolly-ichikara_004_001_single-oasst-oasst2-v2.0)) | 4,096 | 事前学習: [llm-jp-corpus-v2](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v2) (計 **260B** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), Dolly Dataset, OASST1, OASST2 | LLM-jp | Apache 2.0 |
 - LLM-jp-3 172B - jp/llm-jp-3-172b), [**172b**-instruct3](https://huggingface.co/llm-jp/llm-jp-3-172b-instruct3)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3) (計 **2.1T** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), [magpie-sft-v1.0](https://huggingface.co/datasets/llm-jp/magpie-sft-v1.0), Daring-Anteater, FLAN, ichikara-instruction-format, AutoMultiTurnByCalm3-22B, ramdom-to-fixed-multiturn-Calm3, wizardlm8x22b-logical-math-coding-sft-ja, wizardlm8x22b-logical-math-coding-sft_additional-ja, Synthetic-JP-EN-Coding-Dataset-567k DPO: 合成データ | 大規模言語モデル研究開発センター | 事前学習済みモデル: LLM-jp-3 172B Terms of Use 事後学習済みモデル: llm-jp-3-172b-instruct3利用許諾契約 |
 - LLM-jp-3 172B beta2 - beta2](https://huggingface.co/llm-jp/llm-jp-3-172b-beta2), [**172b**-beta2-instruct2](https://huggingface.co/llm-jp/llm-jp-3-172b-beta2-instruct2)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3)の一部 (計 **1.4T** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), [magpie-sft-v1.0](https://huggingface.co/datasets/llm-jp/magpie-sft-v1.0), Daring-Anteater, FLAN, ichikara-instruction-format, AutoMultiTurnByCalm3-22B, ramdom-to-fixed-multiturn-Calm3, wizardlm8x22b-logical-math-coding-sft-ja, wizardlm8x22b-logical-math-coding-sft_additional-ja, Synthetic-JP-EN-Coding-Dataset-567k | 大規模言語モデル研究開発センター | LLM-jp-3 172B beta2 Terms of Use |
 - LLM-jp-3 172B beta1 - beta1](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1), [**172b**-beta1-instruct](https://huggingface.co/llm-jp/llm-jp-3-172b-beta1-instruct)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3)の一部 (計 **0.7T** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), Dolly Dataset, OASST1, OASST2, Aya Dataset, ichikara-instruction-format, Daring-Anteater, FLAN | 大規模言語モデル研究開発センター | LLM-jp-3 172B beta1 Terms of Use |
 - LLM-jp-3 172B alpha - alpha1](https://huggingface.co/llm-jp/llm-jp-3-172b-alpha1), [**172b**-alpha1-instruct](https://huggingface.co/llm-jp/llm-jp-3-172b-alpha1-instruct), [**172b**-alpha2](https://huggingface.co/llm-jp/llm-jp-3-172b-alpha2), [**172b**-alpha2-instruct](https://huggingface.co/llm-jp/llm-jp-3-172b-alpha2-instruct)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3)の一部 (alpha1: 計 **0.7T** トークン, alpha2: 計 **1.4T** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), Dolly Dataset, OASST1, OASST2, Aya Dataset, ichikara-instruction-format, Daring-Anteater, FLAN | 大規模言語モデル研究開発センター | Apache 2.0 |
 - LLM-jp-3 13B - jp/llm-jp-3-1.8b), [**1.8b**-instruct](https://huggingface.co/llm-jp/llm-jp-3-1.8b-instruct), [**3.7b**](https://huggingface.co/llm-jp/llm-jp-3-3.7b), [**3.7b**-instruct](https://huggingface.co/llm-jp/llm-jp-3-3.7b-instruct), [**13b**](https://huggingface.co/llm-jp/llm-jp-3-13b), [**13b**-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct)) | 4,096 | 事前学習: [llm-jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3) (計 **2.1T** トークン) Instruction Tuning: [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/), [answer-carefully](https://liat-aip.sakura.ne.jp/wp/answercarefully-dataset/), FLAN, ichikara-instruction-format, AutoMultiTurnByCalm3-22B, ramdom-to-fixed-multiturn-Calm3, wizardlm8x22b-logical-math-coding-sft_additional-ja, Synthetic-JP-EN-Coding-Dataset-567k | 大規模言語モデル研究開発センター | Apache 2.0 |
 - Sarashina2.1-1B - 1b)) | 8,192 | Web 上などの日本語・英語データ（計 **10T** トークン） | SB Intuitions | Sarashina Model NonCommercial License |
 - llm-jp-3-3.7b-instruct-EZO - instruct-EZO-Common](https://huggingface.co/AXCXEPT/llm-jp-3-3.7b-instruct-EZO-Common), [**3.7b**-instruct-EZO-Humanities](https://huggingface.co/AXCXEPT/llm-jp-3-3.7b-instruct-EZO-Humanities)) | 4,096 | LLM-jp-3 (3.7B) に対して追加学習 | Axcxept | Apache 2.0 |
 - Stockmark-100b - 100b), [**100b**-instruct-v0.1](https://huggingface.co/stockmark/stockmark-100b-instruct-v0.1)) | 4,096 | 事前学習: RedPajama, 日本語 Wikipedia, Japanese mC4, Japanese CommonCrawl, 日本語特許, Stockmark Web Corpus (計 **910B** トークン) Instruction Tuning (LoRA): [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) | ストックマーク | MIT |
 - LLM-jp-13B v1.1 - instruct-lora-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1](https://huggingface.co/llm-jp/llm-jp-13b-instruct-lora-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1), [**13b**-instruct-full-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1](https://huggingface.co/llm-jp/llm-jp-13b-instruct-full-dolly_en-dolly_ja-ichikara_003_001-oasst_en-oasst_ja-v1.1), [**13b**-dpo-lora-hh_rlhf_ja-v1.1](https://huggingface.co/llm-jp/llm-jp-13b-dpo-lora-hh_rlhf_ja-v1.1)) | 2,048 | Instruction Tuning (LoRA or Full-parameter FT): Dolly Dataset, OASST1, [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) DPO (LoRA): HH RLHF | LLM-jp | Apache 2.0 |
 - Stockmark-13b - 13b), [**13b**-instruct](https://huggingface.co/stockmark/stockmark-13b-instruct)) | 2,048 | 事前学習: 日本語 Wikipedia、Japanese CC-100、Japanese mC4、Japanese CommonCrawl、日本語特許、Stockmark Web Corpus (計 **220B** トークン) Instruction Tuning (LoRA): [ichikara-instruction](https://liat-aip.sakura.ne.jp/wp/llm%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%A9%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%83%87%E3%83%BC%E3%82%BF%E4%BD%9C%E6%88%90/) | ストックマーク | baseモデル: MIT instructモデル: CC BY-NC-SA 4.0 |
 - Japanese StableLM Alpha - NeoX ([base-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b), [instruct-alpha-**7b**](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b), [instruct-alpha-**7b**-v2](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b-v2)) | 2,048 | Wikipedia, Japanese CC-100, Japanese mC4, Japanese OSCAR, RedPajama (+ 独自のデータセット)[^2] (計 **750B** トークン) \*instruct モデルでは Alpaca Dataset, Dolly Dataset, HH RLHF, llm-japanese-datasetのwikinews subsetでファインチューニング (v2では商用利用不可の Alpaca Dataset を除外) | Stability AI | baseモデル: Apache 2.0 instruct モデル (v1): [独自のライセンス](https://huggingface.co/stabilityai/japanese-stablelm-instruct-alpha-7b/tree/main) instruct モデル (v2): Apache 2.0 |
 - レトリバT5 - jp/t5-small-short), [small (medium)](https://huggingface.co/retrieva-jp/t5-small-medium), [small (long)](https://huggingface.co/retrieva-jp/t5-small-long), [base (short)](https://huggingface.co/retrieva-jp/t5-base-short), [base (medium)](https://huggingface.co/retrieva-jp/t5-base-medium), [base (long)](https://huggingface.co/retrieva-jp/t5-base-long), [large (short)](https://huggingface.co/retrieva-jp/t5-large-short), [large (medium)](https://huggingface.co/retrieva-jp/t5-large-medium), [large (long)](https://huggingface.co/retrieva-jp/t5-large-long), [**xl(3b)**](https://huggingface.co/retrieva-jp/t5-xl)) | | 日本語 Wikipedia + Japanese mC4 | レトリバ | CC BY-SA 4.0 |
 - colorfulscoop GPT - small-ja)) | | 日本語 Wikipedia | Colorful Scoop | CC BY-SA 3.0 |
 - 東工大GPT - lab/japanese-gpt2-medium-unidic), [medium (逆方向)](https://huggingface.co/okazaki-lab/japanese-reversed-gpt2-medium-unidic)) [^3] | | 日本語 Wikipedia + Japanese CC-100 | 東工大岡崎研 | CC BY-SA 4.0 |
 - 日本語対話Transformer - dialog-transformers/blob/main/LICENSE.md) |
 - CyberAgentLM3 (CALM3) - chat](https://huggingface.co/cyberagent/calm3-22b-chat)) | **16,384** | 不明 (計 **2.0T** トークン) | サイバーエージェント | Apache 2.0 |
 - CyberAgentLM2 (CALM2) - 7b), [**7b**-chat](https://huggingface.co/cyberagent/calm2-7b-chat), [**7b**-chat-dpo-experimental](https://huggingface.co/cyberagent/calm2-7b-chat-dpo-experimental)) | base: 4,096 chat: **32,768** |一般公開されている日本語・英語のデータセット（詳細不明） (計 **1.3T** トークン) *dpo モデルは Chatbot Arena Conversations JA (calm2) Dataset を用いて DPO で学習 | サイバーエージェント | Apache 2.0 (dpo モデルのみ CC BY 4.0) |
- APIとして提供されているモデル
 - tsuzumi - 7b](https://learn.microsoft.com/ja-jp/azure/ai-studio/how-to/deploy-models-tsuzumi)) | | NTT | Azure AI Foundry |
 - LHTM-OPT
 - Solar mini chat ja - 1-mini-chat-ja](https://developers.upstage.ai/docs/apis/chat)) | 32,768 | Upstage | 独自 |
 - AIのべりすと
 - LHTM-OPT
 - Solar mini chat ja - 1-mini-chat-ja](https://developers.upstage.ai/docs/apis/chat)) | 32,768 | Upstage | 独自 |
- 複数のLLMをマージして作成されたモデル
 - EQUES/MedLLama3-JP-v2 - 8B, MMed-Llama 3 8B, **Llama 3 ELYZA JP 8B** | EQUES | Llama 3 Community License |
 - EvoLLM-JP - 7B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-7B), [v1-10B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B)) | **Shisa Gamma 7B (v1)**, WizardMath-7B-V1.1, Abel 7B 002 | Sakana AI | MICROSOFT RESEARCH LICENSE |
- 海外モデルに日本語で追加事前学習を行ったモデル（継続事前学習モデル）
 - KARAKURI LM 8x7B - chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-chat-v0.1)) | Mixtral-8x7B-Instruct-v0.1 (**46.7b**) | Swallow-MX 8x7B に対して SteerLM: OASST2, HelpSteer, 独自のデータセット | カラクリ | Apache 2.0 |
 - Llama 3 Youko 8B - 3-youko-8b)) | Llama 3 (**8b**) | 事前学習: Wikipedia, Japanese C4, Japanese CC-100, Japanese OSCAR, The Pile, 独自のデータセット (計 **22B** トークン) | rinna | Llama 3 Community License |
 - Swallow-MS 7B - v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1), [7b-instruct-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-instruct-v0.1)) | Mistral-7B-v0.1 (**7b**) | 事前学習: Algebraic Stack, Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile Instruction Tuning: Dolly Dataset, OASST1 | TokyoTech-LLM | Apache 2.0 |
 - Swallow 7B - hf](https://huggingface.co/tokyotech-llm/Swallow-7b-hf), [7b-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-hf), [7b-instruct-v0.1](https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-v0.1), [7b-NVE-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-NVE-hf), [7b-NVE-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-NVE-instruct-hf), [7b-plus-hf](https://huggingface.co/tokyotech-llm/Swallow-7b-plus-hf)) | Llama 2 (**7b**) | 事前学習: 日本語 Wikipedia, RefinedWeb, Swallow Corpus, The Pile Instruction Tuning: Dolly Dataset, HH RLHF, OASST1 *v0.1モデルでは OASST1, OASST2 を使用 | TokyoTech-LLM | Llama 2 Community License |
- 英語モデルに日本語で追加事前学習を行ったモデル（継続事前学習モデル）
 - RakutenAI-7B - 7B), [7B-instruct](https://huggingface.co/Rakuten/RakutenAI-7B-instruct), [7B-chat](https://huggingface.co/Rakuten/RakutenAI-7B-chat)) | Mistral-7B-v0.1 (**7b**) | 事前学習: 不明 Instruction Tuning: Dolly Dataset, OASST1, （jasterと同様に）言語理解データセットの訓練データを Instruction Tuning 用に変換したもの, 独自のデータセット | 楽天 | Apache 2.0 |
- 海外モデルに日本語で指示チューニング (Instruction Tuning) のみ行ったモデル
 - JMedLoRA - jmedlora-6.89ep](https://huggingface.co/AIgroup-CVM-utokyohospital/llama2-jmedlora-6.89ep)) | 医療 | Llama 2 (**70b**) | 東京大学医学部附属病院循環器内科 AIグループ | CC BY-NC 4.0 |
入力テキストの処理に主に使うモデル
- 汎用
 - シナモンELECTRA - small-japanese-discriminator) |
 - GLOBIS DeBERTaV3 - 100, Japanese mC4, Japanese OSCAR | グロービス | CC BY-SA 4.0 | ◯ ([xsmall](https://huggingface.co/globis-university/deberta-v3-japanese-xsmall), [base](https://huggingface.co/globis-university/deberta-v3-japanese-base), [large](https://huggingface.co/globis-university/deberta-v3-japanese-large)) |
 - Laboro BERT - NC 4.0 | ✕ |
 - Laboro DistilBERT - （Laboro BERT(base) を親モデルとして知識蒸留）| Laboro.AI | CC BY-NC 4.0 | [◯](https://huggingface.co/laboro-ai/distilbert-base-japanese) |
 - RetrievaBERT - jp/bert-1.3b) |
 - 東北大BERT - 100 約3億9,200万文 (74.3GB) | 東北大 自然言語処理研究グループ | base (v1, v2) & large: CC BY-SA 3.0 base (v3) & large (v2): Apache 2.0 |◯ ([base (v1)](https://huggingface.co/tohoku-nlp/bert-base-japanese-whole-word-masking), [base (v1, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-whole-word-masking), [base (v2)](https://huggingface.co/tohoku-nlp/bert-base-japanese-v2), [base (v2, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-v2), [large](https://huggingface.co/tohoku-nlp/bert-large-japanese), [large (文字レベル)](https://huggingface.co/tohoku-nlp/bert-large-japanese-char), [base (v3)](https://huggingface.co/tohoku-nlp/bert-base-japanese-v3), [base (v3, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-v3), [large (v2)](https://huggingface.co/tohoku-nlp/bert-large-japanese-v2), [large (v2, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-large-japanese-char-v2)) |
 - 日本語LayoutLM - SA 3.0 | [◯](https://huggingface.co/jri-advtechlab/layoutlm-wikipedia-ja) |
 - 東北大BERT - 100 約3億9,200万文 (74.3GB) | 東北大 自然言語処理研究グループ | base (v1, v2) & large: CC BY-SA 3.0 base (v3) & large (v2): Apache 2.0 |◯ ([base (v1)](https://huggingface.co/tohoku-nlp/bert-base-japanese-whole-word-masking), [base (v1, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-whole-word-masking), [base (v2)](https://huggingface.co/tohoku-nlp/bert-base-japanese-v2), [base (v2, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-v2), [large](https://huggingface.co/tohoku-nlp/bert-large-japanese), [large (文字レベル)](https://huggingface.co/tohoku-nlp/bert-large-japanese-char), [base (v3)](https://huggingface.co/tohoku-nlp/bert-base-japanese-v3), [base (v3, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-base-japanese-char-v3), [large (v2)](https://huggingface.co/tohoku-nlp/bert-large-japanese-v2), [large (v2, 文字レベル)](https://huggingface.co/tohoku-nlp/bert-large-japanese-char-v2)) |
 - TohokuNLP BERT-alpha 500M - jp-corpus-v3](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v3) の日本語サブセット | 東北大 自然言語処理研究グループ | Apache 2.0 | ◯ ([sq4096-alpha](https://huggingface.co/tohoku-nlp/tohokunlp-bert-500m-sq4096-alpha), [sq8192-alpha](https://huggingface.co/tohoku-nlp/tohokunlp-bert-500m-sq8192-alpha)) |
 - 京大BERT
 - NICT BERT
 - colorfulscoop BERT - SA 3.0 | [◯](https://huggingface.co/colorfulscoop/bert-base-ja) |
 - chiTra (Sudachi Transformers)
 - ACCMS BERT - SA 4.0 | [◯](https://huggingface.co/ku-accms/bert-base-japanese-ssuw) |
 - 日立BERT - 100 | 日立製作所 | CC BY-NC-SA 4.0 | [◯](https://huggingface.co/hitachi-nlp/bert-base-japanese_jumanpp-bpe) [^6] |
 - Bandai Namco DistilBERT - （東北大BERT(base) を親モデルとして知識蒸留） | Bandai Namco Research | MIT | [◯](https://huggingface.co/bandainamco-mirai/distilbert-base-japanese) |
 - LINE DistilBERT - （LINE社内のBERTを親モデルとして知識蒸留）| LINE | Apache 2.0 | [◯](https://huggingface.co/line-corporation/line-distilbert-base-japanese) |
 - rinna RoBERTa - 100 | rinna | MIT | [◯](https://huggingface.co/rinna/japanese-roberta-base) |
 - 早大RoBERTa - 100 | 早大河原研 | CC BY-SA 4.0 | ◯ ([base](https://huggingface.co/nlp-waseda/roberta-base-japanese-with-auto-jumanpp), [large](https://huggingface.co/nlp-waseda/roberta-large-japanese-with-auto-jumanpp), [large (seq512)](https://huggingface.co/nlp-waseda/roberta-large-japanese-seq512-with-auto-jumanpp)) [^7] |
 - インフォマティクスRoBERTa
 - 京大RoBERTa - 100 | 京大言語メディア研究室 | CC BY-SA 4.0 | ◯ ([base (文字レベル)](https://huggingface.co/ku-nlp/roberta-base-japanese-char-wwm), [large (文字レベル)](https://huggingface.co/ku-nlp/roberta-large-japanese-char-wwm)) |
 - 横浜国大RoBERTa - base-janpanese) |
 - Megagon Labs RoBERTa - long-japanese) |
 - ACCMS RoBERTa - 100 (70GB) | 京大 ACCMS | CC BY-SA 4.0 | [◯](https://huggingface.co/ku-accms/roberta-base-japanese-ssuw) |
 - Megagon Labs ELECTRA - base-japanese-discriminator) |
 - 日本語RoFormer - base-japanese) |
 - 日本語LUKE - ousia/luke-japanese-base-lite), [large](https://huggingface.co/studio-ousia/luke-japanese-large-lite)) |
 - 京大DeBERTaV2 - 100 + Japanese OSCAR （計171GB） | 京大言語メディア研究室 | CC BY-SA 4.0 | ◯ ([tiny](https://huggingface.co/ku-nlp/deberta-v2-tiny-japanese), [tiny (文字レベル)](https://huggingface.co/ku-nlp/deberta-v2-tiny-japanese-char-wwm), [base](https://huggingface.co/ku-nlp/deberta-v2-base-japanese), [large](https://huggingface.co/ku-nlp/deberta-v2-large-japanese)) |
 - 京大DeBERTaV3 - jp-corpus](https://github.com/llm-jp/llm-jp-corpus) | 京大言語メディア研究室 | Apache 2.0 | [◯](https://huggingface.co/ku-nlp/deberta-v3-base-japanese) |
 - 日本語BigBird - 100 + Japanese OSCAR | 早大河原研 | CC BY-SA 4.0 | [◯](https://huggingface.co/nlp-waseda/bigbird-base-japanese) |
 - 東大DeBERTaV2 - 100, Japanese mC4, Japanese OSCAR | 東大和泉研 | CC BY-SA 4.0 | ◯ ([small](https://huggingface.co/izumi-lab/deberta-v2-small-japanese), [base](https://huggingface.co/izumi-lab/deberta-v2-base-japanese)) |
- ドメイン特化型
 - UTH-BERT - NC-SA 4.0 | △ |
 - みんぱくBERT - v1](https://huggingface.co/ohshimalab/bert-base-minpaku-v1), [minpaku-v3](https://huggingface.co/ohshimalab/bert-base-minpaku-v3), [minpaku-v3-no-additional-token](https://huggingface.co/ohshimalab/bert-base-minpaku-v3-no-additional-token)) |
 - medBERTjp - NC-SA 4.0 | △ |
 - AcademicRoBERTa
 - 日本語ニュースBERT
 - 日本語ニュースXLNet - japanese) |
 - 日本語ニュースALBERT
 - Laboro BERT - NC 4.0 | ✕ |
 - Laboro DistilBERT - （Laboro BERT(base) を親モデルとして知識蒸留）| Laboro.AI | CC BY-NC 4.0 | [◯](https://huggingface.co/laboro-ai/distilbert-base-japanese) |
 - 日本語ブログELECTRA - SA 4.0 | [◯](https://huggingface.co/ptaszynski/yacis-electra-small-japanese) |
 - 日本語話し言葉BERT - jp/japanese-spoken-language-bert) |
 - JLBert
 - JMedRoBERTa - NC-SA 4.0 | ◯ ([万病WordPiece](https://huggingface.co/alabnii/jmedroberta-base-manbyo-wordpiece), [SentencePiece](https://huggingface.co/alabnii/jmedroberta-base-sentencepiece)) [^10] |
 - local-politics-BERT - SA 4.0 | ◯ ([SC-min](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-scratch), [SC-minwiki](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-wikipedia-scratch), [SC-2M-wiki](https://huggingface.co/local-politics-jp/bert-base-japanese-wikipedia-scratch-2m), [SC-2M-min](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-scratch-2m), [SC-2M-minwiki](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-wikipedia-scratch-2m), [FP-min](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-further), [FP-minwiki](https://huggingface.co/local-politics-jp/bert-base-japanese-minutes-wikipedia-further)) [^18] |
埋め込み (Embeddings) 作成に特化したモデル
- ドメイン特化型
 - cl-nagoya/unsup-simcse-ja-base - nagoya/unsup-simcse-ja-large](https://huggingface.co/cl-nagoya/unsup-simcse-ja-large) [cl-nagoya/sup-simcse-ja-base](https://huggingface.co/cl-nagoya/sup-simcse-ja-base) [cl-nagoya/sup-simcse-ja-large](https://huggingface.co/cl-nagoya/sup-simcse-ja-large) | SimCSE | 名大武田・笹野研 | CC BY-SA 4.0 |
 - pkshatech/GLuCoSE-base-ja
 - Japanese SimCSE - nagoya/unsup-simcse-ja-base](https://huggingface.co/cl-nagoya/unsup-simcse-ja-base), [cl-nagoya/unsup-simcse-ja-large](https://huggingface.co/cl-nagoya/unsup-simcse-ja-large), [cl-nagoya/sup-simcse-ja-base](https://huggingface.co/cl-nagoya/sup-simcse-ja-base), [cl-nagoya/sup-simcse-ja-large](https://huggingface.co/cl-nagoya/sup-simcse-ja-large)) | SimCSE | 名大武田・笹野研 | CC BY-SA 4.0 |
 - cl-nagoya/shioriha-large-pt
 - JaColBERT
 - JaColBERTv2.5
 - JaColBERTv2
 - bclavie/fio-base-japanese-v0.1
 - GLuCoSE - base-ja](https://huggingface.co/pkshatech/GLuCoSE-base-ja)) | LUKEベースの文埋め込みモデル (GLuCoSE) | PKSHA Technology | Apache 2.0 |
 - JaColBERT
 - colorfulscoop/sbert-base-ja - BERT | Colorful Scoop | CC BY-SA 4.0 |
 - MU-Kindai/SBERT-JSNLI-base - Kindai/SBERT-JSNLI-large](https://huggingface.co/MU-Kindai/SBERT-JSNLI-large) | Sentence-BERT | 近畿大学 (研究室不明) | ？ |
 - MU-Kindai/Japanese-SimCSE-BERT-base-unsup - Kindai/Japanese-SimCSE-BERT-large-unsup](https://huggingface.co/MU-Kindai/Japanese-SimCSE-BERT-large-unsup) [MU-Kindai/Japanese-SimCSE-RoBERTa-base-unsup](https://huggingface.co/MU-Kindai/Japanese-SimCSE-RoBERTa-base-unsup) [MU-Kindai/Japanese-SimCSE-BERT-base-sup](https://huggingface.co/MU-Kindai/Japanese-SimCSE-BERT-base-sup) [MU-Kindai/Japanese-SimCSE-BERT-large-sup](https://huggingface.co/MU-Kindai/Japanese-SimCSE-BERT-large-sup) | SimCSE | 近畿大学 (研究室不明) | MIT |
 - pkshatech/simcse-ja-bert-base-clcmlp - SA 4.0 |
 - MU-Kindai/Japanese-MixCSE-BERT-base - Kindai/Japanese-MixCSE-BERT-large](https://huggingface.co/MU-Kindai/Japanese-MixCSE-BERT-large) | MixCSE | 近畿大学 (研究室不明) | MIT |
 - MU-Kindai/Japanese-DiffCSE-BERT-base
日本語LLM評価ベンチマーク/データセットまとめ
- 複合型ベンチマーク
 - Nejumi LLMリーダーボード Neo
 - Nejumi LLMリーダーボード Neo - jp-eval](#llm-jp-eval) とプロンプト対話で生成能力を評価する [Japanese MT-bench](#jp-mt-bench) による総合評価の結果をまとめている。 | Weights & Biases |
 - Nejumi LLMリーダーボード3
 - 日本語LLM評価 - evaluation](https://github.com/swallow-llm/swallow-evaluation) を合わせて公開している。 | Swallow Project |
- 基礎的な自然言語理解 (NLU) を中心に測定するベンチマーク/データセット
 - llm-jp-eval リーダーボード - jp)
 - [rinna
 - 日本語 Open LLM Leaderboard - jp)
 - Nejumi LLMリーダーボード
 - llm-jp-eval - jp/llm-jp-eval/tree/main/src/llm_jp_eval/jaster)から確認できる（この中には JNLI や JCommonsenseQA といった JGLUE のタスクなども含まれている）。 評価結果は [llm-jp-eval リーダーボード](http://wandb.me/llm-jp-leaderboard) にまとめられている。 | LLM-jp |
 - JMMLU
 - JGLUE - ja, JCoLA, JSTS, JNLI, JSQuAD, JCommonsenseQA の 6 つのタスクを含む（[JCoLA](https://github.com/osekilab/JCoLA) は東大大関研により作成）。各タスクの詳細は[こちら](https://www.jstage.jst.go.jp/article/jnlp/30/1/30_63/_article/-char/ja)や[こちら](https://techblog.yahoo.co.jp/entry/2022122030379907/)を参照 | 早大河原研, ヤフー |
 - こちら
 - こちら
 - JP Language Model Evaluation Harness - evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) のフォーク。複数のデータセットを横断して日本語 LLM を自動評価するツールである。 対応している全データセット一覧は[こちら](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable/lm_eval/tasks/ja)から確認できる（この中には JNLI や JCommonsenseQA といった JGLUE のタスクなども含まれている）。 rinna による詳細な評価結果まとめがある: [[rinna] Benchmark of Stability-AI/lm-evaluation-harness](https://rinnakk.github.io/research/benchmarks/lm/) | Stability AI |
 - GLUE ベンチマーク - ja, JCoLA, JSTS, JNLI, JSQuAD, JCommonsenseQA の 6 つのタスクを含む（[JCoLA](https://github.com/osekilab/JCoLA) は東大大関研により作成）。各タスクの詳細は[こちら](https://www.jstage.jst.go.jp/article/jnlp/30/1/30_63/_article/-char/ja)や[こちら](https://techblog.yahoo.co.jp/entry/2022122030379907/)を参照
 - Open LLM Leaderboard
- 特定ドメインの性能を測定するベンチマーク/データセット
 - こちら
 - Japanese Language Model Financial Evaluation Harness - 4.pdf)を参照 | Preferred Networks |
 - pfmt-bench-fin-ja
 - JMED-LLM
 - こちら
 - Stockmark Business Questions
 - karakuri-bench
- 人間らしい応答の生成能力を中心に測定するベンチマーク/データセット
 - Rakuda Benchmark - questions)に対してモデルに出力を行わせる。GPT-4 が同じ質問に対する2つのモデルの出力を比べ、どちらの答えが優れているかを判断することにより、モデルのランク付けを行う。 | YuzuAI |
 - Japanese Vicuna QA Benchmark - Bench の前身である [vicuna-blog-eval](https://github.com/lm-sys/vicuna-blog-eval) の日本語版。一般、知識、ロールプレイ、常識、フェルミ推定、反実仮想、コーディング、数学、ライティングに関する 80 問の質問を収録している。また、GPT-4 による自動評価（勝率計算）のスクリプトも含まれている。リーダーボードは[こちら](http://wandb.me/llm-jp-vicunaleaderboard) | 京大言語メディア研究室 |
 - Shaberi - bench](#jp-mt-bench)、[Rakuda Benchmark](#rakuda-benchmark)、[ELYZA-tasks-100](#elyza-tasks)、[Tengu-Bench](#tengu-bench) の評価をまとめて行うことができるフレームワーク。なお、Shisa.AI による[フォーク](https://github.com/shisa-ai/shaberi)も存在する | Lightblue |
 - Tengu-Bench
 - Japanese MT-bench - bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) の日本語版。Writing, Roleplay, Reasoning, Math, Coding, Extraction, STEM, Humanities の 8 つのカテゴリから 10 問ずつ、計 80 問が収録されている。なお、日本語版作成の際には、日本の文化に合うように質問内容に一部修正が加えられている。 GPT-4 による 10 段階の絶対評価を行うスクリプトも含まれている。 | Stability AI |
 - MT-bench - 4 による 10 段階の絶対評価を行うスクリプトも含まれている。
 - 40問の自由質問 - 4 が同じ質問に対する2つのモデルの出力を比べ、どちらの答えが優れているかを判断することにより、モデルのランク付けを行う。
 - ELYZA-tasks-100
 - こちら
- 制約付きの生成能力を測定するベンチマーク/データセット
 - LCTG Bench
- 論理推論能力を測定するベンチマーク/データセット
 - JFLD (Japanese Formal Logic Deduction) - nlp/FLD) の日本語版）。LLM が持つ知識と切り分けて評価を行うために、反実仮想的なサンプルから構成されているのが特徴である。 | 日立製作所 |
 - JFLD (Japanese Formal Logic Deduction)
 - JHumanEval
 - HumanEval
 - JHumanEval
- 視覚言語モデル (Vision-Language Models) のベンチマーク/データセット
 - JA-Multi-Image-VQA
 - Japanese-Heron-Bench
 - LLaVA-Bench-In-the-Wild
 - JA-VLM-Bench-In-the-Wild - JP-v1-7B の評価のために独自に用意したデータセット。42 枚の画像に対して計 50 問の質問が割り当てられている。日本に関する知識を要求する画像・質問になっているのが特徴である。 | Sakana AI |
 - LLaVA-Bench (COCO) Japanese - Bench (COCO) データセットを DeepL で日本語に訳したもの。30 枚の画像に対して各 3 種類の質問が割り当てられている。 | Turing |
 - LLaVA-Bench-In-the-Wild (Japanese) - Bench-In-the-Wild](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild) を DeepL で日本語に訳したもの。24 枚の画像に対して計 60 問の質問が割り当てられている。 | Turing |
 - Heron VLM リーダーボード powered by nejumi@WandB
- 埋め込みモデルのベンチマーク/データセット
 - JMTEB
 - JQaRA
 - JaCWIR
 - JMTEB - benchmark/mteb)の日本語版として作成されたベンチマーク。 文書クラスタリング、文書分類、文間類似度、文ペアラベル予測、文書抽出の5種類のタスクから構成されている（その後、リランキングタスクが新たに追加）。 | SB Intuitions |
- 事実性・安全性を測定するベンチマーク/データセット
 - JTruthfulQA
 - JCommonsenseMorality
 - JBBQ - mll/BBQ) を、日本の文化・慣習を踏まえて翻訳、修正、問題追加を行い作成されたデータセット。 | 東大谷中研 |
引用
- 特定ドメインの性能を測定するベンチマーク/データセット
  - ^5 - cho.work/) の運営も行っている
  - ^11
  - ^13
- 視覚言語モデル (Vision-Language Models) のベンチマーク/データセット
  - ^19
  - ^15
  - Exploring Open Large Language Models for the Japanese Language: A Practical Guide
  - ^1
  - ^18 - 2M-wiki モデルは Wikipedia でのみ事前学習されているため、厳密にはドメイン特化型モデルではない。
各モデル・アーキテクチャの原論文
- 視覚言語モデル (Vision-Language Models) のベンチマーク/データセット
視覚言語モデル (Vision-Language Models)
- 画像+テキストからのテキスト生成
 - AXCXEPT/EZO-InternVL2-26B - | 　Axcxept | MIT |
 - AXCXEPT/Llama-3-EZO-VLM-1 - 3-EvoVLM-JP-v2 | - | Axcxept | Llama 3 Community License |
 - Japanese InstructBLIP Alpha - instructblip-alpha](https://huggingface.co/stabilityai/japanese-instructblip-alpha)) | InstructBLIP | Japanese CC12M, STAIR Captions, Japanese Visual Genome VQA dataset | Stability AI | JAPANESE STABLELM RESEARCH LICENSE |
 - Llama-3-EvoVLM-JP-v2 - 3-EvoVLM-JP-v2)) | - | - （Mantis-8B-SigLIP-Llama-3、Llama-3-ELYZA-JP-8B、Bunny-v1.1-Llama-3-8B-V をマージ） | Sakana AI | Llama 3 Community License |
 - Japanese Stable VLM - stable-vlm](https://huggingface.co/stabilityai/japanese-stable-vlm)) | LLaVA-1.5 | Japanese CC12M, STAIR Captions, Japanese Visual Genome VQA dataset | Stability AI | STABILITY AI JAPANESE STABLE VLM COMMUNITY LICENSE |
 - watashiha/Watashiha-Llama-2-13B-Ogiri-sft-vlm
 - llava-calm2-siglip - calm2-siglip](https://huggingface.co/cyberagent/llava-calm2-siglip)) | LLaVA-1.5 | MS-COCO と VisualGenome から生成された対話データ | サイバーエージェント | Apache 2.0 |
 - Heron - ja-stablelm-base-7b-v0](https://huggingface.co/turing-motors/heron-chat-blip-ja-stablelm-base-7b-v0), [blip-ja-stablelm-base-7b-v1](https://huggingface.co/turing-motors/heron-chat-blip-ja-stablelm-base-7b-v1), [blip-ja-stablelm-base-7b-v1-llava-620k](https://huggingface.co/turing-motors/heron-chat-blip-ja-stablelm-base-7b-v1-llava-620k), [git-ja-stablelm-base-7b-v0](https://huggingface.co/turing-motors/heron-chat-git-ja-stablelm-base-7b-v0), [git-ELYZA-fast-7b-v0](https://huggingface.co/turing-motors/heron-chat-git-ELYZA-fast-7b-v0), [git-ja-stablelm-base-7b-v1](https://huggingface.co/turing-motors/heron-chat-git-ja-stablelm-base-7b-v1)) | BLIP-2 または GIT | v1: LLaVA-Instruct-150K-JA または LLaVA-Instruct-620K-JA v0: LLaVA-Instruct-150K-JA, Japanese STAIR Captions, Japanese Visual Genome VQA dataset | Turing | CC BY-NC 4.0 |
- テキストからの画像生成
 - EvoSDXL-JP - JP-v1)) | Stable Diffusion | - （Japanese Stable Diffusion XL を含む複数の画像生成モデルをマージ） | Sakana AI | Apache 2.0[^14] |
 - Evo-Ukiyoe - Ukiyoe-v1)) | Stable Diffusion | 浮世絵 | Sakana AI | Apache 2.0[^14] |
 - CommonArt β - beta](https://huggingface.co/aipicasso/commonart-beta)) | PixArt-Σ | CommonCatalog-cc-by, Megalith-10M, Smithonian Open Access, ArtBench (CC-0 only) | AI Picasso | Apache 2.0 |
 - Japanese Stable Diffusion XL - stable-diffusion-xl](https://huggingface.co/stabilityai/japanese-stable-diffusion-xl)) | Stable Diffusion | 不明 | Stability AI | STABILITY AI JAPANESE STABLE DIFFUSION XL COMMUNITY LICENSE |
 - 東北大Stable Diffusion - nlp/stable-diffusion-xl-jp-base-1.0), [refiner](https://huggingface.co/tohoku-nlp/stable-diffusion-xl-jp-refiner-1.0)) | Stable Diffusion | WMT2023 Shared Task の日英対訳コーパス、laion2B-multi のキャプション約 1,300 万件 | 東北大 自然言語処理研究グループ | CreativeML OpenRAIL-M License |
 - rinna Stable Diffusion - stable-diffusion](https://huggingface.co/rinna/japanese-stable-diffusion)) | Stable Diffusion | LAION-5B データセットのうちキャプションが日本語のもの（画像約 1 億枚）| rinna | CreativeML OpenRAIL-M License |
- その他
 - リクルートCLIP - clip-vit-b-32-roberta-base](https://huggingface.co/recruit-jp/japanese-clip-vit-b-32-roberta-base)) | CLIP | laion2B-multi のキャプション約1億2000万件 | リクルート | CC BY-4.0 |
 - 博報堂テクノロジーズCLIP - tech/japanese-clip-vit-h-14-bert-base), [deeper](https://huggingface.co/hakuhodo-tech/japanese-clip-vit-h-14-bert-deeper), [wider](https://huggingface.co/hakuhodo-tech/japanese-clip-vit-h-14-bert-wider)) | CLIP | laion2B-multi のキャプション約1億2000万件 | 博報堂テクノロジーズ | CC BY-NC-SA 4.0 |
 - LINEヤフーCLIP - japanese-base](https://huggingface.co/line-corporation/clip-japanese-base)) | CLIP | CommonCrawl, CC12M, YFCC100M | LINEヤフー | Apache 2.0 |
 - Japanese Stable CLIP - stable-clip-vit-l-16](https://huggingface.co/stabilityai/japanese-stable-clip-vit-l-16)) | SigLIP | CC12M のキャプションを日本語に翻訳したもの、STAIR Captions | Stability AI | STABILITY AI JAPANESE STABLE CLIP COMMUNITY LICENSE |
 - rinna CLOOB - cloob-vit-b-16](https://huggingface.co/rinna/japanese-cloob-vit-b-16)) | CLOOB | CC12M のキャプションを日本語に翻訳したもの | rinna | Apache 2.0 |
LLMの学習手法の原論文
- 視覚言語モデル (Vision-Language Models) のベンチマーク/データセット
- 特定ドメインの性能を測定するベンチマーク/データセット
  - SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
音声言語モデル (Speech-Language Models)
- 音声認識
 - Kotoba-Whisper - tech/kotoba-whisper-v1.0), [v1.0-ggml](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0-ggml), [v1.0-faster](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0-faster), [v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1)) | Distil-Whisper | ReazonSpeech | Kotoba Technologies | Apache 2.0 |
 - Nue ASR - asr](https://huggingface.co/rinna/nue-asr)) | Nue ASR (HuBERT + LLM) | ReazonSpeech | rinna | Apache 2.0 |
 - ReazonSpeech - v1](https://huggingface.co/reazon-research/reazonspeech-espnet-v1), [espnet-next](https://huggingface.co/reazon-research/reazonspeech-espnet-next), [espnet-v2](https://huggingface.co/reazon-research/reazonspeech-espnet-v2), [nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2)) | ESPnet (Conformer-Transducer) または NeMo (FastConformer-RNNT) | ReazonSpeech | レアゾン・ホールディングス | Apache 2.0 |
- その他
 - Kotoba-Speech - tech/kotoba-speech-v0.1)) | Transformer | 不明 | Kotoba Technologies | Apache 2.0 |
 - 東大HuBERT - jtube](https://huggingface.co/sarulab-speech/hubert-base-jtube)) | HuBERT | JTubeSpeech | 東大猿渡・高道研 | MIT |
 - rinna HuBERT - hubert-base), [large](https://huggingface.co/rinna/japanese-hubert-large)) | HuBERT | ReazonSpeech | rinna | Apache 2.0 |

Programming Languages

Python 15 Jupyter Notebook 3 Shell 1 HTML 1

Ecosyste.ms: Awesome

awesome-japanese-llm

テキスト生成に主に使うモデル

フルスクラッチ事前学習モデル

海外モデルに日本語で継続事前学習を行ったモデル

海外モデルに日本語で事後学習のみ行ったモデル

フルスクラッチ学習モデル

APIとして提供されているモデル

複数のLLMをマージして作成されたモデル

海外モデルに日本語で追加事前学習を行ったモデル（継続事前学習モデル）

英語モデルに日本語で追加事前学習を行ったモデル（継続事前学習モデル）

海外モデルに日本語で指示チューニング (Instruction Tuning) のみ行ったモデル

入力テキストの処理に主に使うモデル

汎用

ドメイン特化型

埋め込み (Embeddings) 作成に特化したモデル

ドメイン特化型

日本語LLM評価ベンチマーク/データセットまとめ

複合型ベンチマーク

基礎的な自然言語理解 (NLU) を中心に測定するベンチマーク/データセット

特定ドメインの性能を測定するベンチマーク/データセット

人間らしい応答の生成能力を中心に測定するベンチマーク/データセット

制約付きの生成能力を測定するベンチマーク/データセット

論理推論能力を測定するベンチマーク/データセット

視覚言語モデル (Vision-Language Models) のベンチマーク/データセット

埋め込みモデルのベンチマーク/データセット

事実性・安全性を測定するベンチマーク/データセット

引用

特定ドメインの性能を測定するベンチマーク/データセット

視覚言語モデル (Vision-Language Models) のベンチマーク/データセット

各モデル・アーキテクチャの原論文

視覚言語モデル (Vision-Language Models) のベンチマーク/データセット

視覚言語モデル (Vision-Language Models)

画像+テキストからのテキスト生成

テキストからの画像生成

その他

LLMの学習手法の原論文

視覚言語モデル (Vision-Language Models) のベンチマーク/データセット

特定ドメインの性能を測定するベンチマーク/データセット

音声言語モデル (Speech-Language Models)

音声認識

その他