awesome-latest-LLM
A curated list of the latest LLMs.
https://github.com/stardust-coder/awesome-latest-LLM
- Gemma
- Japanese Mamba 2.8B
- QWen
- Ricoh's 13B model
- Swallow
- Mixtral-8x7B
- Article raising concerns about the training data of Japanese LLMs
| Model | Size | License | Base / training data | Misc |
|---|---|---|---|---|
| Phi-3 (Microsoft) | 3.8B, 13B | MIT | Phi-3 datasets | |
| Llama 3 (Meta) | 70B | [META LLAMA3](https://llama.meta.com/llama3/license/) | | [extended to 120B](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct) |
| Mixtral-8x22B (Mistral) | 8x22B | apache-2.0 | | MoE |
| Command-R+ (Cohere) | 104B | non-commercial | | RAG capability |
| Grok-1 | | | | |
| BTX (Meta) | | | | |
| Command-R (Cohere) | 35B | non-commercial | | RAG capability |
| Aya (Cohere) | 13B | apache-2.0 | | multilingual |
| Gemma (Google) | | | | |
| Miqu | 70B | none | | leaked from Mistral |
| Reka Flash | | | | |
| LongNet (Microsoft) | - | apache-2.0 | [MAGNETO](https://arxiv.org/pdf/2210.06423.pdf) | input 1B tokens |
| gigaGPT (Cerebras) | | apache-2.0 | | |
| Mixtral-8x7B (Mistral) | 8x7B | apache-2.0 | | MoE, [offloading](https://github.com/dvmazur/mixtral-offloading) |
| Mamba | 2.8B | apache-2.0 | based on a state space model | |
| QWen (Alibaba) | 72B | [license](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) | 3T tokens | beats Llama2 |
| Self-RAG | 13B | apache-2.0 | | critic model |
| TinyLlama | 1.1B | apache-2.0 | based on Llama, 3T tokens | |
| Xwin-LM | 70B | Llama2 | based on Llama2 | also code and math |
| Llama2 (Meta) | 70B | Llama2 | 2T tokens | chat-hf seems the best |
| Amber | | apache-2.0 | based on Llama | fully open |
| Phi-1.5 (Microsoft) | 1.3B | MSRA license | textbooks | |
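Several entries above (Mixtral-8x7B, Mixtral-8x22B) are Mixture-of-Experts (MoE) models, which route each token through only a few of many expert sub-networks, as in Mixtral's top-2-of-8 gating. A minimal sketch in plain Python — the expert functions and gate weights are toy stand-ins for illustration, not any model's real parameters:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_layer(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts by gate score and
    mix their outputs, weighted by the renormalized scores."""
    # Gate scores: one logit per expert (here a simple dot product).
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    # Keep only the top_k experts (Mixtral uses top-2 of 8).
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)
        out = [o + (probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out

# Toy example: 4 "experts", each a fixed elementwise scaling.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.2], [0.3, 0.8]]
print(moe_layer([1.0, 1.0], experts, gate_weights, top_k=2))
```

Only the selected experts run per token, which is why an 8x7B MoE has far fewer active parameters per forward pass than its total size suggests.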
| Model | Size | License | Base / training data | Misc |
|---|---|---|---|---|
| Llama-3-ELYZA-JP-8B | 8B | Llama3 | Llama3 | 70B not open |
| KARAKURI LM 8x7B | 8x7B | apache-2.0 | | MoE |
| KARAKURI | 70B | cc-by-sa-4.0 | based on Llama2-70B | [note](https://note.com/ngc_shj/n/n46ced665b378?sub_rt=share_h) |
| ELYZA-japanese-Llama-2-13b | 13B | | based on Llama-2-13b-chat | |
| Swallow (Tokyo Tech) | 70B | | based on Llama2-70B | |
| StableLM (StabilityAI) | 70B | | based on Llama2-70B | |
| LLM-jp | 13B | | | DPO-tuned variant available |
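The LLM-jp entry above notes a DPO-tuned variant. For reference, Direct Preference Optimization trains the policy directly on preference pairs (preferred response $y_w$, rejected response $y_l$) against a frozen reference model, with the standard objective:

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
  \left[\log \sigma\!\left(
    \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right)\right]
```

Here $\pi_\theta$ is the model being tuned, $\pi_{\mathrm{ref}}$ the frozen reference, $\sigma$ the sigmoid, and $\beta$ a temperature controlling how far the policy may drift from the reference.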
- awesome-japanese-llm
| Model | Size | License | Base model | Training data | Evaluation | Misc |
|---|---|---|---|---|---|---|
| MMed-Llama3-8B (Shanghai Jiao Tong Univ.) | 8B | cc-by-sa | Llama3 | | | |
| Meditron (EPFL) | 70B | Llama2 | Llama2 | GAP-Replay (48.1B) | [dataset](img/meditron-testdata.png), [score](img/meditron-eval2.png) | |
| Med-Gemini (Google) | - | | Gemini | | | multimodal |
| Hippocrates | | | | | | |
| AdaptLLM (Microsoft Research) | 7B, 13B | | | reading-comprehension corpora | | ICLR 2024 |
| Apollo | ~7B | | | | | multilingual |
| BiMediX | 8x7B | non-commercial | Mixtral-8x7B | | | MoE |
| Health-LLM (Rutgers et al.) | | | | | | |
| BioMistral | | | | | | |
| AMIE (Google) | - | - | based on PaLM 2 | | | EHR |
| JMedLoRA (UTokyo) | 70B | none | none | QLoRA | IgakuQA | Japanese, insufficient quality |
| BioMedGPT (Luo et al.) | | | | | | |
| PMC-LLaMA | | | | | | |
| Med-Flamingo | | | | | | |
| LLaVA-Med (Microsoft) | 13B | - | LLaVA | medical dataset | VQA-RAD, SLAKE, PathVQA | multimodal |
| Med-PaLM M (Google) | - | | PaLM2 | | | multimodal |
| Almanac (Stanford) | | | text-davinci-003 | | | RAG |
| Med-PaLM 2 (Google) | - | | PaLM2 | | | |
| Med-PaLM (Google) | - | | PaLM | | | |
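Several entries above (Command-R, Almanac) highlight RAG: retrieve relevant passages first, then let the model answer with them in context. A minimal illustrative sketch — word-overlap scoring stands in for the embedding-based retrieval a real system would use, and the documents are invented examples:

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (a stand-in
    for the vector similarity search a real RAG system would use)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Prepend retrieved passages so the LLM can ground its answer."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Aspirin inhibits platelet aggregation.",
    "Type 2 diabetes is associated with insulin resistance.",
]
print(build_prompt("What is the first-line treatment for type 2 diabetes?", docs))
```

The final prompt would then go to any of the models above; grounding answers in retrieved sources is what makes RAG attractive in medical settings, where unsupported generations are costly.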
- See also: Awesome-Healthcare-Foundation-Models and [MedLLMsPracticalGuide](https://github.com/AI-in-Health/MedLLMsPracticalGuide).
- How should the performance of medical-domain LLMs be evaluated?
- MIRAGE Leaderboard
- Japanese Medical Language Model Evaluation Harness
- Open Medical LLM leaderboard
- MMedBench
- MedQA (USMLE)
- MedMCQA
- PubMedQA
- PubHealth
- MMLU
- JMMLU - Japanese-translated version of MMLU
- IgakuQA(Japanese National Medical License Exam)
- J-ResearchCorpus
- Apollo Corpus JP
- MedICaT
- VQA-RAD
- MedVTE
- MedAlign(Stanford)
- MIMIC-IV-ECG - ECG caption dataset
- ECG-QA
- Clinical NLP 2023
- He et al. (2023)
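Most of the benchmarks listed above (MedQA, MedMCQA, MMLU, JMMLU, IgakuQA) are multiple-choice QA sets scored by exact-match accuracy over the predicted answer letter. A minimal sketch of that metric — the gold labels and model outputs below are toy values for illustration:

```python
def accuracy(predictions, answers):
    """Exact-match accuracy over multiple-choice answer letters,
    ignoring case and surrounding whitespace."""
    assert len(predictions) == len(answers)
    correct = sum(p.strip().upper() == a.strip().upper()
                  for p, a in zip(predictions, answers))
    return correct / len(answers)

# Toy MedQA-style items: one gold letter (A-D) per question.
gold = ["A", "C", "B", "D"]
model_output = ["a", "C", "D", "D"]  # e.g. letters parsed from generations
print(accuracy(model_output, gold))  # → 0.75
```

In practice the hard part is parsing a free-form generation down to a single option letter; harnesses such as the leaderboards above fix that extraction step so scores are comparable across models.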