awesome-latest-LLM
A curated list of the latest LLMs.
https://github.com/stardust-coder/awesome-latest-LLM
- Gemma
- Japanese Mamba 2.8B
- QWen
- Ricoh's 13B model
- Swallow
- Mixtral-8x7B
- Article raising concerns about the training data of Japanese LLMs
| Model | Size | License | Base / training data | Misc |
|---|---|---|---|---|
| Phi-3 (Microsoft) | 3.8B, 13B | MIT | Phi-3 datasets | |
| Llama 3 (Meta) | 70B | [META LLAMA3](https://llama.meta.com/llama3/license/) | | [extended to 120B](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct) |
| Mixtral-8x22B (Mistral) | 8x22B | apache-2.0 | | MoE |
| Command-R+ (Cohere) | 104B | non-commercial | | RAG capability |
| Grok-1 | | | | |
| BTX (Meta) | | | | |
| Command-R (Cohere) | 35B | non-commercial | | RAG capability |
| Aya (Cohere) | 13B | apache-2.0 | | multilingual |
| Gemma (Google) | | | | |
| Miqu | 70B | none | | leaked from Mistral |
| Reka Flash | | | | |
| LongNet (Microsoft) | - | apache-2.0 | [MAGNETO](https://arxiv.org/pdf/2210.06423.pdf) | input 1B tokens |
| gigaGPT (Cerebras) | | apache-2.0 | | |
| Mixtral-8x7B (Mistral) | 8x7B | apache-2.0 | | MoE, [offloading](https://github.com/dvmazur/mixtral-offloading) |
| Mamba | 2.8B | apache-2.0 | based on a state space model | |
| QWen (Alibaba) | 72B | [license](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) | 3T tokens | beats Llama2 |
| Self-RAG | 13B | apache-2.0 | | critic model |
| TinyLlama | 1.1B | apache-2.0 | based on Llama, 3T tokens | |
| Xwin-LM | 70B | Llama2 | based on Llama2 | also code and math |
| Llama2 (Meta) | 70B | Llama2 | 2T tokens | chat-hf seems the best |
| Amber | | apache-2.0 | based on Llama | fully open |
| Phi-1.5 (Microsoft) | 1.3B | MSRA license | textbooks | |
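Several entries above (Mixtral-8x7B, Mixtral-8x22B) are Mixture-of-Experts (MoE) models, which route each token through only a few of many expert sub-networks, as in Mixtral's top-2-of-8 gating. A minimal sketch in plain Python — the expert functions and gate weights are toy stand-ins for illustration, not any model's real parameters:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_layer(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts by gate score and
    mix their outputs, weighted by the renormalized scores."""
    # Gate scores: one logit per expert (here a simple dot product).
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    # Keep only the top_k experts (Mixtral uses top-2 of 8).
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)
        out = [o + (probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out

# Toy example: 4 "experts", each a fixed elementwise scaling.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.2], [0.3, 0.8]]
print(moe_layer([1.0, 1.0], experts, gate_weights, top_k=2))
```

Only the selected experts run per token, which is why an 8x7B MoE has far fewer active parameters per forward pass than its total size suggests.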
| Model | Size | License | Base / training data | Misc |
|---|---|---|---|---|
| Llama-3-ELYZA-JP-8B | 8B | Llama3 | Llama3 | 70B not open |
| KARAKURI LM 8x7B | 8x7B | apache-2.0 | | MoE |
| KARAKURI | 70B | cc-by-sa-4.0 | based on Llama2-70B | [note](https://note.com/ngc_shj/n/n46ced665b378?sub_rt=share_h) |
| ELYZA-japanese-Llama-2-13b | 13B | | based on Llama-2-13b-chat | |
| Swallow (Tokyo Tech) | 70B | | based on Llama2-70B | |
| StableLM (StabilityAI) | 70B | | based on Llama2-70B | |
| LLM-jp | 13B | | | DPO-tuned variant available |
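The LLM-jp entry above notes a DPO-tuned variant. For reference, Direct Preference Optimization trains the policy directly on preference pairs (preferred response $y_w$, rejected response $y_l$) against a frozen reference model, with the standard objective:

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
  \left[\log \sigma\!\left(
    \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right)\right]
```

Here $\pi_\theta$ is the model being tuned, $\pi_{\mathrm{ref}}$ the frozen reference, $\sigma$ the sigmoid, and $\beta$ a temperature controlling how far the policy may drift from the reference.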
- awesome-japanese-llm
| Model | Size | License | Base model | Training data | Evaluation | Misc |
|---|---|---|---|---|---|---|
| MMed-Llama3-8B (Shanghai Jiao Tong Univ.) | 8B | cc-by-sa | Llama3 | | | |
| Meditron (EPFL) | 70B | Llama2 | Llama2 | GAP-Replay (48.1B) | [dataset](img/meditron-testdata.png), [score](img/meditron-eval2.png) | |
| Med-Gemini (Google) | - | | Gemini | | | multimodal |
| Hippocrates | | | | | | |
| AdaptLLM (Microsoft Research) | 7B, 13B | | | reading-comprehension corpora | | ICLR 2024 |
| Apollo | ~7B | | | | | multilingual |
| BiMediX | 8x7B | non-commercial | Mixtral-8x7B | | | MoE |
| Health-LLM (Rutgers et al.) | | | | | | |
| BioMistral | | | | | | |
| AMIE (Google) | - | - | based on PaLM 2 | | | EHR |
| JMedLoRA (UTokyo) | 70B | none | none | QLoRA | IgakuQA | Japanese, insufficient quality |
| BioMedGPT (Luo et al.) | | | | | | |
| PMC-LLaMA | | | | | | |
| Med-Flamingo | | | | | | |
| LLaVA-Med (Microsoft) | 13B | - | LLaVA | medical dataset | VQA-RAD, SLAKE, PathVQA | multimodal |
| Med-PaLM M (Google) | - | | PaLM2 | | | multimodal |
| Almanac (Stanford) | | | text-davinci-003 | | | RAG |
| Med-PaLM 2 (Google) | - | | PaLM2 | | | |
| Med-PaLM (Google) | - | | PaLM | | | |
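Several entries above (Command-R, Almanac) highlight RAG: retrieve relevant passages first, then let the model answer with them in context. A minimal illustrative sketch — word-overlap scoring stands in for the embedding-based retrieval a real system would use, and the documents are invented examples:

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (a stand-in
    for the vector similarity search a real RAG system would use)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Prepend retrieved passages so the LLM can ground its answer."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Aspirin inhibits platelet aggregation.",
    "Type 2 diabetes is associated with insulin resistance.",
]
print(build_prompt("What is the first-line treatment for type 2 diabetes?", docs))
```

The final prompt would then go to any of the models above; grounding answers in retrieved sources is what makes RAG attractive in medical settings, where unsupported generations are costly.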
- See also: Awesome-Healthcare-Foundation-Models and [MedLLMsPracticalGuide](https://github.com/AI-in-Health/MedLLMsPracticalGuide).
- How should the performance of medical-domain LLMs be evaluated?
- MIRAGE Leaderboard
- Japanese Medical Language Model Evaluation Harness
- Open Medical LLM leaderboard
- MMedBench
- MedQA (USMLE)
- MedMCQA
- PubMedQA
- PubHealth
- MMLU
- JMMLU - Japanese-translated version of MMLU
- IgakuQA(Japanese National Medical License Exam)
- J-ResearchCorpus
- Apollo Corpus JP
- MedICaT
- VQA-RAD
- MedVTE
- MedAlign(Stanford)
- MIMIC-IV-ECG - ECG caption dataset
- ECG-QA
- Clinical NLP 2023
- He et al. (2023)
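Most of the benchmarks listed above (MedQA, MedMCQA, MMLU, JMMLU, IgakuQA) are multiple-choice QA sets scored by exact-match accuracy over the predicted answer letter. A minimal sketch of that metric — the gold labels and model outputs below are toy values for illustration:

```python
def accuracy(predictions, answers):
    """Exact-match accuracy over multiple-choice answer letters,
    ignoring case and surrounding whitespace."""
    assert len(predictions) == len(answers)
    correct = sum(p.strip().upper() == a.strip().upper()
                  for p, a in zip(predictions, answers))
    return correct / len(answers)

# Toy MedQA-style items: one gold letter (A-D) per question.
gold = ["A", "C", "B", "D"]
model_output = ["a", "C", "D", "D"]  # e.g. letters parsed from generations
print(accuracy(model_output, gold))  # → 0.75
```

In practice the hard part is parsing a free-form generation down to a single option letter; harnesses such as the leaderboards above fix that extraction step so scores are comparable across models.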