Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Sailor: Open Language Models for South-East Asia
- Host: GitHub
- URL: https://github.com/sail-sg/sailor-llm
- Owner: sail-sg
- License: MIT
- Created: 2024-02-27T06:59:21.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-07-11T02:59:30.000Z (5 months ago)
- Last Synced: 2024-07-11T04:19:50.774Z (5 months ago)
- Topics: indonesia, language-model, lao, malay, sea, thai, vietnam
- Language: Python
- Homepage: https://sailorllm.github.io
- Size: 2.06 MB
- Stars: 83
- Watchers: 7
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - sail-sg/sailor-llm
README
# Sailor: Open Language Models for South-East Asia
[![Homepage](https://img.shields.io/badge/🏠-Homepage-3C47EB.svg)](https://sailorllm.github.io/) [![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-E87948.svg)](https://huggingface.co/sail/Sailor-7B) [![Technical Report](https://img.shields.io/badge/arXiv-2404.03608-b31b1b.svg)](https://arxiv.org/pdf/2404.03608.pdf) [![SailCraft](https://img.shields.io/badge/🚢-SailCraft-4F94CD.svg)](https://github.com/sail-sg/sailcraft)
This repository contains the evaluation code for Sailor, a suite of open language models for South-East Asia. Sailor is developed by the [Sea AI Lab](https://sail.sea.com/) and the [Singapore University of Technology and Design](https://istd.sutd.edu.sg/people/faculty/lu-wei/).
## Introduction
Sailor is a suite of open language models tailored for South-East Asia (SEA), focusing on languages such as 🇮🇩Indonesian, 🇹🇭Thai, 🇻🇳Vietnamese, 🇲🇾Malay, and 🇱🇦Lao. Developed with careful data curation, Sailor models are designed to understand and generate text across the diverse linguistic landscapes of the SEA region. Built from Qwen 1.5, Sailor encompasses models of varying sizes, from 0.5B to 14B, for different requirements. Benchmarking results demonstrate Sailor's proficiency in SEA-language tasks such as question answering, commonsense reasoning, and reading comprehension.
- Continually pretrained on 200 billion to 400 billion tokens over 7 languages: Indonesian, Thai, Vietnamese, Malay, Lao, English, and Chinese.
- Various model sizes (0.5B, 1.8B, 4B, 7B and 14B) to support different requirements.
- Strong performance on SEA benchmarks such as XQuAD, TydiQA, XCOPA, Belebele and M3Exam.
- No restrictions on research or commercial use, though usage must comply with the Qwen 1.5 license.

To learn more, please read the [technical report](https://arxiv.org/pdf/2404.03608.pdf).
## Models
You can find all the Sailor models in our Hugging Face collection [here](https://huggingface.co/collections/sail/sailor-language-models-65e19a749f978976f1959825); a minimal loading sketch follows the list:
- [Sailor-0.5B](https://huggingface.co/sail/Sailor-0.5B)
- [Sailor-1.8B](https://huggingface.co/sail/Sailor-1.8B)
- [Sailor-4B](https://huggingface.co/sail/Sailor-4B)
- [Sailor-7B](https://huggingface.co/sail/Sailor-7B)
- [Sailor-14B](https://huggingface.co/sail/Sailor-14B)
- [Sailor-0.5B-Chat](https://huggingface.co/sail/Sailor-0.5B-Chat)
- [Sailor-1.8B-Chat](https://huggingface.co/sail/Sailor-1.8B-Chat)
- [Sailor-4B-Chat](https://huggingface.co/sail/Sailor-4B-Chat)
- [Sailor-7B-Chat](https://huggingface.co/sail/Sailor-7B-Chat)
- [Sailor-14B-Chat](https://huggingface.co/sail/Sailor-14B-Chat)
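As a quick start, here is a minimal, hedged sketch of loading one of the base models above with 🤗 Transformers. It is not taken from this repository; the prompt and generation settings are illustrative only.

```python
# Hedged sketch: loading a Sailor base model with Hugging Face Transformers.
# The model ID comes from the list above; everything else is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sail/Sailor-0.5B"  # any base model from the list above
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires `accelerate`; drop it for default placement
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Indonesian prompt: "The capital of Indonesia is"
inputs = tokenizer("Ibu kota Indonesia adalah", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```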
## Evaluation
Here are the results of evaluating the models on question answering tasks. Results are presented as a table whose first column is the model name and whose remaining columns report performance on Thai (th), Indonesian (id), and Vietnamese (vi), respectively; scores are exact match / F1 (a metric sketch follows the table). The results of Sailor models are highlighted in bold. You can find the full results on the different tasks, along with our evaluation code to reproduce them, in the [eval](eval) directory.
### Question Answering
| 3-shot (EM / F1) | XQuAD (th) | TydiQA (id) | XQuAD (vi) |
|:---|:---:|:---:|:---:|
| Qwen1.5-0.5B | 14.19 / 23.35 | 20.71 / 32.64 | 19.85 / 35.38 |
| **Sailor-0.5B** | **15.84 / 27.58** | **30.44 / 54.74** | **21.13 / 40.57** |
| Qwen1.5-1.8B | 27.24 / 43.56 | 29.73 / 53.76 | 29.17 / 48.15 |
| **Sailor-1.8B** | **32.72 / 48.66** | **40.88 / 65.37** | **34.22 / 53.35** |
| Qwen1.5-4B | 34.03 / 53.40 | 48.32 / 72.68 | 43.71 / 63.86 |
| **Sailor-4B** | **46.82 / 63.34** | **53.98 / 73.48** | **47.65 / 67.09** |
| Llama-2-7b | 30.64 / 43.80 | 56.64 / 72.14 | 46.96 / 66.16 |
| Mistral-7B-v0.1 | 48.48 / 63.27 | 63.54 / 78.73 | 53.72 / 72.75 |
| SeaLLM-7b-Hybrid | 49.70 / 67.62 | 50.62 / 75.21 | 49.62 / 70.74 |
| SeaLLM-7b-v2 | 34.55 / 55.13 | 52.21 / 77.00 | 46.19 / 72.11 |
| Qwen1.5-7B | 53.79 / 69.30 | 57.17 / 77.28 | 56.63 / 76.99 |
| **Sailor-7B** | **57.88 / 71.06** | **60.53 / 75.42** | **53.81 / 74.62** |
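For reference, EM and F1 above are the standard SQuAD-style question-answering metrics. The sketch below is a hedged, whitespace-tokenized approximation; the repository's actual implementation lives in `eval/icl_sailor_evaluator.py` and likely applies extra normalization (the setup installs `pythainlp`, presumably for Thai word segmentation).

```python
# Hedged sketch of SQuAD-style EM / F1 (assumption, not the repo's exact code).
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized prediction equals the normalized reference."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between prediction and reference."""
    pred_tokens = prediction.strip().lower().split()
    ref_tokens = reference.strip().lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)  # per-token overlap
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("ibu kota indonesia adalah jakarta", "jakarta"))  # ~0.333
```

Whitespace tokenization is a poor fit for Thai, which is written without spaces; this is why a language-aware tokenizer matters in practice.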
### Setup
We use [OpenCompass](https://github.com/open-compass/opencompass) to evaluate the models. To install the required packages, run the following commands from this folder:
```bash
# setup opencompass environment
conda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
conda activate opencompass
git clone https://github.com/open-compass/opencompass opencompass
cd opencompass
pip install -e .
pip install pythainlp langid
mkdir data
```

### Build Evaluation Script
To build the evaluation script, run the following commands from this folder:
```bash
cp -r eval/configs/* opencompass/configs/
cp -r eval/data/* opencompass/data/
cp -r eval/datasets/* opencompass/opencompass/datasets/
cp eval/icl_sailor_evaluator.py opencompass/opencompass/openicl/icl_evaluator/
cp eval/sailor_text_postprocessors.py opencompass/opencompass/utils/
echo "from .icl_sailor_evaluator import AnsEvaluator, TextGenEvaluator # noqa" >> "opencompass/opencompass/openicl/icl_evaluator/__init__.py"
echo "from .sailor_text_postprocessors import * # noqa" >> "opencompass/opencompass/utils/__init__.py"
echo "from .xquad import * # noqa: F401, F403" >> "opencompass/opencompass/datasets/__init__.py"
echo "from .tydiqa_id import * # noqa: F401, F403" >> "opencompass/opencompass/datasets/__init__.py"
echo "from .xcopa_sea import * # noqa: F401, F403" >> "opencompass/opencompass/datasets/__init__.py"
echo "from .m3exam import * # noqa: F401, F403" >> "opencompass/opencompass/datasets/__init__.py"
echo "from .belebele import * # noqa: F401, F403" >> "opencompass/opencompass/datasets/__init__.py"
cp eval/eval_sailor.py opencompass/configs/
```

### Run Evaluation
To run the evaluation, run the following commands from this folder:
```bash
cd opencompass
python run.py configs/eval_sailor.py -w outputs/sailor --hf-num-gpus 1 --max-num-workers 64
```

You can also modify the script to evaluate other models such as Qwen1.5, Llama, or Mistral; a sketch of what an added model entry might look like is shown below.
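For illustration only: OpenCompass model entries typically follow the `HuggingFaceCausalLM` pattern below. The exact fields in `configs/eval_sailor.py` may differ, so treat this as a hedged sketch of adding another Hugging Face model rather than the repository's actual configuration.

```python
# Hedged sketch: an extra model entry in the usual OpenCompass config style.
# `HuggingFaceCausalLM` is OpenCompass's generic HF causal-LM wrapper;
# all field values below are illustrative assumptions.
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr="qwen1.5-7b-hf",              # short name shown in result tables
        path="Qwen/Qwen1.5-7B",            # Hugging Face model ID
        tokenizer_path="Qwen/Qwen1.5-7B",
        max_seq_len=2048,                  # illustrative context limit
        max_out_len=100,                   # illustrative generation budget
        batch_size=8,
        run_cfg=dict(num_gpus=1, num_procs=1),
    ),
]
```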
## Demo
We provide a simple [demo](https://huggingface.co/spaces/sail/Sailor-14B-Chat) for chatting with [Sailor-14B-Chat](https://huggingface.co/sail/Sailor-14B-Chat).
You can also run the demo yourself using the provided [demo code](https://github.com/sail-sg/sailor-llm/tree/main/demo); a minimal chat sketch is shown below.
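To give a rough idea of what the demo does, here is a hedged sketch of chatting with a Sailor chat model through 🤗 Transformers, assuming the chat checkpoints ship a chat template; the prompt and generation settings are illustrative, and the authors' actual setup is in the demo code.

```python
# Hedged sketch: chatting with a Sailor chat model via the tokenizer's
# chat template. All settings below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sail/Sailor-7B-Chat"  # or sail/Sailor-14B-Chat as in the demo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Thai: "What is the capital of Thailand?"
messages = [{"role": "user", "content": "เมืองหลวงของประเทศไทยคืออะไร"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```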
## Citing this work
If you use this repository or the Sailor models, please cite:
```
@article{dou2024sailor,
  title={Sailor: Open Language Models for South-East Asia},
  author={Dou, Longxu and Liu, Qian and Zeng, Guangtao and Guo, Jia and Zhou, Jiahui and Lu, Wei and Lin, Min},
  journal={arXiv preprint arXiv:2404.03608},
  year={2024}
}
```

## Contact
If you have any questions, please raise an issue on our GitHub or contact us at [email protected] and [email protected].