https://github.com/voyager466920/koraptor

🚀 150M Language Model with Latent MoE architecture using a SINGLE GPU from scratch.
# KoRaptor 150M
### 150M-parameter model *pretrained and fine-tuned from SCRATCH* on a *SINGLE GPU* (RTX 3090)

- Hugging Face Model
- YouTube video
## Key Features
- **Language:** Pure Korean
- **Architecture:** LatentMoE
- **Parameters:** 150 million
- **Base model:** Voyager466920/KoRaptor
- **Use case:** Conversational AI / Chatbot
- **Dataset:** Korean chatbot dataset from AI Hub
- **License:** Follows the license of the original dataset and model architecture
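For readers unfamiliar with mixture-of-experts layers, the sketch below illustrates the general idea behind MoE routing: a gating network scores experts per token, and each token is processed by only its top-k experts. This is a minimal, illustrative example in NumPy; it is *not* the actual KoRaptor/LatentMoE implementation, and all function and variable names here are made up for the illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_layer(x, gate_w, expert_ws, top_k=2):
    """Toy MoE forward pass: route each token to its top-k experts.

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) gating weights
    expert_ws: list of (d_model, d_model) per-expert weight matrices
    """
    probs = softmax(x @ gate_w)                      # (tokens, n_experts)
    top = np.argsort(probs, axis=-1)[:, -top_k:]     # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = probs[t, top[t]]
        weights = weights / weights.sum()            # renormalize over selected experts
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ expert_ws[e])      # weighted mix of expert outputs
    return out

# Tiny demo with random weights
rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
y = moe_layer(rng.normal(size=(tokens, d)),
              rng.normal(size=(d, n_experts)),
              [rng.normal(size=(d, d)) for _ in range(n_experts)])
# y has the same shape as the input: (tokens, d_model)
```

Only the selected experts' matrices are applied per token, which is what lets MoE models grow total parameter count without a proportional increase in per-token compute.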

## Usage
You can run inference with the provided `Inference.py` script; no additional setup is required. Simply load the model and start chatting in Korean.
Note that this model is incompatible with standard Transformers loading methods (e.g., `AutoModel`). For simple inference, use the following code instead.

```python
from huggingface_hub import hf_hub_download
import runpy

# Download the bundled inference script from the Hugging Face Hub
script_path = hf_hub_download(
    repo_id="Voyager466920/KoRaptor_Chatbot",
    filename="Inference.py",
)

# Execute the script, which loads the model and starts the chat loop
runpy.run_path(script_path, run_name="__main__")
```