https://github.com/voyager466920/koraptor
🚀 150M-parameter language model with a Latent MoE architecture, trained from scratch on a SINGLE GPU.
- Host: GitHub
- URL: https://github.com/voyager466920/koraptor
- Owner: Voyager466920
- Created: 2025-04-24T11:53:21.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-08-05T02:28:51.000Z (2 months ago)
- Last Synced: 2025-08-05T04:11:33.735Z (2 months ago)
- Topics: mixture-of-experts, moe, small-language-model
- Language: Python
- Homepage: https://youtu.be/USPKsNLCRqE?si=AAiD-9Clo-IJnduv
- Size: 578 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
README
# KoRaptor 150M
### A 150M-parameter model *pretrained / fine-tuned from SCRATCH* on a *SINGLE GPU* (RTX 3090)
## Key Features
- **Language:** Pure Korean
- **Architecture:** LatentMoE (see the illustrative sketch after this list)
- **Parameters:** 150 million
- **Base model:** Voyager466920/KoRaptor
- **Use case:** Conversational AI / Chatbot
- **Dataset:** Korean chatbot dataset from AI Hub
- **License:** Follows the license of the original dataset and model architecture
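
The LatentMoE block itself is not described in this README. As a rough orientation only, the sketch below shows a generic top-k mixture-of-experts feed-forward layer in PyTorch; the "latent" routing (scoring experts on a down-projected copy of the hidden state) is an assumption made for illustration and may differ from KoRaptor's actual implementation.

```python
import torch
import torch.nn as nn


class LatentMoESketch(nn.Module):
    """Illustrative top-k MoE feed-forward layer (not KoRaptor's actual code).

    The "latent" router, which scores experts on a low-dimensional projection
    of the hidden state, is an assumption made for this sketch.
    """

    def __init__(self, d_model=512, d_latent=64, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.to_latent = nn.Linear(d_model, d_latent)   # assumed latent projection
        self.router = nn.Linear(d_latent, n_experts)    # expert scores from the latent
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(self.to_latent(x)).softmax(dim=-1)   # (B, S, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)            # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)     # renormalize the top-k weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out
```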
## Usage
You can easily run inference using the provided `Inference.py` script. No additional setup is required: simply load the model and start chatting in Korean.
This model is incompatible with standard Transformers loading methods (e.g., `AutoModel`). For simple inference, use the following code.

```python
from huggingface_hub import hf_hub_download
import runpy

script_path = hf_hub_download(
    repo_id="Voyager466920/KoRaptor_Chatbot",
    filename="Inference.py",
)
```
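
The snippet above only downloads `Inference.py`; `runpy` is imported but not yet called. A plausible way to execute the downloaded script, continuing from the snippet above (the `run_name="__main__"` argument is an assumption so that any `if __name__ == "__main__":` guard inside the script fires), is:

```python
import runpy

# Execute the downloaded inference script as if it were run directly.
# run_name="__main__" is assumed so the script's entry-point guard, if any, runs.
runpy.run_path(script_path, run_name="__main__")
```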