https://github.com/tomsanbear/bitnet-rs

Implementing the BitNet model in Rust

Host: GitHub
URL: https://github.com/tomsanbear/bitnet-rs
Owner: tomsanbear
License: mit
Created: 2024-02-29T20:02:10.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-04-18T19:14:40.000Z (over 1 year ago)
Last Synced: 2025-04-22T17:44:08.954Z (6 months ago)
Topics: bitnet, candle, llm, rust
Language: Rust
Homepage:
Size: 155 KB
Stars: 31
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# bitnet-rs: Bitnet Transformer in Rust!

An implementation of the BitNet transformer using [Candle](https://github.com/huggingface/candle), based on the PyTorch implementation at [kyegomez/BitNet](https://github.com/kyegomez/BitNet).

## About

I started this project to better understand what goes into building a transformer model from scratch in an ML library. Rather than re-implement an existing model, I wanted to try a lesser-known model that had not yet been implemented. I'm also curious about non-PyTorch models as a way to push performance, so learning to use Candle was a big part of this!

## Building

### CPU

`cargo build --release`

### Metal

`cargo build --release --features "metal,accelerate"`

### CUDA

`cargo build --release --features "cuda"`

## Training

First, build the binary according to the instructions above, then run the command below.

`./target/release/bitnet-rs train --dataset ""`

Replace the empty `""` value with the directory location of the dataset you are training from. The dataset must be precompiled (pretokenized) ahead of time. I recommend using the same dataset that was used for validation: [karpathy/llama2.c](https://github.com/karpathy/llama2.c?tab=readme-ov-file#training). Please follow the instructions in that repository for generating the pretokenized dataset.

For example, on my machine the training command is this: `./target/release/bitnet-rs train --dataset "../../karpathy/llama2.c/data/TinyStories_all_data"`.
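To give a rough idea of what a pretokenized dataset contains, here is a minimal sketch in plain Rust. It assumes the llama2.c pretokenization step writes each shard as a flat stream of unsigned 16-bit token ids in little-endian byte order, and it shows the usual next-token pairing where the target is the input shifted by one position. The function names `decode_tokens` and `next_token_pairs` are hypothetical and are not taken from bitnet-rs; the actual loader in this repository may work differently.

```rust
/// Decode a raw byte buffer into token ids.
/// Assumption: tokens are stored as little-endian u16 values.
/// A trailing odd byte, if any, is ignored.
fn decode_tokens(bytes: &[u8]) -> Vec<u16> {
    bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect()
}

/// Split a token stream into non-overlapping (input, target) pairs.
/// Each input has `seq_len` tokens and each target is the same window
/// shifted forward by one token. Leftover tokens that do not fill a
/// whole window are dropped.
fn next_token_pairs(tokens: &[u16], seq_len: usize) -> Vec<(Vec<u16>, Vec<u16>)> {
    if seq_len == 0 {
        return Vec::new();
    }
    tokens
        .windows(seq_len + 1) // seq_len inputs plus one extra token for the shifted target
        .step_by(seq_len) // advance by seq_len so the inputs do not overlap
        .map(|w| (w[..seq_len].to_vec(), w[1..].to_vec()))
        .collect()
}
```

The bytes for `decode_tokens` could come from `std::fs::read` on one of the shard files in the dataset directory, after which `next_token_pairs` yields training examples of a chosen sequence length.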

## Inference

First, build the binary according to the instructions above, then run the command below.

`./target/release/bitnet-rs inference`

To supply a prompt, pass the `--prompt` flag.

`./target/release/bitnet-rs inference --prompt "Once upon a time "`

To use a specific model for inference, pass the `--pretrained-model-path` flag.

`./target/release/bitnet-rs inference --pretrained-model-path "./checkpoint.safetensors"`

## Known Issues

I'm still testing this out, but I am getting semi-coherent output from the models I've trained. It is definitely not useful for any task right now, at least until I can get the loss down.

## Contributing

If you're interested in contributing, please feel free! I'm still learning and would appreciate any suggestions.
