https://github.com/dhakalnirajan/llama-bitnet
LLaMA-BitNet is a repository dedicated to empowering users to train their own BitNet models built upon LLaMA 2 model, inspired by the groundbreaking paper 'The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits'.
- Host: GitHub
- URL: https://github.com/dhakalnirajan/llama-bitnet
- Owner: dhakalnirajan
- License: mit
- Created: 2024-03-30T08:44:54.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-31T06:05:54.000Z (about 1 year ago)
- Last Synced: 2024-10-18T21:59:08.280Z (7 months ago)
- Topics: large-language-models, llama, llama2, llm, llms, meta, microsoft
- Language: Python
- Homepage: https://arxiv.org/pdf/2402.17764
- Size: 11.7 KB
- Stars: 8
- Watchers: 3
- Forks: 1
- Open Issues: 3
- Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Welcome to LLaMA-BitNet
Welcome to the LLaMA-BitNet repository, where you can dive into the fascinating world of BitNet models. Our repository is your gateway to training your very own BitNet model, as highlighted in the groundbreaking paper [The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits](https://arxiv.org/abs/2402.17764). Built upon the cutting-edge [LLaMA 2](https://llama.meta.com) architecture, this project allows you to unleash the potential of a model wielding approximately 78 million parameters, trained on a staggering corpus of around 1.5 billion tokens.
> Note: You need access to the LLaMA family of models to run the code without modifications. To request access, go to [https://llama.meta.com/llama-downloads/](https://llama.meta.com/llama-downloads/) and provide the same credentials you use on Hugging Face. You will then receive an email letting you either download the weights directly to your device or use LLaMA through the API.
## Easy Installation
Getting started with LLaMA-BitNet is a breeze! Follow these simple steps to install all the necessary modules:
```shell
pip install -r requirements.txt
```

## Intuitive File Structure
Our repository boasts a clear and intuitive file structure designed for effortless navigation and customization:
```
LLaMA-BitNet (root folder)
├── inference.py (Run inference with the trained BitNet model)
├── LICENSE (MIT License)
├── README.md
├── requirements.txt (List of required modules for installation)
├── train.py (Run the training process)
└── utils.py (Contains utility functions)
```

## Empowering Training Data
Harness the power of a 15% subset of the `OpenWebText2` dataset prepared for training. This subset ships pre-tokenized with a context length of 256 so you can start testing immediately. The code also supports manual tokenization, so you can train on datasets of your choice with little extra effort.
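The fixed-length chunking behind a 256-token context can be sketched as follows. This is a minimal illustration, not the repository's exact code; the helper name `group_texts` is hypothetical, and `utils.py` may implement the step differently:

```python
def group_texts(token_ids, block_size=256):
    """Split a flat list of token ids into fixed-length blocks,
    dropping the trailing remainder that doesn't fill a block."""
    total = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, total, block_size)]

# Example: 1000 tokens yield 3 full blocks of 256; the last 232 are dropped.
blocks = group_texts(list(range(1000)))
```

When tokenizing your own dataset, you would apply a function like this over the concatenated tokenizer output (e.g. via `datasets.Dataset.map`) before feeding blocks to the trainer.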
## Streamlined Dependencies
We've curated a set of essential dependencies listed in the `requirements.txt` file, ensuring a seamless installation process:
```text
transformers
datasets
torch
wandb
huggingface_hub
```

## Unleash the Full Potential of BitNet
Our BitNet architecture is engineered for excellence, drawing inspiration from the design laid out in the training details manuscript, [The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf](https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf). By integrating BitLinear into Hugging Face's `LlamaForCausalLM`, we empower you to unlock the true power of BitNet.
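The training-tips manuscript describes BitLinear as a drop-in replacement for `nn.Linear`: weights are quantized to ternary {-1, 0, +1} via absmean scaling, activations to 8 bits via per-token absmax scaling, and gradients flow through a straight-through estimator. A minimal PyTorch sketch of that idea (an illustration under those assumptions, not necessarily the repository's exact implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def weight_quant(w: torch.Tensor) -> torch.Tensor:
    """Absmean quantization: scale by the mean |w|, round to {-1, 0, +1},
    then rescale (the 1.58-bit scheme from the paper)."""
    scale = 1.0 / w.abs().mean().clamp(min=1e-5)
    return (w * scale).round().clamp(-1, 1) / scale

def activation_quant(x: torch.Tensor) -> torch.Tensor:
    """Per-token absmax quantization of activations to 8 bits."""
    scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
    return (x * scale).round().clamp(-128, 127) / scale

class BitLinear(nn.Linear):
    """Drop-in replacement for nn.Linear. The (q - x).detach() trick is a
    straight-through estimator: forward uses quantized values, backward
    sees the full-precision tensors."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        x_q = x + (activation_quant(x) - x).detach()
        w_q = w + (weight_quant(w) - w).detach()
        return F.linear(x_q, w_q, self.bias)
```

In practice, converting a LLaMA 2 model means replacing its `nn.Linear` projection layers with `BitLinear` before training.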
Explore, train, and revolutionize with LLaMA-BitNet!