https://github.com/agora-lab-ai/bitnet-a4.8
BitNet a4.8 implementation in a single PyTorch file
- Host: GitHub
- URL: https://github.com/agora-lab-ai/bitnet-a4.8
- Owner: Agora-Lab-AI
- License: MIT
- Created: 2024-11-11T11:55:27.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-13T20:25:37.000Z (11 months ago)
- Last Synced: 2025-03-27T05:25:10.231Z (9 months ago)
- Language: Python
- Size: 22.5 KB
- Stars: 13
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# BitNet a4.8: 4-bit Activations for 1-bit LLMs
[Discord](https://discord.gg/agora-999382051935506503) · [YouTube](https://www.youtube.com/@kyegomez3242) · [LinkedIn](https://www.linkedin.com/in/kye-g-38759a207/) · [X](https://x.com/kyegomezb)

[MIT License](https://opensource.org/licenses/MIT) · [Python](https://www.python.org/downloads/) · [PyTorch](https://pytorch.org/) · [Agora](https://agoralab.xyz)
This repository contains an unofficial PyTorch implementation of [BitNet a4.8: 4-bit Activations for 1-bit LLMs](https://arxiv.org/abs/2411.04965) (Wang et al., 2024).
## Paper Summary
BitNet a4.8 is a groundbreaking approach that enables 4-bit activations for 1-bit Large Language Models (LLMs). The method employs a hybrid quantization and sparsification strategy to mitigate quantization errors from outlier channels while maintaining model performance.
Key features:
- 4-bit quantization for attention and FFN inputs
- 8-bit quantization with sparsification for intermediate states
- Only 55% of parameters activated during inference
- Support for 3-bit KV cache
- Comparable performance to BitNet b1.58 with better inference efficiency
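To make the hybrid scheme above concrete, here is a minimal sketch of absmax fake quantization with a 4-bit grid for attention/FFN inputs and sparsified 8-bit quantization for intermediate states. The function names, keep ratio, and quantization grids are illustrative assumptions, not this repository's API:
```python
import torch

def quantize_4bit(x: torch.Tensor) -> torch.Tensor:
    # Per-token absmax fake quantization to a signed 4-bit grid [-8, 7]
    # (quantize, then dequantize back to float for further computation).
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / 7.0
    return (x / scale).round().clamp(-8, 7) * scale

def quantize_8bit_sparse(x: torch.Tensor, keep_ratio: float = 0.55) -> torch.Tensor:
    # Keep only the largest-magnitude entries (TopK sparsification),
    # then fake-quantize the survivors to a signed 8-bit grid [-128, 127].
    k = max(1, int(keep_ratio * x.shape[-1]))
    mask = torch.zeros_like(x).scatter_(-1, x.abs().topk(k, dim=-1).indices, 1.0)
    x = x * mask
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / 127.0
    return (x / scale).round().clamp(-128, 127) * scale
```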
## Implementation
This implementation provides a `create_model` factory for building the full model:
```python
# Create a BitNet a4.8 model
model = create_model(
    hidden_size=4096,
    intermediate_size=11008,
    num_hidden_layers=32,
    num_attention_heads=32
)
```
Key components:
- RMSNorm for layer normalization
- 4-bit and 8-bit quantizers
- TopK sparsification
- BitLinear (1.58-bit weights; see the sketch after this list)
- Hybrid attention mechanism
- Gated FFN with ReLU²
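As a rough illustration of the 1.58-bit weight path, the layer below rounds weights to the ternary set {-1, 0, +1} with a per-tensor absmean scale and trains through the rounding with a straight-through estimator. It is a simplified stand-in, not the BitLinear class defined in this repository:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Simplified BitLinear-style layer with 1.58-bit (ternary) weights."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-5)            # absmean scale
        w_q = (w / scale).round().clamp(-1, 1) * scale    # ternary weights
        # Straight-through estimator: quantized forward, full-precision gradients.
        w_q = w + (w_q - w).detach()
        return F.linear(x, w_q)
```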
## Installation
```bash
git clone https://github.com/yourusername/bitnet-a48
cd bitnet-a48
pip install -r requirements.txt
```
## Join the Agora Community
This implementation is part of the [Agora](https://agoralab.xyz) initiative, where researchers and developers collaborate to implement cutting-edge ML papers. By joining Agora, you can:
- Collaborate with others on paper implementations
- Get early access to new research implementations
- Share your expertise and learn from others
- Contribute to open-source ML research
**[Join Agora Today](https://agoralab.xyz)**
## Results
The implementation achieves performance comparable to BitNet b1.58 while enabling:
- 4-bit activation compression
- 45% parameter sparsity
- Reduced inference costs
- 3-bit KV cache support
## Usage
```python
from bitnet_a48 import create_model
# Initialize model
model = create_model(
    hidden_size=4096,
    intermediate_size=11008,
    num_hidden_layers=32,
    num_attention_heads=32
)
# Forward pass
outputs = model(input_ids, attention_mask)
```
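For a quick smoke test of the forward pass, dummy inputs can be used; the vocabulary size and tensor shapes below are placeholder assumptions and should match the actual model configuration:
```python
import torch

vocab_size = 32000  # placeholder; use the model's real vocabulary size
input_ids = torch.randint(0, vocab_size, (2, 128))        # (batch, seq_len)
attention_mask = torch.ones(2, 128, dtype=torch.long)

with torch.no_grad():
    outputs = model(input_ids, attention_mask)
```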
## Training
The model uses a two-stage training recipe, sketched below:
1. Train with 8-bit activations and ReLU²GLU
2. Fine-tune with hybrid quantization and sparsification
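A hedged sketch of how such a schedule could be wired up is shown below; `set_activation_quant` is a hypothetical helper standing in for whatever mechanism the code uses to switch quantizers between stages, and the model is assumed to return next-token logits of shape `(batch, seq_len, vocab)`:
```python
import torch
import torch.nn.functional as F

def set_activation_quant(model, bits: int, sparsify: bool) -> None:
    # Hypothetical hook: reconfigure activation quantizers on each layer.
    for module in model.modules():
        if hasattr(module, "activation_bits"):
            module.activation_bits = bits
        if hasattr(module, "sparsify"):
            module.sparsify = sparsify

def train_two_stage(model, data_loader, steps_stage1, steps_stage2):
    optimizer = torch.optim.AdamW(model.parameters(), lr=1.5e-4)

    def run(num_steps):
        for _, batch in zip(range(num_steps), data_loader):
            logits = model(batch["input_ids"], batch["attention_mask"])
            # Next-token prediction loss.
            loss = F.cross_entropy(
                logits[:, :-1].reshape(-1, logits.size(-1)),
                batch["input_ids"][:, 1:].reshape(-1),
            )
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    set_activation_quant(model, bits=8, sparsify=False)  # Stage 1: 8-bit activations
    run(steps_stage1)
    set_activation_quant(model, bits=4, sparsify=True)   # Stage 2: hybrid 4-bit + sparsification
    run(steps_stage2)
```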
## Contributing
We welcome contributions! Please:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request
Join the discussion on the [Agora Discord](https://agoralab.xyz/discord)!
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgements
- Original paper authors: Hongyu Wang, Shuming Ma, Furu Wei
- The Agora community
- PyTorch team
- Open-source ML community
## Citation
```bibtex
@article{wang2024bitnet,
  title={BitNet a4.8: 4-bit Activations for 1-bit LLMs},
  author={Wang, Hongyu and Ma, Shuming and Wei, Furu},
  journal={arXiv preprint arXiv:2411.04965},
  year={2024}
}
```
## Links
- [Original Paper](https://arxiv.org/abs/2411.04965)
- [Agora Platform](https://agoralab.xyz)
- [Implementation Details](docs/IMPLEMENTATION.md)
- [Contributing Guide](docs/CONTRIBUTING.md)
Join us in implementing more cutting-edge ML research at [Agora](https://agoralab.xyz)!