https://github.com/ai-hypercomputer/kithara
- Host: GitHub
- URL: https://github.com/ai-hypercomputer/kithara
- Owner: AI-Hypercomputer
- License: apache-2.0
- Created: 2025-02-06T22:20:29.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-19T19:14:00.000Z (9 months ago)
- Last Synced: 2025-06-24T05:04:36.640Z (8 months ago)
- Language: Python
- Size: 721 KB
- Stars: 14
- Watchers: 1
- Forks: 6
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Kithara - Easy Finetuning on TPUs
[PyPI](https://pypi.org/project/kithara/)
[Pull Requests](https://github.com/AI-Hypercomputer/kithara/pulls)
[Commits](https://github.com/AI-Hypercomputer/kithara/commits/main)
[Documentation](https://kithara.readthedocs.io/en/latest/)
## 👋 Overview
Kithara is a lightweight library offering building blocks and recipes for tuning popular open-source LLMs, including Gemma2 and Llama3, on Google TPUs.
It provides:
- **Frictionless scaling**: Distributed training abstractions intentionally built with simplicity in mind.
- **Multihost training support**: Integration with Ray, GCE, and GKE.
- **Async, distributed checkpointing**: Multi-host and multi-device checkpointing via Orbax.
- **Distributed, streamed dataloading**: Per-process, streamed data loading via Ray Data.
- **GPU/TPU fungibility**: The same code works on both GPUs and TPUs out of the box.
- **Native integration with HuggingFace**: Tune and save models in HuggingFace format (see the sketch after this list).
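The sketch below shows how these pieces might compose in a minimal SFT + LoRA run. It is illustrative only: the Kithara class and argument names (`KerasHubModel`, `SFTDataset`, `Dataloader`, `Trainer`, `save_in_hf_format`, and their parameters) are assumptions modeled on the quickstart, not a verified API, so consult the documentation linked below for the exact interface.

```python
# Illustrative sketch only -- Kithara names and signatures here are assumptions,
# not confirmed API; see the quickstart docs for the real interface.
import ray
import keras
from datasets import load_dataset
from kithara import KerasHubModel, SFTDataset, Dataloader, Trainer  # assumed imports

# Load a HuggingFace checkpoint as a Keras model with LoRA adapters (assumed preset syntax).
model = KerasHubModel.from_preset("hf://google/gemma-2-2b", lora_rank=16)

# Stream a public HuggingFace dataset through Ray Data and wrap it for supervised finetuning.
hf_ds = load_dataset("databricks/databricks-dolly-15k", split="train")
raw = ray.data.from_huggingface(hf_ds)
train_ds = SFTDataset(raw, tokenizer_handle="hf://google/gemma-2-2b", max_seq_len=4096)

# Per-process, streamed data loading.
dataloader = Dataloader(train_ds, per_device_batch_size=1)

# The trainer is assumed to handle sharding, checkpointing, and the training loop.
trainer = Trainer(
    model=model,
    optimizer=keras.optimizers.AdamW(learning_rate=2e-4),
    train_dataloader=dataloader,
    steps=100,
)
trainer.train()

# Save the tuned model back in HuggingFace format.
model.save_in_hf_format("gs://your-bucket/gemma2-2b-lora")
```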
**New to TPUs?**
TPUs offer strong performance, cost efficiency, and scalability, enabling faster training and larger models and datasets. Check out our onboarding guide to [getting TPUs](https://kithara.readthedocs.io/en/latest/getting_tpus.html).
## 🔗 **Key links and resources**
| Resource                          | Link                                                                                                                          |
| --------------------------------- | --------------------------------------------------------------------------------------------------------------------------- |
| 📚 **Documentation** | [Read Our Docs](https://kithara.readthedocs.io/en/latest/) |
| 💾 **Installation** | [Quick Pip Install](https://kithara.readthedocs.io/en/latest/installation.html) |
| ✏️ **Get Started** | [Intro to Kithara](https://kithara.readthedocs.io/en/latest/quickstart.html) |
| 🌟 **Supported Models** | [List of Models](https://kithara.readthedocs.io/en/latest/models.html) |
| 🌐 **Supported Datasets** | [List of Data Formats](https://kithara.readthedocs.io/en/latest/datasets.html) |
| ⌛️ **Performance Optimizations** | [Our Memory and Throughput Optimizations](https://kithara.readthedocs.io/en/latest/optimizations.html) |
| 📈 **Scaling up** | [Guide for Tuning Large Models](https://kithara.readthedocs.io/en/latest/scaling_with_ray.html) |
## 🌵 **Examples**
- **Quick Start Colab Notebook**: [SFT + LoRA with Gemma2-2b](https://colab.sandbox.google.com/github/AI-Hypercomputer/kithara/blob/main/examples/colab/SFT_with_LoRA_Gemma2-2b.ipynb)
- **SFT + LoRA**: [Step by Step Example](https://kithara.readthedocs.io/en/latest/sft.html)
- **Continued Pretraining**: [Step by Step Example](https://kithara.readthedocs.io/en/latest/pretraining.html)