https://github.com/autodistill/autodistill-paligemma
Use PaliGemma to auto-label data for use in training fine-tuned vision models.
- Host: GitHub
- URL: https://github.com/autodistill/autodistill-paligemma
- Owner: autodistill
- Created: 2024-05-15T08:45:37.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-13T12:51:58.000Z (about 1 year ago)
- Last Synced: 2025-04-14T12:13:06.760Z (2 months ago)
- Topics: autodistill, computer-vision, fine-tuning-computer-vision, paligemma, zero-shot-object-detection
- Language: Python
- Homepage: https://docs.autodistill.com
- Size: 31.3 KB
- Stars: 12
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Autodistill PaliGemma Module
This repository contains the code supporting the PaliGemma base model for use with [Autodistill](https://github.com/autodistill/autodistill).
[PaliGemma](https://blog.roboflow.com/paligemma-multimodal-vision/), developed by Google, is a computer vision model trained using pairs of images and text. You can label data with PaliGemma models for use in training smaller, fine-tuned models with Autodistill.
Read the full [Autodistill documentation](https://autodistill.github.io/autodistill/).
## Installation
To use PaliGemma with Autodistill, install the following dependency:
```bash
pip3 install autodistill-paligemma
```
## Quickstart
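Before running the examples, it can help to confirm that the package is importable in the active environment. The sketch below uses only the Python standard library; the helper name `is_installed` is hypothetical and not part of Autodistill.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# `autodistill_paligemma` is the import name used in the examples in this README.
if is_installed("autodistill_paligemma"):
    print("autodistill_paligemma is available")
else:
    print("autodistill_paligemma not found; run: pip3 install autodistill-paligemma")
```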
### Auto-label with an existing model
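The quickstart below passes a `{caption: class}` dictionary to the base model. To make that convention concrete, here is a minimal plain-Python sketch of how such a mapping can be read. It does not use Autodistill; the helpers `prompts` and `label_for` are hypothetical and only illustrate that the caption is what the model is prompted with, while the class is the label written to the annotations.

```python
# Illustrative only: a plain dict standing in for the ontology mapping.
# `prompts` and `label_for` are hypothetical helpers, not Autodistill APIs.
ontology = {
    "person": "person",
    "a forklift": "forklift",
}


def prompts(mapping):
    """Return the captions that would be sent to the base model."""
    return list(mapping.keys())


def label_for(mapping, caption):
    """Return the class name saved in annotations for a given caption."""
    return mapping[caption]


print(prompts(ontology))                  # ['person', 'a forklift']
print(label_for(ontology, "a forklift"))  # forklift
```

Keeping the caption separate from the class lets you reword a prompt (for example, "a forklift" instead of "forklift") without changing the label names in the resulting dataset.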
```python
from autodistill.detection import CaptionOntology
from autodistill_paligemma import PaliGemma

# define an ontology to map class names to our PaliGemma prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = PaliGemma(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    )
)

# label a single image
# (call predict on the model instance, not on the PaliGemma class)
result = base_model.predict("test.jpeg")
print(result)

# label a folder of images
base_model.label("./context_images", extension=".jpeg")
```
### Model fine-tuning
You can fine-tune PaliGemma models with LoRA for deployment with [Roboflow Inference](https://inference.roboflow.com).
To train a model, use this code:
```python
from autodistill_paligemma import PaLiGemmaTrainer

target_model = PaLiGemmaTrainer()

# train a model
target_model.train("./data/")
```
## License
The model weights for PaliGemma are licensed under a custom Google license. To learn more, refer to the [Google Gemma Terms of Use](https://ai.google.dev/gemma/terms).
## 🏆 Contributing
We love your input! Please see the core Autodistill [contributing guide](https://github.com/autodistill/autodistill/blob/main/CONTRIBUTING.md) to get started. Thank you 🙏 to all our contributors!