https://github.com/eljandoubi/paligemma
Coding PaliGemma from scratch using pytorch for inference.
- Host: GitHub
- URL: https://github.com/eljandoubi/paligemma
- Owner: eljandoubi
- License: apache-2.0
- Created: 2024-08-17T07:18:20.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2025-03-08T03:31:18.000Z (about 2 months ago)
- Last Synced: 2025-03-08T04:25:51.224Z (about 2 months ago)
- Topics: pytorch-implementation, transformers, vision-language-model
- Language: Python
- Homepage:
- Size: 1.11 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Coding PaliGemma from scratch using pytorch for inference.
## Setup environment
* Clone the repository and go to the PaliGemma directory.
```bash
git clone https://github.com/eljandoubi/PaliGemma.git && cd PaliGemma
```
* Build the environment.
```bash
make build
```
## Run inference
* Default test case.
```bash
make run
```
* Customized tests.
Override the `PROMPT` and `IMAGE_FILE_PATH` variables to run inference on your own prompt and image.
```bash
make run PROMPT="this building is " IMAGE_FILE_PATH="sample/EiffelTower.jpg"
```
## Clean environment
```bash
make clean
```
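
The customized test from the "Run inference" section can also be scripted when you want to try several prompt/image pairs in a row. The following is only a minimal sketch: `build_make_command` and `run_batch` are hypothetical helpers that are not part of this repository, and the sketch assumes the Makefile forwards `PROMPT` and `IMAGE_FILE_PATH` to the inference script exactly as in the `make run` example above.

```python
import subprocess


def build_make_command(prompt: str, image_path: str) -> list[str]:
    """Return the argv for `make run` with both variables overridden.

    Each override is passed as a single argv element, so a prompt that
    contains spaces needs no extra shell quoting at this level.
    """
    return ["make", "run", f"PROMPT={prompt}", f"IMAGE_FILE_PATH={image_path}"]


def run_batch(cases, dry_run=True):
    """Build one `make run` command per (prompt, image_path) pair.

    With dry_run=True the commands are only returned, not executed.
    With dry_run=False each command is run from the current directory,
    which is assumed to be the repository root containing the Makefile.
    """
    commands = [build_make_command(prompt, image) for prompt, image in cases]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # The only image path taken from the README; add your own pairs here.
    cases = [("this building is ", "sample/EiffelTower.jpg")]
    for cmd in run_batch(cases, dry_run=True):
        print(cmd)
```

Run the script from the repository root after `make build`, and switch `dry_run` to `False` once the printed commands look right.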