Violet is a Python-based library designed for generating Arabic image captions. The pipeline leverages state-of-the-art transformer models, providing an easy-to-use interface for researchers and developers working on tasks such as image captioning and visual question answering (VQA).
- Host: GitHub
- URL: https://github.com/mahmood-anaam/violet2
- Owner: Mahmood-Anaam
- License: MIT
- Created: 2024-12-08T15:59:38.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-01-03T20:10:00.000Z (10 months ago)
- Last Synced: 2025-01-03T20:31:31.501Z (10 months ago)
- Topics: image-captioning, okvqa, python3, pytorch, transformers, vqa, vqav2
- Language: Jupyter Notebook
- Homepage: https://github.com/Mahmood-Anaam/Violet.git
- Size: 12.5 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
 
# Violet: Arabic Image Captioning
**Violet** is a Python-based library designed for generating **Arabic image captions**. The pipeline leverages state-of-the-art transformer models, providing an easy-to-use interface for researchers and developers working on tasks such as image captioning and visual question answering (VQA).
## Features
1. **Arabic Image Captioning**: Generate high-quality captions for images in Arabic.
2. **Visual Feature Extraction**: Extract image features for integration into vision-language models or downstream tasks.
3. **Customizable for VQA**: Use extracted features and captions to build Arabic visual question-answering systems.
4. **Mixed Input Support**: Handle batches of images in various formats, such as URLs, file paths, NumPy arrays, PyTorch tensors, and PIL Image objects.
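To illustrate how such a mixed batch might be dispatched, here is a minimal sketch (not Violet's internal code; `normalize_inputs` is a hypothetical helper) that wraps a single input into a batch and tags each item with its detected kind:

```python
# Hypothetical sketch of mixed-input dispatch; NOT Violet's actual
# implementation, just an illustration of the idea.
def normalize_inputs(images):
    """Wrap a single input in a list and tag each item with its kind."""
    if not isinstance(images, (list, tuple)):
        images = [images]  # a single image becomes a batch of one
    batch = []
    for img in images:
        if isinstance(img, str):
            # strings are either remote URLs or local file paths
            kind = "url" if img.startswith(("http://", "https://")) else "path"
        else:
            # e.g. "ndarray" (NumPy), "Tensor" (PyTorch), "Image" (PIL)
            kind = type(img).__name__
        batch.append((kind, img))
    return batch

print(normalize_inputs("http://images.cocodataset.org/val2017/000000039769.jpg"))
```

A real pipeline would then convert each tagged item to a common representation (for example, a PIL image) before preprocessing.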
## How to Use Violet
### Installation
Clone the repository and install Violet in editable mode:
```bash
git clone https://github.com/Mahmood-Anaam/Violet.git
cd Violet
pip install -e .
```
In a Jupyter or Colab notebook, prefix the shell commands with `!` and use `%cd` to change directories.
### Example Usage in Google Colab
Interactive Jupyter notebooks are provided to demonstrate Violet's capabilities. You can open these notebooks in Google Colab:
- [Image Captioning Demo](https://github.com/Mahmood-Anaam/Violet/blob/main/notebooks/inference_demo.ipynb) ([Open in Colab](https://colab.research.google.com/github/Mahmood-Anaam/Violet/blob/main/notebooks/inference_demo.ipynb))
- [Feature Extraction Demo](https://github.com/Mahmood-Anaam/Violet/blob/main/notebooks/features_extraction_demo.ipynb) ([Open in Colab](https://colab.research.google.com/github/Mahmood-Anaam/Violet/blob/main/notebooks/features_extraction_demo.ipynb))
### Pipeline Overview
The Violet pipeline supports three main functionalities:
1. **Generate Captions for Images**
   The pipeline accepts a variety of input formats:
   ```python
   import numpy as np
   import torch
   from PIL import Image

   from violet.pipeline import VioletImageCaptioningPipeline
   from violet.configuration import VioletConfig

   pipeline = VioletImageCaptioningPipeline(VioletConfig)

   # Single image captioning
   captions = pipeline("http://images.cocodataset.org/val2017/000000039769.jpg")
   print(captions)

   # Batch image captioning with mixed formats
   images = [
       "http://images.cocodataset.org/val2017/000000039769.jpg",
       "/path/to/local/image.jpg",
       np.random.rand(224, 224, 3),           # NumPy array
       torch.randn(3, 224, 224),              # PyTorch tensor
       Image.open("/path/to/pil/image.jpg"),  # PIL Image
   ]

   captions = pipeline(images)
   for caption in captions:
       print(caption)
   ```
2. **Extract Features from Images**
Extract visual features for downstream tasks like VQA. The pipeline supports mixed input formats in a single batch.
   ```python
   # Single image feature extraction
   features = pipeline.generate_features("http://images.cocodataset.org/val2017/000000039769.jpg")
   print(features.shape)
   # Batch feature extraction with mixed formats
   features = pipeline.generate_features(images)
   print(features.shape)
   ```
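   Feature extraction is the expensive step, so when the same images are processed repeatedly it can be worth caching features to disk. A minimal sketch using NumPy (the `cache_features` helper and its file layout are assumptions for illustration, not part of Violet's API):

   ```python
   import hashlib
   import os

   import numpy as np

   def cache_features(key, compute, cache_dir):
       """Return cached features for `key`, computing and saving on a miss."""
       os.makedirs(cache_dir, exist_ok=True)
       # hash the key (e.g. an image URL or path) into a stable filename
       name = hashlib.sha1(key.encode("utf-8")).hexdigest() + ".npy"
       path = os.path.join(cache_dir, name)
       if os.path.exists(path):
           return np.load(path)
       feats = np.asarray(compute(key))
       np.save(path, feats)
       return feats
   ```

   Here `compute` would wrap a call such as `pipeline.generate_features(key)`.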
3. **Generate Captions from Features**
Generate captions based on precomputed visual features.
   ```python
   captions = pipeline.generate_captions_from_features(features)
   for caption in captions:
     print(caption)
   ```
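Decoupling extraction from captioning also makes it easy to reuse the same features for VQA. A sketch of pairing precomputed features with questions into training records (the record schema below is illustrative, not a Violet format):

```python
def build_vqa_records(image_ids, features, questions):
    """Zip aligned lists of ids, feature arrays, and questions into records."""
    if not (len(image_ids) == len(features) == len(questions)):
        raise ValueError("image_ids, features, and questions must be aligned")
    return [
        {"image_id": i, "features": f, "question": q}
        for i, f, q in zip(image_ids, features, questions)
    ]
```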
## Contributions
Contributions are welcome! Please open issues or pull requests on the [GitHub Repository](https://github.com/Mahmood-Anaam/Violet).