https://github.com/dnth/vlhf
Visual Layer <-> Hugging Face integration for data in/out.
https://github.com/dnth/vlhf
Last synced: over 1 year ago
JSON representation
Visual Layer <-> Hugging Face integration for data in/out.
- Host: GitHub
- URL: https://github.com/dnth/vlhf
- Owner: dnth
- Created: 2024-07-30T04:38:34.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-11-26T14:50:13.000Z (over 1 year ago)
- Last Synced: 2025-03-25T18:12:25.266Z (over 1 year ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 5.27 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# VLHF: Visual Layer - Hugging Face Integration

VLHF (Visual Layer - Hugging Face) is a Python package that provides a seamless interface for transferring datasets between Visual Layer and Hugging Face.
## Features
- Download/Upload datasets from Hugging Face to Visual Layer.
- Download/Upload datasets from Visual Layer to Hugging Face.
- Search for datasets on Hugging Face.
## Installation
### Prerequisites
Python 3.10 or higher is required.
Before installing VLHF, you need to install the vl-research package:
```bash
git clone https://github.com/visual-layer/vl-research
cd vl-research
pip install -e .
```
### Install vlhf
To install the vlhf package, run:
```
pip install -e .
```
## Usage
Authentication
```python
from vlhf.hugging_face import HuggingFace
from vlhf.visual_layer import VisualLayer
hf = HuggingFace(HF_TOKEN)
vl = VisualLayer(VL_USER_ID, VL_ENV, VL_PG_URI)
```
List dataset on Hugging Face with the search term "visual"
```python
hf.list_datasets(search="visual")
```
id
author
sha
created_at
private
downloads
likes
tags
0
visual-layer/oxford-iiit-pet-vl-enriched
b4a70383...
2024-07-04 06:15:06
False
290
4
task_categories:image-classification, task_cat...
1
visual-layer/imagenet-1k-vl-enriched
45107c4f...
2024-07-09 08:56:33
False
393
6
task_categories:object-detection, task_categor...
2
juletxara/visual-spatial-reasoning
a07bec7a...
2022-08-11 12:56:58
False
6
4
task_categories:image-classification, annotati...
3
albertvillanova/visual-spatial-reasoning
cbe3e224...
2022-12-14 11:31:30
False
0
4
task_categories:image-classification, annotati...
4
FastJobs/Visual_Emotional_Analysis
31541d6d...
2023-03-03 06:23:19
False
272
10
task_categories:image-classification, language...
5
alitourani/moviefeats_visual
ba9c47d7...
2024-05-10 17:16:19
False
0
1
task_categories:feature-extraction, task_categ...
### From HF to VL
Download a dataset from Hugging Face
```python
# for image classification
hf.download_dataset(dataset_id="lewtun/dog_food", image_key="image", label_key="label")
# for object detection
hf.download_dataset("rishitdagli/cppe-5",
image_key="image",
bbox_key="objects",
bbox_label_names=["coverall", "face_shield", "gloves", "goggles", "mask"])
```
Parameters:
+ `dataset_id`: The dataset ID on Hugging Face datasets.
+ `image_key`: The column name in the dataset that contains PIL images.
+ `label_key`: The column name containing image classification labels.
+ `bbox_key` (Optional): The column name containing object detection bounding boxes.
+ `bbox_label_names` (Optional): A list of object detection label names.
+ `num_images` (Optional): The top N number of images to download.
> [!WARNING]
> Not all datasets use `"image"`, `"label"`, or `"objects"` as their column names. Adjust these parameters based on the specific dataset structure.
> Currently only the COCO object detection annotation is supported. For example here's a sample row in the dataset:
> ```python
> { "id": [ 114, 115, 116, 117 ],
> "area": [ 3796, 1596, 152768, 81002 ],
> "bbox": [
> [ 302, 109, 73, 52 ],
> [ 810, 100, 57, 28 ],
> [ 160, 31, 248, 616 ],
> [ 741, 68, 202, 401 ]
> ],
> "category": [ 4, 4, 0, 0 ]
> }
> ```
> The annotations are in the format of COCO dataset annotations. The `bbox` key contains the bounding box coordinates in the format `[x, y, width, height]` and the `category` key contains the category ID of the object.
>
> See more - https://huggingface.co/datasets/rishitdagli/cppe-5
Upload to Visual Layer
```python
hf.to_vl(vl_session=vl)
```
Parameters:
+ `vl_session`: The authenticated Visual Layer session object.
### From VL to HF
Get dataset from Visual Layer
```python
dataset_id = "124aa35a-4fd3-11ef-ab8c-7e1db6b41710"
vl.get_dataset(dataset_id) # returns a polars DataFrame
```
image_uri
image_label
image_issues
object_labels
https://d2iycffepdu1yp.cloudfront.net/273b1d8a...
None
None
[{'label': 'enemy', 'bbox': [147, 201, 33, 111...
https://d2iycffepdu1yp.cloudfront.net/273b1d8a...
None
None
None
https://d2iycffepdu1yp.cloudfront.net/273b1d8a...
None
None
[{'label': 'teammate', 'bbox': [144, 149, 11, ...
https://d2iycffepdu1yp.cloudfront.net/273b1d8a...
None
None
[{'label': 'planted spike', 'bbox': [174, 149,...
Upload to Hugging Face
```python
hf_repo_id = "dnth/dog_food-vl-enriched"
vl.to_hf(hf_session=hf, hf_repo_id)
```
Parameters:
+ `hf_session`: The authenticated Hugging Face session object.
> [!NOTE]
> See the uploaded dataset on Hugging Face [here](https://huggingface.co/datasets/dnth/dog_food-vl-enriched).
## Development
Install the development dependencies:
```bash
pip install -r requirements-dev.txt
```
Run pre-commit to lint and format the code:
```bash
pre-commit run --all-files
```
Run mypy to check for type errors:
```bash
mypy src/
```
Run pytest to run the tests:
```bash
pytest tests/
```