{"id":17311308,"url":"https://github.com/dnth/vlhf","last_synced_at":"2025-03-27T01:13:37.293Z","repository":{"id":250823254,"uuid":"835554104","full_name":"dnth/vlhf","owner":"dnth","description":"Visual Layer \u003c-\u003e Hugging Face integration for data in/out.","archived":false,"fork":false,"pushed_at":"2024-11-26T14:50:13.000Z","size":5528,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-25T18:12:25.266Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dnth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-30T04:38:34.000Z","updated_at":"2024-11-26T14:51:16.000Z","dependencies_parsed_at":"2024-08-12T15:02:10.854Z","dependency_job_id":null,"html_url":"https://github.com/dnth/vlhf","commit_stats":null,"previous_names":["dnth/vl-hf-workflow","dnth/vlhf"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnth%2Fvlhf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnth%2Fvlhf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnth%2Fvlhf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dnth%2Fvlhf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dnth","download_url":"https://codeload.github.com/dnth/vlhf/tar.gz/refs/heads/main","host":{"name":"GitHub","url":
"https://github.com","kind":"github","repositories_count":245761297,"owners_count":20667895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-15T12:40:07.332Z","updated_at":"2025-03-27T01:13:37.276Z","avatar_url":"https://github.com/dnth.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VLHF: Visual Layer - Hugging Face Integration\n\n![image](assets/vlhf.jpg)\n\nVLHF (Visual Layer - Hugging Face) is a Python package that provides a seamless interface for transferring datasets between Visual Layer and Hugging Face.\n\n## Features\n\n- Download/Upload datasets from Hugging Face to Visual Layer.\n- Download/Upload datasets from Visual Layer to Hugging Face.\n- Search for datasets on Hugging Face.\n\n\n## Installation\n\n### Prerequisites\nPython 3.10 or higher is required.\n\nBefore installing VLHF, you need to install the vl-research package:\n\n```bash\ngit clone https://github.com/visual-layer/vl-research\ncd vl-research\npip install -e .\n```\n\n### Install vlhf\nTo install the vlhf package, run:\n\n```\npip install -e .\n```\n\n## Usage\n\nAuthentication\n\n```python\nfrom vlhf.hugging_face import HuggingFace\nfrom vlhf.visual_layer import VisualLayer\n\nhf = HuggingFace(HF_TOKEN)\nvl = VisualLayer(VL_USER_ID, VL_ENV, VL_PG_URI)\n```\nList dataset on Hugging Face with the search term \"visual\"\n\n```python\nhf.list_datasets(search=\"visual\")\n```\n\n\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth\u003eid\u003c/th\u003e\n        \u003ctd\u003eauthor\u003c/td\u003e\n        
\u003ctd\u003esha\u003c/td\u003e\n        \u003ctd\u003ecreated_at\u003c/td\u003e\n        \u003ctd\u003eprivate\u003c/td\u003e\n        \u003ctd\u003edownloads\u003c/td\u003e\n        \u003ctd\u003elikes\u003c/td\u003e\n        \u003ctd\u003etags\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e0\u003c/th\u003e\n        \u003ctd\u003evisual-layer/oxford-iiit-pet-vl-enriched\u003c/td\u003e\n        \u003ctd\u003eb4a70383...\u003c/td\u003e\n        \u003ctd\u003e2024-07-04 06:15:06\u003c/td\u003e\n        \u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e290\u003c/td\u003e\n        \u003ctd\u003e4\u003c/td\u003e\n        \u003ctd\u003etask_categories:image-classification, task_cat...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e1\u003c/th\u003e\n        \u003ctd\u003evisual-layer/imagenet-1k-vl-enriched\u003c/td\u003e\n        \u003ctd\u003e45107c4f...\u003c/td\u003e\n        \u003ctd\u003e2024-07-09 08:56:33\u003c/td\u003e\n        \u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e393\u003c/td\u003e\n        \u003ctd\u003e6\u003c/td\u003e\n        \u003ctd\u003etask_categories:object-detection, task_categor...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e2\u003c/th\u003e\n        \u003ctd\u003ejuletxara/visual-spatial-reasoning\u003c/td\u003e\n        \u003ctd\u003ea07bec7a...\u003c/td\u003e\n        \u003ctd\u003e2022-08-11 12:56:58\u003c/td\u003e\n        \u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e6\u003c/td\u003e\n        \u003ctd\u003e4\u003c/td\u003e\n        \u003ctd\u003etask_categories:image-classification, annotati...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e3\u003c/th\u003e\n        \u003ctd\u003ealbertvillanova/visual-spatial-reasoning\u003c/td\u003e\n        \u003ctd\u003ecbe3e224...\u003c/td\u003e\n        \u003ctd\u003e2022-12-14 11:31:30\u003c/td\u003e\n        
\u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e0\u003c/td\u003e\n        \u003ctd\u003e4\u003c/td\u003e\n        \u003ctd\u003etask_categories:image-classification, annotati...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e4\u003c/th\u003e\n        \u003ctd\u003eFastJobs/Visual_Emotional_Analysis\u003c/td\u003e\n        \u003ctd\u003e31541d6d...\u003c/td\u003e\n        \u003ctd\u003e2023-03-03 06:23:19\u003c/td\u003e\n        \u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e272\u003c/td\u003e\n        \u003ctd\u003e10\u003c/td\u003e\n        \u003ctd\u003etask_categories:image-classification, language...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003cth\u003e5\u003c/th\u003e\n        \u003ctd\u003ealitourani/moviefeats_visual\u003c/td\u003e\n        \u003ctd\u003eba9c47d7...\u003c/td\u003e\n        \u003ctd\u003e2024-05-10 17:16:19\u003c/td\u003e\n        \u003ctd\u003eFalse\u003c/td\u003e\n        \u003ctd\u003e0\u003c/td\u003e\n        \u003ctd\u003e1\u003c/td\u003e\n        \u003ctd\u003etask_categories:feature-extraction, task_categ...\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n### From HF to VL\n\nDownload a dataset from Hugging Face\n\n```python\n# for image classification\nhf.download_dataset(dataset_id=\"lewtun/dog_food\", image_key=\"image\", label_key=\"label\")\n\n# for object detection\nhf.download_dataset(\"rishitdagli/cppe-5\", \n                    image_key=\"image\", \n                    bbox_key=\"objects\", \n                    bbox_label_names=[\"coverall\", \"face_shield\", \"gloves\", \"goggles\", \"mask\"])\n```\nParameters:\n+ `dataset_id`: The dataset ID on Hugging Face datasets.\n+ `image_key`: The column name in the dataset that contains PIL images.\n+ `label_key`: The column name containing image classification labels.\n+ `bbox_key` (Optional): The column name containing object detection bounding boxes.\n+ `bbox_label_names` (Optional): A list 
of object detection label names.\n+ `num_images` (Optional): The number of images to download (the top N).\n\n\n\u003e [!WARNING]  \n\u003e Not all datasets use `\"image\"`, `\"label\"`, or `\"objects\"` as their column names. Adjust these parameters based on the specific dataset structure.\n\u003e Currently, only the COCO object detection annotation format is supported. For example, here's a sample row from the dataset:\n\u003e ```python\n\u003e { \"id\": [ 114, 115, 116, 117 ], \n\u003e   \"area\": [ 3796, 1596, 152768, 81002 ], \n\u003e   \"bbox\": [ \n\u003e             [ 302, 109, 73, 52 ], \n\u003e             [ 810, 100, 57, 28 ], \n\u003e             [ 160, 31, 248, 616 ], \n\u003e             [ 741, 68, 202, 401 ] \n\u003e           ], \n\u003e   \"category\": [ 4, 4, 0, 0 ] \n\u003e }\n\u003e ```\n\u003e The annotations follow the COCO format: the `bbox` key contains the bounding box coordinates as `[x, y, width, height]`, and the `category` key contains the category ID of each object.\n\u003e \n\u003e See more: https://huggingface.co/datasets/rishitdagli/cppe-5\n\nUpload to Visual Layer\n\n```python\nhf.to_vl(vl_session=vl)\n```\n\nParameters:\n+ `vl_session`: The authenticated Visual Layer session object.\n\n\n### From VL to HF\nGet a dataset from Visual Layer\n\n```python\ndataset_id = \"124aa35a-4fd3-11ef-ab8c-7e1db6b41710\"\nvl.get_dataset(dataset_id) # returns a polars DataFrame\n```\n\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth\u003eimage_uri\u003c/th\u003e\n      \u003cth\u003eimage_label\u003c/th\u003e\n      \u003cth\u003eimage_issues\u003c/th\u003e\n      \u003cth\u003eobject_labels\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003ehttps://d2iycffepdu1yp.cloudfront.net/273b1d8a...\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003e[{'label': 'enemy', 'bbox': [147, 
201, 33, 111...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003ehttps://d2iycffepdu1yp.cloudfront.net/273b1d8a...\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003ehttps://d2iycffepdu1yp.cloudfront.net/273b1d8a...\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003e[{'label': 'teammate', 'bbox': [144, 149, 11, ...\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003ehttps://d2iycffepdu1yp.cloudfront.net/273b1d8a...\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003eNone\u003c/td\u003e\n      \u003ctd\u003e[{'label': 'planted spike', 'bbox': [174, 149,...\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\nUpload to Hugging Face\n\n```python\nhf_repo_id = \"dnth/dog_food-vl-enriched\"\nvl.to_hf(hf, hf_repo_id)  # session first, then the target repo ID\n```\n\nParameters:\n+ `hf_session`: The authenticated Hugging Face session object.\n\n\u003e [!NOTE]\n\u003e See the uploaded dataset on Hugging Face [here](https://huggingface.co/datasets/dnth/dog_food-vl-enriched).\n\n## Development\n\nInstall the development dependencies:\n\n```bash\npip install -r requirements-dev.txt\n```\n\nRun pre-commit to lint and format the code:\n\n```bash\npre-commit run --all-files\n```\n\nRun mypy to check for type errors:\n\n```bash\nmypy src/\n```\n\nRun the tests with pytest:\n\n```bash\npytest tests/\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdnth%2Fvlhf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdnth%2Fvlhf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdnth%2Fvlhf/lists"}