{"id":32647570,"url":"https://github.com/jacobmarks/zero-shot-prediction-plugin","last_synced_at":"2025-10-31T05:55:33.467Z","repository":{"id":196284655,"uuid":"691195419","full_name":"jacobmarks/zero-shot-prediction-plugin","owner":"jacobmarks","description":"Run zero-shot prediction models on your data","archived":false,"fork":false,"pushed_at":"2024-04-13T18:22:02.000Z","size":92,"stargazers_count":23,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-04-14T09:09:21.554Z","etag":null,"topics":["classification","clip","computer-vision","detection","fiftyone","huggingface","owl-vit","plugin","python","segment-anything","segmentation","yolo-world","zero-shot-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jacobmarks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-13T17:26:02.000Z","updated_at":"2024-05-30T22:10:27.284Z","dependencies_parsed_at":"2023-09-24T04:05:24.772Z","dependency_job_id":"b7d03ef8-40d3-4801-8d12-5c079ec7e15a","html_url":"https://github.com/jacobmarks/zero-shot-prediction-plugin","commit_stats":null,"previous_names":["jacobmarks/zero-shot-prediction-plugin"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jacobmarks/zero-shot-prediction-plugin","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Fzero-shot-prediction-plugin","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Fzero-shot-pred
iction-plugin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Fzero-shot-prediction-plugin/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Fzero-shot-prediction-plugin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jacobmarks","download_url":"https://codeload.github.com/jacobmarks/zero-shot-prediction-plugin/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Fzero-shot-prediction-plugin/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281937758,"owners_count":26586774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-31T02:00:07.401Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","clip","computer-vision","detection","fiftyone","huggingface","owl-vit","plugin","python","segment-anything","segmentation","yolo-world","zero-shot-learning"],"created_at":"2025-10-31T05:55:27.126Z","updated_at":"2025-10-31T05:55:33.459Z","avatar_url":"https://github.com/jacobmarks.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Zero Shot Prediction 
Plugin\n\n![zero_shot_owlvit_example](https://github.com/jacobmarks/zero-shot-prediction-plugin/assets/12500356/6aca099a-17b3-4f85-955d-26c3951f0646)\n\nThis plugin allows you to perform zero-shot prediction on your dataset for the following tasks:\n\n- Image Classification\n- Object Detection\n- Instance Segmentation\n- Semantic Segmentation\n\nGiven a list of label classes, which you can input either manually, separated by commas, or by uploading a text file, the plugin will perform zero-shot prediction on your dataset for the specified task and add the results to the dataset under a new field, which you can specify.\n\n### Updates\n\n- 🆕 **2024-12-16**: Added MPS and GPU support for ALIGN, AltCLIP, Apple AIMv2 (courtesy of [@harpreetsahota204](https://github.com/harpreetsahota204))\n- 🆕 **2024-12-03**: Added support for Apple AIMv2 Zero Shot Model (courtesy of [@harpreetsahota204](https://github.com/harpreetsahota204))\n- **2024-06-22**: Updated interface for Python operator execution\n- **2024-05-30**: Added\n  - support for Grounding DINO for object detection and instance segmentation\n  - confidence thresholding for object detection and instance segmentation\n- **2024-03-06**: Added support for YOLO-World for object detection and instance segmentation!\n- **2024-01-10**: Removed LAION CLIP models.\n- **2024-01-05**: Added support for EVA-CLIP, SigLIP, and DFN CLIP for image classification!\n- **2023-11-28**: Version 1.1.1 supports OpenCLIP for image classification!\n- **2023-11-13**: Version 1.1.0 supports [calling operators from the Python SDK](#python-sdk)!\n- **2023-10-27**: Added support for MetaCLIP for image classification\n- **2023-10-20**: Added support for AltCLIP and ALIGN for image classification and GroupViT for semantic segmentation\n\n### Requirements\n\n- To use YOLO-World models, you must have `\"ultralytics\u003e=8.1.42\"` installed.\n\n## Models\n\n### Built-in Models\n\nAs a starting point, this plugin comes with at least one zero-shot model per task. 
These are:\n\n#### Image Classification\n\n- [ALIGN](https://huggingface.co/docs/transformers/model_doc/align)\n- [AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)\n- 🆕 [Apple AIMv2](https://huggingface.co/apple/aimv2-large-patch14-224-lit)\n- [CLIP](https://github.com/openai/CLIP) (OpenAI)\n- [CLIPA](https://github.com/UCSC-VLAA/CLIPA)\n- [DFN CLIP](https://huggingface.co/apple/DFN5B-CLIP-ViT-H-14-378): Data Filtering Networks\n- [EVA-CLIP](https://huggingface.co/QuanSun/EVA-CLIP)\n- [MetaCLIP](https://github.com/facebookresearch/metaclip)\n- [SigLIP](https://huggingface.co/timm/ViT-SO400M-14-SigLIP-384)\n\n#### Object Detection\n\n- [YOLO-World](https://docs.ultralytics.com/models/yolo-world/)\n- [Owl-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit)\n- [Grounding DINO](https://huggingface.co/docs/transformers/main/en/model_doc/grounding-dino)\n\n#### Instance Segmentation\n\n- [Owl-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit) + [Segment Anything (SAM)](https://github.com/facebookresearch/segment-anything)\n- [YOLO-World](https://docs.ultralytics.com/models/yolo-world/) + [Segment Anything (SAM)](https://github.com/facebookresearch/segment-anything)\n- [Grounding DINO](https://huggingface.co/docs/transformers/main/en/model_doc/grounding-dino) + [Segment Anything (SAM)](https://github.com/facebookresearch/segment-anything)\n\n#### Semantic Segmentation\n\n- [CLIPSeg](https://huggingface.co/blog/clipseg-zero-shot)\n- [GroupViT](https://huggingface.co/docs/transformers/model_doc/groupvit)\n\nMost of the models used are from the [HuggingFace Transformers](https://huggingface.co/transformers/) library; the CLIP and SAM models are from the [FiftyOne Model Zoo](https://docs.voxel51.com/user_guide/model_zoo/index.html).\n\n_Note_: For SAM, you will need to have Facebook's `segment-anything` library installed.\n\n### Adding Your Own Models\n\nYou can see the implementations for all of these models in the following 
files:\n\n- `classification.py`\n- `detection.py`\n- `instance_segmentation.py`\n- `semantic_segmentation.py`\n\nThese models are \"registered\" via dictionaries in each file. In `semantic_segmentation.py`, for example, the dictionary is:\n\n```py\nSEMANTIC_SEGMENTATION_MODELS = {\n    \"CLIPSeg\": {\n        \"activator\": CLIPSeg_activator,\n        \"model\": CLIPSegZeroShotModel,\n        \"name\": \"CLIPSeg\",\n    },\n    \"GroupViT\": {\n        \"activator\": GroupViT_activator,\n        \"model\": GroupViTZeroShotModel,\n        \"name\": \"GroupViT\",\n    },\n}\n```\n\nThe `activator` checks the environment to see if the model is available, and the `model` is a `fiftyone.core.models.Model` object that is instantiated with the model name and the task, or a function that instantiates such a model. The `name` is the name of the model that will be displayed in the dropdown menu in the plugin.\n\nIf you want to add your own model, you can add it to the dictionary in the corresponding file. For example, if you want to add a new semantic segmentation model, you can add it to the `SEMANTIC_SEGMENTATION_MODELS` dictionary in `semantic_segmentation.py`:\n\n```py\nSEMANTIC_SEGMENTATION_MODELS = {\n    \"CLIPSeg\": {\n        \"activator\": CLIPSeg_activator,\n        \"model\": CLIPSegZeroShotModel,\n        \"name\": \"CLIPSeg\",\n    },\n    \"GroupViT\": {\n        \"activator\": GroupViT_activator,\n        \"model\": GroupViTZeroShotModel,\n        \"name\": \"GroupViT\",\n    },\n    ..., # other models\n    \"My Model\": {\n        \"activator\": my_model_activator,\n        \"model\": my_model,\n        \"name\": \"My Model\",\n    }\n}\n```\n\n💡 You need to implement the `activator` and `model` functions for your model. 
The `activator` should check the environment to see if the model is available, and the `model` should be a `fiftyone.core.models.Model` object that is instantiated with the model name and the task.\n\n## Watch on YouTube\n\n[![Video Thumbnail](https://img.youtube.com/vi/GlwyFHbTklw/0.jpg)](https://www.youtube.com/watch?v=GlwyFHbTklw\u0026list=PLuREAXoPgT0RZrUaT0UpX_HzwKkoB-S9j\u0026index=7)\n\n## Installation\n\n```shell\nfiftyone plugins download https://github.com/jacobmarks/zero-shot-prediction-plugin\n```\n\nIf you want to use AltCLIP, ALIGN, Owl-ViT, CLIPSeg, or GroupViT, you will also need to install the `transformers` library:\n\n```shell\npip install transformers\n```\n\nIf you want to use SAM, you will also need to install the `segment-anything` library:\n\n```shell\npip install git+https://github.com/facebookresearch/segment-anything.git\n```\n\nIf you want to use OpenCLIP, you will also need to install the `open_clip` library from PyPI:\n\n```shell\npip install open-clip-torch\n```\n\nOr from source:\n\n```shell\npip install git+https://github.com/mlfoundations/open_clip.git\n```\n\nIf you want to use YOLO-World, you will also need to install the `ultralytics` library:\n\n```shell\npip install -U ultralytics\n```\n\n## Usage\n\nAll of the operators in this plugin can be run in _delegated_ execution mode. This means that instead of waiting for the operator to finish, you _schedule_\nthe operation to be performed separately. 
This is useful for long-running operations, such as performing inference on a large dataset.\n\nOnce you have pressed the `Schedule` button for the operator, you can check on the job from the command line using FiftyOne's [command line interface](https://docs.voxel51.com/cli/index.html#fiftyone-delegated-operations). The following command shows the status of all delegated operations:\n\n```shell\nfiftyone delegated list\n```\n\nTo launch a service which runs the operation, as well as any other delegated operations that have been scheduled, run:\n\n```shell\nfiftyone delegated launch\n```\n\nOnce the operation has completed, you can view the results in the App (upon refresh).\n\nAfter the operation completes, you can also clean up your list of delegated operations by running:\n\n```shell\nfiftyone delegated cleanup -s COMPLETED\n```\n\n## Operators\n\n### `zero_shot_predict`\n\n- Select the task you want to perform zero-shot prediction on (image classification, object detection, instance segmentation, or semantic segmentation), and the field you want to add the results to.\n\n### `zero_shot_classify`\n\n- Perform zero-shot image classification on your dataset\n\n### `zero_shot_detect`\n\n- Perform zero-shot object detection on your dataset\n\n### `zero_shot_instance_segment`\n\n- Perform zero-shot instance segmentation on your dataset\n\n### `zero_shot_semantic_segment`\n\n- Perform zero-shot semantic segmentation on your dataset\n\n## Python SDK\n\nYou can also use the compute operators from the Python SDK!\n\n```python\nimport fiftyone as fo\nimport fiftyone.operators as foo\nimport fiftyone.zoo as foz\n\n## Load the quickstart dataset from the FiftyOne Dataset Zoo\ndataset = foz.load_zoo_dataset(\"quickstart\")\n\n## Access the operator via its URI (plugin name + operator name)\nzsc = foo.get_operator(\"@jacobmarks/zero_shot_prediction/zero_shot_classify\")\n\n## Run zero-shot classification on all images in the dataset, specifying the labels with the `labels` argument\nzsc(dataset, labels=[\"cat\", \"dog\", \"bird\"])\n\n## Run 
zero-shot classification on all images in the dataset, specifying the labels with a text file\nzsc(dataset, labels_file=\"/path/to/labels.txt\")\n\n## Specify the model to use, and the field to add the results to\nzsc(dataset, labels=[\"cat\", \"dog\", \"bird\"], model_name=\"CLIP\", label_field=\"predictions\")\n\n## Run zero-shot detection on a view\n## Note: top-level `await` only works in an async context such as a Jupyter notebook\nzsd = foo.get_operator(\"@jacobmarks/zero_shot_prediction/zero_shot_detect\")\nview = dataset.take(10)\nawait zsd(\n    view,\n    labels=[\"license plate\"],\n    model_name=\"OwlViT\",\n    label_field=\"owlvit_license_plate\",\n)\n```\n\nAll four of the task-specific zero-shot prediction operators also expose a `list_models()` method, which returns a list of the available models for that task.\n\n```python\nzsss = foo.get_operator(\n    \"@jacobmarks/zero_shot_prediction/zero_shot_semantic_segment\"\n)\n\nzsss.list_models()\n\n## ['CLIPSeg', 'GroupViT']\n```\n\n**Note**: The `zero_shot_predict` operator is not yet supported in the Python SDK.\n\n**Note**: With earlier versions of FiftyOne, you may have trouble running these\noperator executions within a Jupyter notebook. If so, try running them in a\nPython script, or upgrading to the latest version of FiftyOne!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacobmarks%2Fzero-shot-prediction-plugin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjacobmarks%2Fzero-shot-prediction-plugin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacobmarks%2Fzero-shot-prediction-plugin/lists"}