{"id":32647564,"url":"https://github.com/jacobmarks/active-learning-plugin","last_synced_at":"2025-10-31T05:55:30.155Z","repository":{"id":197857856,"uuid":"693406158","full_name":"jacobmarks/active-learning-plugin","owner":"jacobmarks","description":"Label your dataset with active learning in FiftyOne!","archived":false,"fork":false,"pushed_at":"2024-04-04T23:40:36.000Z","size":37,"stargazers_count":6,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-04-16T07:21:07.573Z","etag":null,"topics":["active-learning","annotation","computer-vision","data-cu","fiftyone","labeling","plugin","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jacobmarks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-09-19T01:11:01.000Z","updated_at":"2024-01-09T17:28:10.000Z","dependencies_parsed_at":"2023-10-03T02:58:19.689Z","dependency_job_id":"71ee8b14-850e-4777-ad2e-89e35660abbd","html_url":"https://github.com/jacobmarks/active-learning-plugin","commit_stats":null,"previous_names":["jacobmarks/active-learning-plugin"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jacobmarks/active-learning-plugin","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Factive-learning-plugin","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Factive-learning-plugin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Factive-learning-plugin/release
s","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Factive-learning-plugin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jacobmarks","download_url":"https://codeload.github.com/jacobmarks/active-learning-plugin/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jacobmarks%2Factive-learning-plugin/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281937758,"owners_count":26586774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-31T02:00:07.401Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning","annotation","computer-vision","data-cu","fiftyone","labeling","plugin","python"],"created_at":"2025-10-31T05:55:24.973Z","updated_at":"2025-10-31T05:55:30.147Z","avatar_url":"https://github.com/jacobmarks.png","language":"Python","readme":"## 🏃 Active Learning 🏃\n\n![first_query_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/aadcfa66-1e0f-4a56-b86f-07850bfae94a)\n\nWhen it comes to machine learning, one of the most time-consuming and costly parts of the process is data annotation. 
Especially in the realm of computer vision, labeling images or videos can be an incredibly laborious task, often requiring a team of annotators and hours of meticulous work to generate high-quality labels.\n\nWhat if you could make this process smarter and more efficient? Enter [Active Learning](https://en.wikipedia.org/wiki/Active_learning_machine_learning), a paradigm that iteratively selects the most \"informative\" or \"ambiguous\" examples for labeling, thereby reducing the amount of manual annotation needed. In practical terms, this means your model improves faster and with fewer labeled samples.\n\nThis FiftyOne plugin brings Active Learning to your computer vision data using the\n[modAL](https://modal-python.readthedocs.io/en/latest/) library, allowing you to integrate this accelerant directly into your annotation workflow. Now you can prioritize, query, and annotate the most crucial data points, all within the FiftyOne App, with no coding necessary.\n\nThe best part? You can use this in tandem with your traditional annotation service providers (via FiftyOne’s integrations with CVAT, Labelbox and Label Studio), or even with the FiftyOne [Zero-shot Prediction plugin](https://github.com/jacobmarks/zero-shot-prediction-plugin)!\n\n## Watch on YouTube\n[![Video Thumbnail](https://img.youtube.com/vi/j4z5zlfO3Pc/0.jpg)](https://www.youtube.com/watch?v=j4z5zlfO3Pc\u0026list=PLuREAXoPgT0RZrUaT0UpX_HzwKkoB-S9j\u0026index=8)\n\n\n## Installation\n\n```shell\nfiftyone plugins download https://github.com/jacobmarks/active-learning-plugin\n```\n\nThen install the requirements:\n\n```shell\nfiftyone plugins requirements @jacobmarks/active_learning --install\n```\n\n## Operators\n\n### `create_active_learner`\n\nCreates an active learning model and environment. 
The learner is initialized from a set of initial labels and input features.\n\nWe can choose:\n\n- The field or fields to use as a feature vector\n- The label field in which to store predictions\n- The default batch size (the number of samples per query)\n- The `Active Learner`\n\nFor the last of these, we can select from a variety of ensemble strategies, including Random Forest, Gradient Boosting, Bagging, and AdaBoost. When we make this top-level selection, the remainder of the form dynamically updates with appropriate hyperparameter configuration choices.\n\nExecuting this operator creates a modAL `ActiveLearner` that uses “uncertainty” batch sampling as its query strategy. Execution also generates initial predictions and reloads the dataset.\n\n### `query_learner`\n\nQueries the active learner for the next samples to label. If you'd like, you can override the default query batch size.\n\nTag the samples whose predicted labels are incorrect. Untagged samples will be treated as correct predictions.\n\n### `update_learner_predictions`\n\nAfter correcting the incorrect query labels, we can update our active learner by “teaching” it this new information. Running this operator updates our active learning model, updates the label field with new predictions, and reloads the App.\n\n## Usage\n\n### 0. Generate Initial Labels\n\nBefore we can create an active learner, we need to generate some initial labels. We can do this using the [Zero-shot Prediction plugin](https://github.com/jacobmarks/zero-shot-prediction-plugin):\n\n![zero_shot_labels_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/08d62bc6-7a76-4be7-bdcf-331c9243f123)\n\n\nAlternatively, we can use tags on some of our samples as labels, so long as they are mutually exclusive.\n\n### 1. 
Create Input Features\n\nNext, we need to populate fields on our samples with numerical attributes (floats or arrays) that we can use as input features for our active learner.\n\nA common choice is model embeddings, which can be computed either in the FiftyOne App, or in Python:\n\n```python\nimport fiftyone as fo\nimport fiftyone.zoo as foz\n\nmobilenet = foz.load_zoo_model(\"mobilenet-v2-imagenet-torch\")\ndataset.compute_embeddings(mobilenet, embeddings_field=\"mobilenet_embeddings\")\n```\n\nYou can also add float-valued fields. For example, using the [Image Quality Issues Plugin](https://github.com/jacobmarks/image-quality-issues) you can compute the brightness, contrast, and saturation of your images!\n\n### 2. Create an Active Learner\n\nNow we're ready to create an active learner. We can do this using the `create_active_learner` operator:\n\n![create_active_learner_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/cebdde3a-e090-45f0-a1a9-8a6bc8276b3e)\n\n### 3. Query the Active Learner\n\nOnce we've created an active learner, we can query it for the next batch of samples to label. We can do this using the `query_learner` operator:\n\n![first_query_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/bf34227c-a52a-4414-837e-494f6ebd9d5f)\n\nWe then tag the samples whose predicted labels are incorrect. Untagged samples will be treated as correct predictions:\n\n![correct_first_query_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/8ba63f13-4bbd-4fbe-ad3c-4ac7bfaefb7d)\n\n### 4. Update the Active Learner\n\nAfter correcting the incorrect query labels, we can update our active learner by “teaching” it this new information. We can do this using the `update_learner_predictions` operator:\n\n![teach_learner_compressed](https://github.com/jacobmarks/active-learning-plugin/assets/12500356/0c88b566-3734-4f03-b0f7-7d92da5b2444)\n\n### 5. 
Repeat!\n\nNow we can repeat steps 3 and 4 until we're satisfied with our model's performance.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacobmarks%2Factive-learning-plugin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjacobmarks%2Factive-learning-plugin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjacobmarks%2Factive-learning-plugin/lists"}