{"id":19630926,"url":"https://github.com/project-agml/agml","last_synced_at":"2025-05-15T17:01:33.350Z","repository":{"id":39583340,"uuid":"422590908","full_name":"Project-AgML/AgML","owner":"Project-AgML","description":"AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well the ability to generate synthetic data and annotations.","archived":false,"fork":false,"pushed_at":"2025-04-22T00:44:51.000Z","size":222079,"stargazers_count":212,"open_issues_count":11,"forks_count":32,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-05-02T17:59:49.831Z","etag":null,"topics":["agriculture","computer-vision","dataset","deep-learning","image-classification","object-detection","pytorch","semantic-segmentation","synthetic-data"],"latest_commit_sha":null,"homepage":"https://project-agml.github.io/AgML/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Project-AgML.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-10-29T13:44:40.000Z","updated_at":"2025-04-26T00:23:07.000Z","dependencies_parsed_at":"2023-02-15T17:46:12.669Z","dependency_job_id":"8dbb9a6b-a1da-43b7-b7c1-072638f3ba38","html_url":"https://github.com/Project-AgML/AgML","commit_stats":{"total_commits":847,"total_committers":28,"mean_commits":30.25,"dds":"0.20897284533648175","last_synced_commit":"0cb18b733aebfd9440dec94fdd927b70
27f2b555"},"previous_names":[],"tags_count":27,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-AgML%2FAgML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-AgML%2FAgML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-AgML%2FAgML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Project-AgML%2FAgML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Project-AgML","download_url":"https://codeload.github.com/Project-AgML/AgML/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252231166,"owners_count":21715474,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agriculture","computer-vision","dataset","deep-learning","image-classification","object-detection","pytorch","semantic-segmentation","synthetic-data"],"created_at":"2024-11-11T12:07:09.713Z","updated_at":"2025-05-15T17:01:33.342Z","avatar_url":"https://github.com/Project-AgML.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/assets/agml-logo.png\" alt=\"agml logo\" width=\"400\" height=\"400\"\u003e\n\u003c/p\u003e\n\n----\n\n### 👨🏿‍💻👩🏽‍💻🌈🪴 Want to join the [AI Institute for Food Systems team](https://aifs.ucdavis.edu/) and help lead AgML development? 🪴🌈👩🏼‍💻👨🏻‍💻\n\nWe're looking to hire a postdoc with both Python library development and ML experience. 
Send your resume and GitHub profile link to [jmearles@ucdavis.edu](mailto:jmearles@ucdavis.edu)!\n\n----\n\n## Overview\nAgML is a comprehensive library for agricultural machine learning. Currently, AgML provides\naccess to a wealth of public agricultural datasets for common agricultural deep learning tasks. In the future, AgML will provide ag-specific ML functionality related to data, training, and evaluation. Here's a conceptual diagram of the overall framework.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/assets/agml-framework.png\" alt=\"agml framework\" width=\"350\" height=\"291\"\u003e\n\u003c/p\u003e\n\nAgML supports both the [TensorFlow](https://www.tensorflow.org/) and [PyTorch](https://pytorch.org/) machine learning frameworks.\n\n## Installation\n\nTo install the latest release of AgML, run the following command:\n\n```shell\npip install agml\n```\n\n**_NOTE:_** Some features of AgML, such as synthetic data generation, require GUI applications. When running AgML through\nWindows Subsystem for Linux (WSL), it may be necessary to configure your WSL environment to utilize these features. Please\nfollow the [Microsoft documentation](https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps) to install all\nnecessary prerequisites and update WSL. The latest version of WSL includes built-in support for running Linux GUI applications.\n\n## Quick Start\n\nAgML is designed for easy usage of agricultural data in a variety of formats. You can start off by using the `AgMLDataLoader` to\ndownload and load a dataset into a container:\n\n```python\nimport agml\n\nloader = agml.data.AgMLDataLoader('apple_flower_segmentation')\n```\n\nYou can then use the in-built processing methods to get the loader ready for your training and evaluation pipelines. 
This includes, but\nis not limited to, batching data, shuffling data, splitting data into training, validation, and test sets, and applying transforms.\n\n```python\nimport albumentations as A\n\n# Batch the dataset into collections of 8 pieces of data:\nloader.batch(8)\n\n# Shuffle the data:\nloader.shuffle()\n\n# Apply transforms to the input images and output annotation masks:\nloader.mask_to_channel_basis()\nloader.transform(\n    transform = A.RandomContrast(),\n    dual_transform = A.Compose([A.RandomRotate90()])\n)\n\n# Split the data into train/val/test sets.\nloader.split(train = 0.8, val = 0.1, test = 0.1)\n```\n\nThe split datasets can be accessed using `loader.train_data`, `loader.val_data`, and `loader.test_data`. Any further processing applied to the\nmain loader will be applied to the split datasets until the split attributes are accessed, at which point you need to apply processing independently\nto each of the loaders. You can also toggle processing on and off using the `loader.eval()`, `loader.reset_preprocessing()`, and `loader.disable_preprocessing()`\nmethods.\n\nYou can visualize data using the `agml.viz` module, which supports several types of visualization for different data types:\n\n```python\n# Disable processing and batching for the test data:\ntest_ds = loader.test_data\ntest_ds.batch(None)\ntest_ds.reset_preprocessing()\n\n# Visualize the image and mask side-by-side:\nagml.viz.visualize_image_and_mask(test_ds[0])\n\n# Visualize the mask overlaid onto the image:\nagml.viz.visualize_overlaid_masks(test_ds[0])\n```\n\nAgML supports both the TensorFlow and PyTorch libraries as backends, and provides functionality to export your loaders to native TensorFlow and PyTorch\nformats when you want to use them in a training pipeline. 
This includes not only exporting the `AgMLDataLoader` to a `tf.data.Dataset` or `torch.utils.data.DataLoader`,\nbut also internally converting data within the `AgMLDataLoader` itself, which keeps its core functionality available.\n\n\n```python\n# Export the loader as a `tf.data.Dataset`:\ntrain_ds = loader.train_data.export_tensorflow()\n\n# Convert to PyTorch tensors without exporting.\ntrain_ds = loader.train_data\ntrain_ds.as_torch_dataset()\n```\n\nYou're now ready to use AgML for training your own models! Luckily, AgML comes with a training module that enables quick-start training of standard deep learning models on agricultural datasets. Training a grape detection model is as simple as the following code:\n\n```python\nimport agml\nimport agml.models\n\nimport albumentations as A\n\nloader = agml.data.AgMLDataLoader('grape_detection_californiaday')\nloader.split(train = 0.8, val = 0.1, test = 0.1)\nprocessor = agml.models.preprocessing.EfficientDetPreprocessor(\n    image_size = 512, augmentation = [A.HorizontalFlip(p=0.5)]\n)\nloader.transform(processor)\n\nmodel = agml.models.DetectionModel(num_classes=loader.num_classes)\n\nmodel.run_training(loader)\n```\n\n## Public Dataset Listing\n\nAgML contains a wide variety of public datasets from various locations across the world:\n\n![AgML Dataset World Map](/docs/assets/agml_dataset_world_map.png)\n\n\nThe following is a comprehensive list of all datasets available in AgML. 
For more information,\nyou can use `agml.data.public_data_sources(...)` with various filters to filter datasets according\nto your desired specification.\n\n\n| Dataset | Task | Number of Images |\n| :--- | ---: | ---: |\n[bean_disease_uganda](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/bean_disease_uganda.md) | Image Classification | 1295 | \n[carrot_weeds_germany](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/carrot_weeds_germany.md) | Semantic Segmentation | 60 | \n[plant_seedlings_aarhus](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/plant_seedlings_aarhus.md) | Image Classification | 5539 | \n[soybean_weed_uav_brazil](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/soybean_weed_uav_brazil.md) | Image Classification | 15336 | \n[sugarcane_damage_usa](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/sugarcane_damage_usa.md) | Image Classification | 153 | \n[crop_weeds_greece](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/crop_weeds_greece.md) | Image Classification | 508 | \n[sugarbeet_weed_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/sugarbeet_weed_segmentation.md) | Semantic Segmentation | 1931 | \n[rangeland_weeds_australia](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/rangeland_weeds_australia.md) | Image Classification | 17509 | \n[fruit_detection_worldwide](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/fruit_detection_worldwide.md) | Object Detection | 565 | \n[leaf_counting_denmark](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/leaf_counting_denmark.md) | Image Classification | 9372 | \n[apple_detection_usa](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/apple_detection_usa.md) | Object Detection | 2290 | \n[mango_detection_australia](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/mango_detection_australia.md) | Object Detection | 1730 | 
\n[apple_flower_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/apple_flower_segmentation.md) | Semantic Segmentation | 148 | \n[apple_segmentation_minnesota](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/apple_segmentation_minnesota.md) | Semantic Segmentation | 670 | \n[rice_seedling_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/rice_seedling_segmentation.md) | Semantic Segmentation | 224 | \n[plant_village_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/plant_village_classification.md) | Image Classification | 55448 | \n[autonomous_greenhouse_regression](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/autonomous_greenhouse_regression.md) | Image Regression | 389 | \n[grape_detection_syntheticday](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/grape_detection_syntheticday.md) | Object Detection | 448 | \n[grape_detection_californiaday](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/grape_detection_californiaday.md) | Object Detection | 126 | \n[grape_detection_californianight](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/grape_detection_californianight.md) | Object Detection | 150 | \n[guava_disease_pakistan](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/guava_disease_pakistan.md) | Image Classification | 306 | \n[apple_detection_spain](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/apple_detection_spain.md) | Object Detection | 967 | \n[apple_detection_drone_brazil](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/apple_detection_drone_brazil.md) | Object Detection | 689 | \n[plant_doc_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/plant_doc_classification.md) | Image Classification | 2598 | \n[plant_doc_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/plant_doc_detection.md) | Object Detection | 2598 | 
\n[wheat_head_counting](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/wheat_head_counting.md) | Object Detection | 6512 | \n[peachpear_flower_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/peachpear_flower_segmentation.md) | Semantic Segmentation | 42 | \n[red_grapes_and_leaves_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/red_grapes_and_leaves_segmentation.md) | Semantic Segmentation | 258 | \n[white_grapes_and_leaves_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/white_grapes_and_leaves_segmentation.md) | Semantic Segmentation | 273 | \n[ghai_romaine_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/ghai_romaine_detection.md) | Object Detection | 500 |\n[ghai_green_cabbage_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/ghai_green_cabbage_detection.md) | Object Detection | 500 |\n[ghai_iceberg_lettuce_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/ghai_iceberg_lettuce_detection.md) | Object Detection | 500 |\n[riseholme_strawberry_classification_2021](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/riseholme_strawberry_classification_2021.md) | Image Classification | 3520 |\n[ghai_broccoli_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/ghai_broccoli_detection.md) | Object Detection | 500 |\n[bean_synthetic_earlygrowth_aerial](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/bean_synthetic_earlygrowth_aerial.md) | Semantic Segmentation | 2500 |\n[ghai_strawberry_fruit_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/ghai_strawberry_fruit_detection.md) | Object Detection | 500 |\n[vegann_multicrop_presence_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/vegann_multicrop_presence_segmentation.md) | Semantic Segmentation | 3775 
|\n[corn_maize_leaf_disease](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/corn_maize_leaf_disease.md) | Image Classification | 4188 |\n[tomato_leaf_disease](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/tomato_leaf_disease.md) | Image Classification | 11000 |\n[vine_virus_photo_dataset](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/vine_virus_photo_dataset.md) | Image Classification | 3866 |\n[tomato_ripeness_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/tomato_ripeness_detection.md) | Object Detection | 804 |\n[embrapa_wgisd_grape_detection](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/embrapa_wgisd_grape_detection.md) | Object Detection | 239 |\n[growliflower_cauliflower_segmentation](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/growliflower_cauliflower_segmentation.md) | Semantic Segmentation | 1542 |\n[strawberry_detection_2023](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/strawberry_detection_2023.md) | Object Detection | 204 |\n[strawberry_detection_2022](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/strawberry_detection_2022.md) | Object Detection | 175 |\n[almond_harvest_2021](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/almond_harvest_2021.md) | Object Detection | 50 |\n[almond_bloom_2023](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/almond_bloom_2023.md) | Object Detection | 100 |\n[gemini_flower_detection_2022](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/gemini_flower_detection_2022.md) | Object Detection | 134 |\n[gemini_leaf_detection_2022](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/gemini_leaf_detection_2022.md) | Object Detection | 25 |\n[gemini_pod_detection_2022](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/gemini_pod_detection_2022.md) | Object Detection | 98 
|\n[gemini_plant_detection_2022](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/gemini_plant_detection_2022.md) | Object Detection | 402 |\n[paddy_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/paddy_disease_classification.md) | Image Classification | 10407 |\n[onion_leaf_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/onion_leaf_classification.md) | Image Classification | 4502 |\n[chilli_leaf_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/chilli_leaf_classification.md) | Image Classification | 10974 |\n[orange_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/orange_leaf_disease_classification.md) | Image Classification | 5813 |\n[papaya_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/papaya_leaf_disease_classification.md) | Image Classification | 2159 |\n[blackgram_plant_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/blackgram_plant_leaf_disease_classification.md) | Image Classification | 1007 |\n[arabica_coffee_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/arabica_coffee_leaf_disease_classification.md) | Image Classification | 58549 |\n[banana_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/banana_leaf_disease_classification.md) | Image Classification | 1288 |\n[coconut_tree_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/coconut_tree_disease_classification.md) | Image Classification | 5798 |\n[rice_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/rice_leaf_disease_classification.md) | Image Classification | 3829 |\n[tea_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/tea_leaf_disease_classification.md) | Image Classification | 
5867 |\n[betel_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/betel_leaf_disease_classification.md) | Image Classification | 3589 |\n[java_plum_leaf_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/java_plum_leaf_disease_classification.md) | Image Classification | 2400 |\n[sunflower_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/sunflower_disease_classification.md) | Image Classification | 2358 |\n[cucumber_disease_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/cucumber_disease_classification.md) | Image Classification | 7689 |\n[iNatAg](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/iNatAg.md) | Image Classification | 4720903 |\n[iNatAg-mini](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/iNatAg-mini.md) | Image Classification | 560844 |\n[soybean_insect_classification](https://github.com/Project-AgML/AgML/blob/main/docs/datasets/soybean_insect_classification.md) | Image Classification | 6410 |\n\n## iNatAg and iNatAg-mini\n\n\nAgML provides an API with direct access to iNatAg (and iNatAg-mini), one of the world's largest collections of agricultural images dedicated to the task of image classification. Collectively, this dataset contains over 4 million images along with detailed species classifications and enables access to a variety of large-scale agricultural machine learning tasks. 
You can instantiate the iNatAg (or iNatAg-mini, a smaller variant of iNatAg for smaller-scale applications) dataset as follows:\n\n```python\n# To select a collection of scientific family names.\nloader = agml.data.AgMLDataLoader.from_parent(\"iNatAg\", filters={\"family_name\": [\"...\", \"...\"]})\n\n# To select common names.\nloader = agml.data.AgMLDataLoader.from_parent(\"iNatAg\", filters={\"common_name\": \"...\"})\n```\n\n\n## Usage Information\n\n### Using Public Agricultural Data\n\nAgML aims to provide easy access to a range of existing public agricultural datasets. The core of AgML's public data pipeline is\n[`AgMLDataLoader`](/agml/data/loader.py). You can use the `AgMLDataLoader` or `agml.data.download_public_dataset()` to download\na dataset locally, after which it will be loaded automatically from disk on future runs.\nFrom this point, the data within the loader can be split into train/val/test sets, batched, have augmentations and transforms\napplied, and be converted into a training-ready dataset (including batching, tensor conversion, and image formatting).\n\nTo see the various ways in which you can use AgML datasets in your training pipelines, check out\nthe [example notebook](/examples/AgML-Data.ipynb).\n\n## Annotation Formats\n\nA core aim of AgML is to provide datasets in a standardized format, so that multiple datasets can be combined\ninto a single training pipeline. To this end, we provide annotations in the following formats:\n\n- **Image Classification**: Image-To-Label-Number\n- **Object Detection**: [COCO JSON](https://cocodataset.org/#format-data)\n- **Semantic Segmentation**: Dense Pixel-Wise\n\n## Contributions\n\nWe welcome contributions! 
If you would like to contribute a new feature, fix an issue that you've noticed, or even just mention\na bug or feature that you would like to see implemented, please don't hesitate to use the *Issues* tab to bring it to our attention.\n\nSee the [contributing guidelines](/CONTRIBUTING.md) for more information.\n\n## Funding\nThis project is partly funded by the [National AI Institute for Food Systems](https://aifs.ucdavis.edu).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproject-agml%2Fagml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fproject-agml%2Fagml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproject-agml%2Fagml/lists"}