{"id":13738260,"url":"https://github.com/plantnet/PlantNet-300K","last_synced_at":"2025-05-08T16:33:01.081Z","repository":{"id":51446885,"uuid":"364269445","full_name":"plantnet/PlantNet-300K","owner":"plantnet","description":"[NeurIPS2021] A plant image dataset with high label ambiguity and a long-tailed distribution","archived":false,"fork":false,"pushed_at":"2024-08-01T08:38:37.000Z","size":676,"stargazers_count":198,"open_issues_count":8,"forks_count":33,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-09T14:13:22.832Z","etag":null,"topics":["dataset","deep-learning","plants","pytorch"],"latest_commit_sha":null,"homepage":"https://doi.org/10.5281/zenodo.5645731","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/plantnet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-04T13:42:08.000Z","updated_at":"2025-03-28T17:24:32.000Z","dependencies_parsed_at":"2024-11-07T18:35:23.700Z","dependency_job_id":"6e8c1704-3fc2-4710-915d-8e37daf0f40a","html_url":"https://github.com/plantnet/PlantNet-300K","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plantnet%2FPlantNet-300K","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plantnet%2FPlantNet-300K/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plantnet%2FPlantNet-300K/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plan
tnet%2FPlantNet-300K/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/plantnet","download_url":"https://codeload.github.com/plantnet/PlantNet-300K/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253105412,"owners_count":21855019,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","deep-learning","plants","pytorch"],"created_at":"2024-08-03T03:02:16.306Z","updated_at":"2025-05-08T16:33:00.755Z","avatar_url":"https://github.com/plantnet.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# PlantNet-300K\n\n\u003cp align=\"middle\"\u003e\n  \u003cimg src=\"/images/1.jpg\" width=\"180\" hspace=\"2\"/\u003e\n  \u003cimg src=\"/images/2.jpg\" width=\"180\" hspace=\"2\"/\u003e\n  \u003cimg src=\"/images/3.jpg\" width=\"180\" hspace=\"2\"/\u003e\n  \u003cimg src=\"/images/4.jpg\" width=\"180\" hspace=\"2\"/\u003e\n\u003c/p\u003e\n\nThis repository contains the code used to produce the benchmark in the paper ***\"Pl@ntNet-300K: a plant image dataset with high label\nambiguity and a long-tailed distribution\"***.\n\n## Download the dataset\n\nIn order to train a model on the PlantNet-300K dataset, you first have to [download the dataset on Zenodo](https://zenodo.org/record/5645731#.Yuehg3ZBxPY).\n\n## Scientific Publication\n\nYou can find detailed information about the dataset as well as extensive experiments in the [NeurIPS 2021 
paper](https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/file/7e7757b1e12abcb736ab9a754ffb617a-Paper-round2.pdf).\nIf you use this work for your research, please cite the paper:\n\n    @inproceedings{plantnet-300k,\n    author    = {Garcin, Camille and Joly, Alexis and Bonnet, Pierre and Lombardo, Jean-Christophe and Affouard, Antoine and Chouet, Mathias and Servajean, Maximilien and Lorieul, Titouan and Salmon, Joseph},\n    booktitle = {NeurIPS Datasets and Benchmarks 2021},\n    title     = {{Pl@ntNet-300K}: a plant image dataset with high label ambiguity and a long-tailed distribution},\n    year      = {2021},\n    }\n\n## Overview\n\nPl@ntNet-300K is a plant dataset containing 306,146 plant images covering 1081 species (the classes).\nPl@ntNet-300K is characterized by high class ambiguity and strong class imbalance.\nThe graph below highlights the long-tailed distribution of the dataset: 80% of species account for only 11% of the total number of images.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"images/lorentz_curve.jpg\" width=\"50%\"\u003e\n\u003c/p\u003e\n\nThe images are split into a train, val and test set, each containing the following number of images:\n\n\u003cdiv align=\"center\"\u003e\n\n| Train | Val | Test | Total |\n|-----------------|-----------------|-----------------|-----------------|\n| 243,916         | 31,118          | 31,112          | 306,146 |\n\n\u003c/div\u003e\n\n### Dataset Version \u0026 Meta-data files\n\nMake sure you download the latest version of the dataset on Zenodo (version 1.1 as in the link above, not 1.0).\nThe difference lies in the metadata files; the images are the same.\nIf you wish to download **ONLY** the metadata files (not possible on Zenodo), you will find them [here](https://lab.plantnet.org/seafile/d/bed81bc15e8944969cf6/).\nThe folder contains three files:\n\n- `plantnet300K_metadata.json` maps the id of each image to several pieces of information (species id, split, author, 
license, ...) \n- `plantnet300K_species_id_2_name.json` maps each species id to its scientific name\n- `class_idx_to_species_id.json` maps the class id (from 0 to 1080) to the species id (useful for the pretrained weights)\n\n### Hyperparameters\n\nIf you are looking for the hyperparameters used in the paper, you can find them in the [supplementary material](https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/7e7757b1e12abcb736ab9a754ffb617a-Abstract-round2.html).\n\n\n### Pre-trained models\n\nYou can find the pre-trained models [here](https://lab.plantnet.org/seafile/d/01ab6658dad6447c95ae/).\nTo load the pre-trained models, you can simply use the `load_model` function in `utils.py`. For instance, if you want to load the resnet18 weights:\n\n```python\nfrom utils import load_model\nfrom torchvision.models import resnet18\n\nfilename = 'resnet18_weights_best_acc.tar' # pre-trained model path\nuse_gpu = True  # load weights on the gpu\nmodel = resnet18(num_classes=1081) # 1081 classes in Pl@ntNet-300K\n\nload_model(model, filename=filename, use_gpu=use_gpu)\n```\n\nNote that if you want to fine-tune the model on another dataset, you have to change the last layer. You can find examples in the `get_model` function in `utils.py`.\n\n### Requirements\n\nOnly PyTorch and torchvision are necessary for the code to run.\nIf you have installed Anaconda, you can run the following command:\n\n```conda env create -f plantnet_300k_env.yml```\n\n### Training a model\n\nIn order to train a model on the PlantNet-300K dataset, run the following command:\n\n```python main.py --lr=0.01 --batch_size=32 --mu=0.0001 --n_epochs=30 --epoch_decay 20 25 --k 1 3 5 10 --model=resnet18 --pretrained --seed=4 --image_size=256 --crop_size=224 --root=path_to_data --save_name_xp=xp1```\n\n You must provide in the `root` option the path to the train, val and test folders (here: `path_to_data`). 
\n The `save_name_xp` option is the name of the directory where the weights of the model and the results (metrics) will be stored.\n You can check out the different options in the file `cli.py`.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplantnet%2FPlantNet-300K","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplantnet%2FPlantNet-300K","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplantnet%2FPlantNet-300K/lists"}