{"id":13948294,"url":"https://github.com/WildlifeDatasets/wildlife-datasets","last_synced_at":"2025-07-20T10:30:34.565Z","repository":{"id":65929694,"uuid":"546057103","full_name":"WildlifeDatasets/wildlife-datasets","owner":"WildlifeDatasets","description":"WildlifeDatasets: An open-source toolkit for animal re-identification","archived":false,"fork":false,"pushed_at":"2024-10-31T09:07:24.000Z","size":228845,"stargazers_count":71,"open_issues_count":0,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-23T14:02:13.779Z","etag":null,"topics":["dataset","datasets","deep-learning","ecology","ecology-modelling","machine-learning"],"latest_commit_sha":null,"homepage":"https://wildlifedatasets.github.io/wildlife-datasets/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WildlifeDatasets.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-05T12:57:28.000Z","updated_at":"2024-11-21T17:39:58.000Z","dependencies_parsed_at":null,"dependency_job_id":"b889e0ea-bec8-454e-81b1-758c2bea817e","html_url":"https://github.com/WildlifeDatasets/wildlife-datasets","commit_stats":{"total_commits":678,"total_committers":4,"mean_commits":169.5,"dds":0.5029498525073746,"last_synced_commit":"ca235ce655fcccdebbc896c8f3a839f7cb00e4d4"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WildlifeDatasets%2Fwildlife-datasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/WildlifeDatasets%2Fwildlife-datasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WildlifeDatasets%2Fwildlife-datasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WildlifeDatasets%2Fwildlife-datasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WildlifeDatasets","download_url":"https://codeload.github.com/WildlifeDatasets/wildlife-datasets/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226177487,"owners_count":17585910,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","datasets","deep-learning","ecology","ecology-modelling","machine-learning"],"created_at":"2024-08-08T05:01:19.930Z","updated_at":"2025-07-20T10:30:34.550Z","avatar_url":"https://github.com/WildlifeDatasets.png","language":"Jupyter Notebook","funding_links":[],"categories":["Biosphere"],"sub_categories":["Terrestrial Wildlife"],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/issues\"\u003e\u003cimg src=\"https://img.shields.io/github/issues/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub issues\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/pulls\"\u003e\u003cimg src=\"https://img.shields.io/github/issues-pr/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub pull requests\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/graphs/contributors\"\u003e\u003cimg 
src=\"https://img.shields.io/github/contributors/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub contributors\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/network/members\"\u003e\u003cimg src=\"https://img.shields.io/github/forks/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub forks\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/stargazers\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub stars\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/watchers\"\u003e\u003cimg src=\"https://img.shields.io/github/watchers/WildlifeDatasets/wildlife-datasets\" alt=\"GitHub watchers\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/WildlifeDatasets/wildlife-datasets\" alt=\"License\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/datasets-logo.png\" alt=\"Wildlife datasets\" width=\"300\"\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cp align=\"center\"\u003ePipeline for wildlife re-identification including dataset zoo, training tools and trained models. 
Usage includes classifying new images in labelled databases and clustering individuals in unlabelled databases.\u003c/p\u003e\n  \u003ca href=\"https://wildlifedatasets.github.io/wildlife-datasets/\"\u003eDocumentation\u003c/a\u003e\n  ·\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/issues/new?assignees=aerodynamic-sauce-pan\u0026labels=bug\u0026projects=\u0026template=bug_report.md\u0026title=%5BBUG%5D\"\u003eReport Bug\u003c/a\u003e\n  ·\n  \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-datasets/issues/new?assignees=aerodynamic-sauce-pan\u0026labels=enhancement\u0026projects=\u0026template=enhancement.md\u0026title=%5BEnhancement%5D\"\u003eRequest Feature\u003c/a\u003e\n  ·\n  \u003ca href=\"mailto:wilddatasets@gmail.com\"\u003e:mailbox_with_mail:Email\u003c/a\u003e\n\u003c/div\u003e\n\n\u003c/br\u003e\n\n| \u003ca href=\"https://www.kaggle.com/datasets/wildlifedatasets/wildlifereid-10k\"\u003e\u003cimg src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/wildlifereID10k-logo.png\" alt=\"WildlifeReID-10k\" width=\"200\"\u003e\u003c/a\u003e  | \u003ca href=\"https://huggingface.co/BVRA/MegaDescriptor-L-384\"\u003e\u003cimg src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/megadescriptor-logo.png\" alt=\"MegaDescriptor\" width=\"200\"\u003e\u003c/a\u003e | \u003ca href=\"https://github.com/WildlifeDatasets/wildlife-tools\"\u003e\u003cimg src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/tools-logo.png\" alt=\"Wildlife tools\" width=\"200\"\u003e\u003c/a\u003e |\n|:--------------:|:-----------:|:------------:|\n| Dataset for identification of individual animals | Trained model for individual re\u0026#x2011;identification  | Tools for training re\u0026#x2011;identification models |\n\n\u003c/br\u003e\n\n## Wildlife Re-Identification (Re-ID) Datasets\n\nThe aim of the project is to provide a comprehensive overview of datasets for 
wildlife individual re-identification and an easy-to-use package for developers of machine learning methods. The core functionality includes:\n\n- overview of 44 publicly available wildlife re-identification datasets.\n- utilities to mass download them, convert them into a unified format and fix some wrong labels.\n- default splits for several machine learning tasks including the ability to create additional splits.\n\nAn introductory example is provided in a [Jupyter notebook](notebooks/introduction.ipynb). The package provides a natural synergy with [Wildlife tools](https://github.com/WildlifeDatasets/wildlife-tools), which provides our [MegaDescriptor](https://huggingface.co/BVRA/MegaDescriptor-L-384) model and tools for training neural networks. \n\nDo you know about any animal re-identification dataset which is not included? Please post it to the [discussion forum](https://github.com/WildlifeDatasets/wildlife-datasets/discussions/6).\n\n## Changelog\n\n[14/04/2025] Added AnimalCLEF2025, WildlifeReID-10k (unifications of multiple datasets), MultiCamCows2024 (cows) and PrimFace (primates).  \n[31/10/2024] Added AmvrakikosTurtles, ReunionTurtles, SouthernProvinceTurtles, ZakynthosTurtles (sea turtles), ELPephants (elephants) and Chicks4FreeID (chickens).  \n[09/05/2024] Added CatIndividualImages (cats), CowDataset (cows) and DogFaceNet (dogs).  \n[28/02/2024] Added MPDD (dogs), PolarBearVidID (polar bears) and SeaStarReID2023 (sea stars).  \n[04/01/2024] Received **Best paper award** at WACV 2024.\n\n## Summary of datasets\n\nAn overview of the provided datasets is available in the [documentation](https://wildlifedatasets.github.io/wildlife-datasets/datasets/), while a more numerical summary is located in a [Jupyter notebook](notebooks/dataset_descriptions.ipynb). 
Due to its size, it may be necessary to view it via [nbviewer](https://nbviewer.org/github/WildlifeDatasets/wildlife-datasets/blob/main/notebooks/dataset_descriptions.ipynb).\n\nWe include basic characteristics such as publication years, number of images, number of individuals, dataset time span (difference between the last and first image taken) and additional information such as source, number of poses, inclusion of timestamps, whether the animals were captured in the wild and whether the dataset contains multiple species.\n\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/Datasets_Summary_inverted.png\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/Datasets_Summary.png\"\u003e\n  \u003cimg alt=\"Dataset summary\" src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/Datasets_Summary.png\"\u003e\n\u003c/picture\u003e\n\n\n## Installation\n\nThe package can be installed with pip:\n```\npip install wildlife-datasets\n```\n\n\n## Basic functionality\n\nWe show an example of downloading, extracting and processing the MacaqueFaces dataset.\n\n```\nfrom wildlife_datasets import analysis, datasets\n\ndatasets.MacaqueFaces.get_data('data/MacaqueFaces')\ndataset = datasets.MacaqueFaces('data/MacaqueFaces')\n```\n\nThe object `dataset` contains the summary of the dataset. The content depends on the dataset. Each dataset contains the identity and paths to images. This particular dataset also contains information about the date taken and contrast. 
Other datasets store information about bounding boxes, segmentation masks, position from which the image was taken, keypoints or various other information such as age or gender.\n\n```\ndataset.df\n```\n\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_DataFrame_inverted.png\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_DataFrame.png\"\u003e\n  \u003cimg alt=\"Overview of the MacaqueFaces dataset\" src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_DataFrame.png\"\u003e\n\u003c/picture\u003e\n\nThe dataset also contains basic metadata including information about the number of individuals, time span, licences or published year.\n\n```\ndataset.summary\n```\n\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_Metadata_inverted.png\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_Metadata.png\"\u003e\n  \u003cimg alt=\"Metadata of the MacaqueFaces dataset\" src=\"https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_Metadata.png\"\u003e\n\u003c/picture\u003e\n\nThis particular dataset already contains cropped images of faces. 
Other datasets may contain uncropped images with bounding boxes or even segmentation masks.\n\n```\ndataset.plot_grid()\n```\n\n![](https://github.com/WildlifeDatasets/wildlife-datasets/raw/main/docs/resources/MacaqueFaces_Grid.png)\n\n## Additional functionality\n\nFor additional functionality including mass loading, dataset splitting or evaluation metrics we refer to the [documentation](https://wildlifedatasets.github.io/wildlife-datasets/) or the [notebooks](https://github.com/WildlifeDatasets/wildlife-datasets/tree/main/notebooks).\n\n## Citation\n\nIf you like our package, please cite our [paper](https://openaccess.thecvf.com/content/WACV2024/html/Cermak_WildlifeDatasets_An_Open-Source_Toolkit_for_Animal_Re-Identification_WACV_2024_paper.html). You may also be interested in our [SeaTurtleID2022](https://www.kaggle.com/datasets/wildlifedatasets/seaturtleid2022) dataset published in another [paper](https://openaccess.thecvf.com/content/WACV2024/html/Adam_SeaTurtleID2022_A_Long-Span_Dataset_for_Reliable_Sea_Turtle_Re-Identification_WACV_2024_paper.html).\n\n```\n@InProceedings{Cermak_2024_WACV,\n    author    = {\\v{C}erm\\'ak, Vojt\\v{e}ch and Picek, Luk\\'a\\v{s} and Adam, Luk\\'a\\v{s} and Papafitsoros, Kostas},\n    title     = {{WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification}},\n    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},\n    month     = {January},\n    year      = {2024},\n    pages     = {5953-5963}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWildlifeDatasets%2Fwildlife-datasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FWildlifeDatasets%2Fwildlife-datasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FWildlifeDatasets%2Fwildlife-datasets/lists"}