https://github.com/krantiparida/AudioSetZSL
Dataset for Audio-Visual ZSL
- Host: GitHub
- URL: https://github.com/krantiparida/AudioSetZSL
- Owner: krantiparida
- Created: 2019-09-23T19:30:08.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2023-07-12T07:03:50.000Z (over 1 year ago)
- Last Synced: 2024-08-01T22:41:47.212Z (4 months ago)
- Language: Python
- Homepage: https://www.cse.iitk.ac.in/users/kranti/avzsl.html
- Size: 3.04 MB
- Stars: 7
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-audio-visual - AudioSetZSL - Audio-Visual Zero-shot Learning (Datasets)
README
# AudioSetZSL
This repository contains the audio-visual dataset proposed for the task of multi-modal zero-shot learning. The dataset is curated from a large dataset, [AudioSet](https://research.google.com/audioset/).
While the original dataset is multi-label, the example videos were selected such that every video in AudioSetZSL has only one label, i.e., it is a multi-class dataset. For more details on the creation of the dataset, refer to our [paper](http://openaccess.thecvf.com/content_WACV_2020/papers/Parida_Coordinated_Joint_Multimodal_Embeddings_for_Generalized_Audio-Visual_Zero-shot_Classification_and_WACV_2020_paper.pdf). Here, we provide the YouTube IDs for each class in the folder `youtube-id`.
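As a minimal sketch, per-class YouTube IDs can be turned into watch-page URLs as shown here. The file layout inside `youtube-id` is not described in this README, so the example works on an in-memory mapping; the class names and IDs are hypothetical placeholders, not entries from the dataset.

```python
def youtube_url(video_id: str) -> str:
    """Return the standard watch-page URL for a YouTube video ID."""
    return f"https://www.youtube.com/watch?v={video_id}"


# Hypothetical placeholder classes and IDs (assumption): the real IDs are
# the ones shipped in the youtube-id folder.
ids_by_class = {
    "class_a": ["VIDEO_ID_01", "VIDEO_ID_02"],
    "class_b": ["VIDEO_ID_03"],
}

# Map every class label to the list of URLs for its videos.
urls_by_class = {
    label: [youtube_url(video_id) for video_id in ids]
    for label, ids in ids_by_class.items()
}
```

Downloading the videos themselves is left to the user and remains subject to YouTube's terms and conditions.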
The dataset is divided in two ways so that it can be used for both classification and zero-shot learning (ZSL). For classification, the examples of each class are divided into three subsets, namely train, test and val.
Similarly, for the task of ZSL, the classes in the dataset are divided into seen and unseen.
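The two partitions are independent of each other: one is over examples (train/val/test) and the other is over classes (seen/unseen). The sketch below shows one plausible way to combine them for a generalized ZSL experiment, training on the train subset of seen classes and testing on the test subset of all classes. The `Example` record, the `select` helper, the class names and the video IDs are all hypothetical; the actual split is the one distributed with the dataset and described in the paper.

```python
from typing import NamedTuple


class Example(NamedTuple):
    video_id: str
    label: str
    subset: str  # one of "train", "val", "test"


def select(examples, classes, subset):
    """Keep the examples whose label is in `classes` and whose subset matches."""
    return [e for e in examples if e.label in classes and e.subset == subset]


# Hypothetical placeholder data (assumption): not taken from the dataset.
examples = [
    Example("vid_a", "dog", "train"),
    Example("vid_b", "dog", "test"),
    Example("vid_c", "siren", "train"),
    Example("vid_d", "siren", "test"),
]
seen_classes = {"dog"}
unseen_classes = {"siren"}

# Training only ever sees examples from seen classes.
zsl_train = select(examples, seen_classes, "train")

# Generalized ZSL evaluation mixes seen and unseen classes at test time.
gzsl_test = select(examples, seen_classes | unseen_classes, "test")
```

Note that the unseen-class training example (`vid_c`) is used in neither list, which is what keeps the unseen classes unseen during training.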
We also provide pre-trained features for both audio and video. The features were obtained in such a way that they can be used for the task of ZSL, as no unseen class overlaps with the classes used to pre-train the networks (refer to our [paper](http://openaccess.thecvf.com/content_WACV_2020/papers/Parida_Coordinated_Joint_Multimodal_Embeddings_for_Generalized_Audio-Visual_Zero-shot_Classification_and_WACV_2020_paper.pdf) for the detailed process of the dataset split).
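If you extract your own features instead of using the provided ones, the same no-overlap condition can be checked with a simple set intersection. This is only an illustrative sketch: the function name and the class names are hypothetical, and it is not the procedure the authors used.

```python
def check_no_overlap(unseen_classes, pretraining_classes):
    """Raise ValueError if any unseen class was used to pre-train the feature network."""
    overlap = set(unseen_classes) & set(pretraining_classes)
    if overlap:
        raise ValueError(f"unseen classes leaked into pre-training: {sorted(overlap)}")


# Hypothetical class names (assumption), chosen only to exercise the check.
check_no_overlap(unseen_classes={"siren"}, pretraining_classes={"dog", "piano"})
```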
To download the pre-trained features, follow this link: [Download](https://drive.google.com/drive/folders/1UNTOyfbqtsrwr1wHYJMln1r8JlL-cl9w?usp=sharing)
## Contact
Kindly contact [email protected] for any issues, comments, etc.
## Disclaimer
1. The dataset collection was done at IIT Kanpur.
2. The dataset is intended to be used for academic research only.
3. The links are YouTube links and the user is responsible for compliance with YouTube's terms and conditions.
4. The videos are the property of their respective YouTube uploaders. If any video belongs to you and you would like to have it removed, kindly let us know and we will remove it from the dataset.