https://github.com/krantiparida/AudioSetZSL

Dataset for Audio-Visual ZSL

Host: GitHub
URL: https://github.com/krantiparida/AudioSetZSL
Owner: krantiparida
Created: 2019-09-23T19:30:08.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2023-07-12T07:03:50.000Z (almost 2 years ago)
Last Synced: 2024-11-08T18:46:12.857Z (6 months ago)
Language: Python
Homepage: https://www.cse.iitk.ac.in/users/kranti/avzsl.html
Size: 3.04 MB
Stars: 7
Watchers: 2
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-audio-visual - AudioSetZSL - Audio-Visual Zero-shot Learning (Datasets)

README

# AudioSetZSL

This repository contains the audio-visual dataset proposed for the task of multi-modal zero-shot learning.

The dataset is curated from a large dataset, [AudioSet](https://research.google.com/audioset/). 

While the original dataset is multi-label, the example videos were selected so that every video in AudioSetZSL has exactly one label, i.e. it is a multi-class dataset. For more details on the creation of the dataset, refer to our [paper](http://openaccess.thecvf.com/content_WACV_2020/papers/Parida_Coordinated_Joint_Multimodal_Embeddings_for_Generalized_Audio-Visual_Zero-shot_Classification_and_WACV_2020_paper.pdf).

Here, we provide the YouTube IDs for each class in the folder `youtube-id`.
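Since only video IDs are distributed, each one has to be turned into a link before the video can be looked up. A minimal sketch of that step follows, using the standard YouTube watch URL pattern. The function name `youtube_watch_url` and the IDs shown are hypothetical, and the layout of the files inside `youtube-id` is not assumed here, so reading the IDs from disk is left out.

```python
from urllib.parse import urlencode


def youtube_watch_url(video_id: str) -> str:
    """Build a standard YouTube watch URL from a video ID."""
    # Strip surrounding whitespace in case the ID came from a text file line.
    return "https://www.youtube.com/watch?" + urlencode({"v": video_id.strip()})


# Placeholder IDs for illustration only; real IDs come from the `youtube-id` folder.
example_ids = ["aaaaaaaaaaa", "bbbbbbbbbbb"]
urls = [youtube_watch_url(video_id) for video_id in example_ids]
```

Some of the listed videos may no longer be available on YouTube, so any download script built on this should tolerate missing videos.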

The dataset is split in two ways so that it can be used for both classification and zero-shot learning (ZSL).

The examples for each class have been divided into three subsets, namely train, test and val.

Similarly, for the task of ZSL, the classes in the dataset are divided into seen and unseen.
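The two splits are independent: one partitions the examples within a class, the other partitions the classes themselves. The sketch below shows how they might be combined under the usual generalized ZSL convention, which is an assumption here and not taken from this repository: train on the train examples of seen classes, then test on the test examples of both seen and unseen classes. The `Example` record, the `select` helper, the class names and the IDs are all hypothetical.

```python
from typing import List, NamedTuple, Set


class Example(NamedTuple):
    video_id: str
    label: str
    subset: str  # one of "train", "val" or "test"


def select(examples: List[Example], subset: str, classes: Set[str]) -> List[Example]:
    """Keep the examples that belong to the given subset and to one of the given classes."""
    return [e for e in examples if e.subset == subset and e.label in classes]


# Hypothetical class split and examples, for illustration only.
seen = {"dog", "car"}
unseen = {"train_horn"}
examples = [
    Example("id_a", "dog", "train"),
    Example("id_b", "dog", "test"),
    Example("id_c", "car", "train"),
    Example("id_d", "train_horn", "train"),
    Example("id_e", "train_horn", "test"),
]

# Seen and unseen classes must never overlap.
assert seen.isdisjoint(unseen)

# Training uses seen classes only; the unseen-class example "id_d" is excluded.
zsl_train = select(examples, "train", seen)

# Generalized ZSL testing covers both seen and unseen classes.
gzsl_test = select(examples, "test", seen | unseen)
```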

We also provide pre-trained features for both audio and video. The features are extracted such that they can be used for the task of ZSL: none of the unseen classes overlap with the classes used to pre-train the networks (refer to our [paper](http://openaccess.thecvf.com/content_WACV_2020/papers/Parida_Coordinated_Joint_Multimodal_Embeddings_for_Generalized_Audio-Visual_Zero-shot_Classification_and_WACV_2020_paper.pdf) for the detailed process of the dataset split).

To download the pre-trained features, follow this link: [Download](https://drive.google.com/drive/folders/1UNTOyfbqtsrwr1wHYJMln1r8JlL-cl9w?usp=sharing)

## Contact

Kindly contact [email protected] for any issues or comments.

## Disclaimer

1. The dataset collection was done at IIT Kanpur. 

2. The dataset is intended to be used for academic research only. 

3. The links are YouTube links and the user is responsible for compliance with YouTube's terms and conditions. 

4. The videos are the property of the respective YouTube uploaders. If any video belongs to you and you would like to have it removed, kindly let us know and we will remove it from the dataset.
