https://github.com/reddiedev/197z-kws
zero-shot keyword spotting with KWS test dataset using ImageBind
- Host: GitHub
- URL: https://github.com/reddiedev/197z-kws
- Owner: reddiedev
- License: MIT
- Created: 2023-05-31T01:35:24.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-12T15:21:55.000Z (over 2 years ago)
- Last Synced: 2025-05-15T18:13:25.189Z (9 months ago)
- Topics: imagebind, kws, pytorch-audio, speech-commands, zero-shot
- Language: Jupyter Notebook
- Homepage:
- Size: 2.72 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# EEE 197z Project 2 - Zero-Shot KWS using ImageBind
An algorithm using ImageBind that classifies the KWS test dataset (Google Speech Commands v2, 35 keywords) in a zero-shot manner.
*Author: Sean Red Mendoza | 2020-01751 | scmendoza5@up.edu.ph*
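In outline, the zero-shot classifier embeds each of the 35 keywords as text and each utterance as audio in ImageBind's joint embedding space, then predicts the keyword whose embedding scores highest against the audio. The sketch below follows ImageBind's published example usage; the import paths and the sample file path are assumptions (they depend on how ImageBind is installed) and this is not the notebook's actual code.

```python
import torch

# Import paths assume ImageBind is installed as a package; a plain repository
# checkout may instead expose `data` and `models` at the top level.
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

# The 35 Speech Commands v2 keywords serve as the zero-shot "classes".
KEYWORDS = [
    "backward", "bed", "bird", "cat", "dog", "down", "eight", "five", "follow",
    "forward", "four", "go", "happy", "house", "learn", "left", "marvin", "nine",
    "no", "off", "on", "one", "right", "seven", "sheila", "six", "stop", "three",
    "tree", "two", "up", "visual", "wow", "yes", "zero",
]

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the pretrained ImageBind (huge) checkpoint.
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)


def classify(wav_path: str) -> str:
    """Embed one audio clip and all keyword prompts, return the best-matching keyword."""
    inputs = {
        ModalityType.TEXT: data.load_and_transform_text(KEYWORDS, device),
        ModalityType.AUDIO: data.load_and_transform_audio_data([wav_path], device),
    }
    with torch.no_grad():
        embeddings = model(inputs)
    # Dot-product scores between the audio embedding and every keyword
    # embedding, as in ImageBind's own cross-modal retrieval example.
    scores = embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T
    return KEYWORDS[scores.argmax(dim=-1).item()]


# Illustrative path only; point this at any one-second Speech Commands clip.
print(classify("SpeechCommands/yes/0a7c2a8d_nohash_0.wav"))
```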
## Tools / References
- [ImageBind](https://github.com/facebookresearch/ImageBind)
- [Gradio](https://gradio.app/)
- [roatienza/Deep-Learning-Experiments](https://github.com/roatienza/Deep-Learning-Experiments)
- [Google Cloud G2 GPU VM (2x Nvidia L4)](https://cloud.google.com/blog/products/compute/introducing-g2-vms-with-nvidia-l4-gpus)
## Goals
- [x] randomly pick an audio from the test split and classify it (audio player in UI)
- [x] users can record their own voice for testing (audio recorder in UI, powered by Gradio)
- [x] show summary statistics during evaluation of n samples (number of data points, accuracy); see the sketch after this list
- [x] comparison table of SOTA model scores
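A minimal evaluation loop simply samples n clips from the test split and counts how often the predicted keyword matches the ground-truth label. The sketch below reuses the hypothetical `classify` helper from the snippet above and assumes the test split has been extracted into `<keyword>/<clip>.wav` folders; the layout and function names are illustrative, not taken from the notebook.

```python
import random
from pathlib import Path


def evaluate(test_root: str, n: int = 200, seed: int = 0) -> float:
    """Sample n clips laid out as <test_root>/<keyword>/<clip>.wav
    and report zero-shot accuracy."""
    rng = random.Random(seed)
    clips = [
        (path, path.parent.name)
        for path in Path(test_root).glob("*/*.wav")
        if path.parent.name in KEYWORDS
    ]
    sample = rng.sample(clips, min(n, len(clips)))
    correct = sum(classify(str(path)) == label for path, label in sample)
    accuracy = correct / len(sample)
    print(f"{len(sample)} samples, accuracy = {accuracy:.2%}")
    return accuracy


evaluate("SpeechCommands/testing", n=200)
```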
## Usage
1. Clone this repository into a working directory
```bash
git clone https://github.com/reddiedev/197z-kws
cd 197z-kws
```
2. Prepare the environment for running the notebook
```bash
conda create --name kws
conda activate kws
sudo apt install ffmpeg
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install jupyter jupyterlab ipywidgets==7.6.5 numpy ipython gradio ipywebrtc notebook
jupyter labextension install jupyter-webrtc
```
3. Run the `demo.ipynb` Jupyter notebook
4. View SOTA models comparison in `comparison.md`
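For the voice-recording goal listed above, the demo's recording flow can be reproduced with a small Gradio interface like the sketch below. It reuses the hypothetical `classify` helper from the earlier snippet and uses Gradio 3.x syntax (`source="microphone"`); Gradio 4 renames this to `sources=["microphone"]`. It is an illustrative sketch, not the notebook's actual UI code.

```python
import gradio as gr

# Microphone recordings are written to a temporary file and the file path is
# passed to the classifier, which returns the predicted keyword as text.
demo = gr.Interface(
    fn=classify,
    inputs=gr.Audio(source="microphone", type="filepath", label="Say a keyword"),
    outputs=gr.Textbox(label="Predicted keyword"),
    title="Zero-shot KWS with ImageBind",
)
demo.launch()
```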
## Acknowledgements
[Professor Rowel Atienza](https://github.com/roatienza)