https://github.com/reddiedev/197z-kws
zero-shot keyword spotting with KWS test dataset using ImageBind
- Host: GitHub
- URL: https://github.com/reddiedev/197z-kws
- Owner: reddiedev
- License: MIT
- Created: 2023-05-31T01:35:24.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-06-12T15:21:55.000Z (over 2 years ago)
- Last Synced: 2025-05-15T18:13:25.189Z (9 months ago)
- Topics: imagebind, kws, pytorch-audio, speech-commands, zero-shot
- Language: Jupyter Notebook
- Homepage:
- Size: 2.72 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# EEE 197z Project 2 - Zero-Shot KWS using ImageBind
An algorithm using ImageBind that classifies the KWS test dataset (Google Speech Commands v2, 35 keywords) in a zero-shot manner.
*Author: Sean Red Mendoza | 2020-01751 | scmendoza5@up.edu.ph*
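In outline, the zero-shot classifier embeds each of the 35 keywords as text and each utterance as audio in ImageBind's joint embedding space, then predicts the keyword whose embedding scores highest against the audio. The sketch below follows ImageBind's published example usage; the import paths and the sample file path are assumptions (they depend on how ImageBind is installed) and this is not the notebook's actual code.

```python
import torch

# Import paths assume ImageBind is installed as a package; a plain repository
# checkout may instead expose `data` and `models` at the top level.
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

# The 35 Speech Commands v2 keywords serve as the zero-shot "classes".
KEYWORDS = [
    "backward", "bed", "bird", "cat", "dog", "down", "eight", "five", "follow",
    "forward", "four", "go", "happy", "house", "learn", "left", "marvin", "nine",
    "no", "off", "on", "one", "right", "seven", "sheila", "six", "stop", "three",
    "tree", "two", "up", "visual", "wow", "yes", "zero",
]

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the pretrained ImageBind (huge) checkpoint.
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)


def classify(wav_path: str) -> str:
    """Embed one audio clip and all keyword prompts, return the best-matching keyword."""
    inputs = {
        ModalityType.TEXT: data.load_and_transform_text(KEYWORDS, device),
        ModalityType.AUDIO: data.load_and_transform_audio_data([wav_path], device),
    }
    with torch.no_grad():
        embeddings = model(inputs)
    # Dot-product scores between the audio embedding and every keyword
    # embedding, as in ImageBind's own cross-modal retrieval example.
    scores = embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T
    return KEYWORDS[scores.argmax(dim=-1).item()]


# Illustrative path only; point this at any one-second Speech Commands clip.
print(classify("SpeechCommands/yes/0a7c2a8d_nohash_0.wav"))
```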
## Tools / References
- [ImageBind](https://github.com/facebookresearch/ImageBind)
- [Gradio](https://gradio.app/)
- [roatienza/Deep-Learning-Experiments](https://github.com/roatienza/Deep-Learning-Experiments)
- [Google Cloud G2 GPU VM (2x Nvidia L4)](https://cloud.google.com/blog/products/compute/introducing-g2-vms-with-nvidia-l4-gpus)
## Goals
- [x] randomly pick an audio from the test split and classify it (audio player in UI)
- [x] users can record their own voice for testing (audio recorder in UI, powered by Gradio)
- [x] show summary statistics during evaluation of n samples (number of data points, accuracy); see the sketch after this list
- [x] comparison table of SOTA model scores
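A minimal evaluation loop simply samples n clips from the test split and counts how often the predicted keyword matches the ground-truth label. The sketch below reuses the hypothetical `classify` helper from the snippet above and assumes the test split has been extracted into `<keyword>/<clip>.wav` folders; the layout and function names are illustrative, not taken from the notebook.

```python
import random
from pathlib import Path


def evaluate(test_root: str, n: int = 200, seed: int = 0) -> float:
    """Sample n clips laid out as <test_root>/<keyword>/<clip>.wav
    and report zero-shot accuracy."""
    rng = random.Random(seed)
    clips = [
        (path, path.parent.name)
        for path in Path(test_root).glob("*/*.wav")
        if path.parent.name in KEYWORDS
    ]
    sample = rng.sample(clips, min(n, len(clips)))
    correct = sum(classify(str(path)) == label for path, label in sample)
    accuracy = correct / len(sample)
    print(f"{len(sample)} samples, accuracy = {accuracy:.2%}")
    return accuracy


evaluate("SpeechCommands/testing", n=200)
```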
## Usage
1. Clone this repository into a working directory
```bash
git clone https://github.com/reddiedev/197z-kws
cd 197z-kws
```
2. Prepare the environment for running the notebook
```bash
conda create --name kws
conda activate kws
sudo apt install ffmpeg
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install jupyter jupyterlab ipywidgets==7.6.5 numpy ipython gradio ipywebrtc notebook
jupyter labextension install jupyter-webrtc
```
3. Run the `demo.ipynb` Jupyter notebook
4. View SOTA models comparison in `comparison.md`
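For the voice-recording goal listed above, the demo's recording flow can be reproduced with a small Gradio interface like the sketch below. It reuses the hypothetical `classify` helper from the earlier snippet and uses Gradio 3.x syntax (`source="microphone"`); Gradio 4 renames this to `sources=["microphone"]`. It is an illustrative sketch, not the notebook's actual UI code.

```python
import gradio as gr

# Microphone recordings are written to a temporary file and the file path is
# passed to the classifier, which returns the predicted keyword as text.
demo = gr.Interface(
    fn=classify,
    inputs=gr.Audio(source="microphone", type="filepath", label="Say a keyword"),
    outputs=gr.Textbox(label="Predicted keyword"),
    title="Zero-shot KWS with ImageBind",
)
demo.launch()
```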
## Acknowledgements
[Professor Rowel Atienza](https://github.com/roatienza)