Implementation of the paper "Improved DeepFake Detection Using Whisper Features"
- Host: GitHub
- URL: https://github.com/piotrkawa/deepfake-whisper-features
- Owner: piotrkawa
- License: mit
- Created: 2023-05-26T06:53:56.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-28T07:38:48.000Z (7 months ago)
- Last Synced: 2024-07-30T21:03:15.804Z (3 months ago)
- Topics: audio-deepfake-detection, deep-learning, deepfake-detection, paper-implementations, whisper
- Language: Python
- Homepage:
- Size: 32.2 KB
- Stars: 79
- Watchers: 2
- Forks: 5
- Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Improved DeepFake Detection Using Whisper Features
This repository contains the code for our paper "Improved DeepFake Detection Using Whisper Features".
The paper is available [here](https://www.isca-speech.org/archive/interspeech_2023/kawa23b_interspeech.html).
## Before you start
### Whisper
To download the Whisper encoder used in training, run `download_whisper.py` (a sketch of the idea follows below).
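For intuition, here is a minimal sketch of what such a download step might look like. This is not the repository's actual script; the model size and output path are assumptions, and it assumes the `openai-whisper` package is installed (see Dependencies below).

```python
# Hedged sketch of a Whisper-encoder download step -- NOT the repo's actual
# download_whisper.py; the model size and output path are assumptions.
import torch
import whisper  # openai-whisper, see Dependencies below

# load_model downloads and caches the checkpoint on first use
model = whisper.load_model("tiny.en")
# keep only the audio encoder, which is what the detection models consume
torch.save(model.encoder.state_dict(), "whisper_tiny_encoder.pt")
```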
### Datasets
Download the appropriate datasets (a sketch for fetching the keys file follows the list):
* [ASVspoof2021 DF subset](https://zenodo.org/record/4835108) (**Please note:** we use this [keys&metadata file](https://www.asvspoof.org/resources/DF-keys-stage-1.tar.gz), directory structure is explained [here](https://github.com/piotrkawa/deepfake-whisper-features/issues/7#issuecomment-1830109945)),
* [In-The-Wild dataset](https://deepfake-demo.aisec.fraunhofer.de/in_the_wild).
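As a convenience, here is a hedged sketch for fetching and unpacking the keys&metadata file. The target directory is an assumption, chosen to match the `--asv_path` used in the examples below.

```bash
# Hedged sketch: fetch and unpack the ASVspoof2021 DF keys & metadata
# (the target directory is an assumption matching --asv_path below).
mkdir -p ../datasets/deep_fakes/ASVspoof2021/DF
wget https://www.asvspoof.org/resources/DF-keys-stage-1.tar.gz
tar -xzf DF-keys-stage-1.tar.gz -C ../datasets/deep_fakes/ASVspoof2021/DF
```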
### Dependencies
Install the required dependencies by running the script below (we assume conda is used and the target environment is active):
```bash
bash install.sh
```
List of requirements:
```
python=3.8
pytorch==1.11.0
torchaudio==0.11
asteroid-filterbanks==0.4.0
librosa==0.9.2
openai whisper (git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27)
```
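For reference, a hedged sketch of what `install.sh` might run, assuming an active conda environment with Python 3.8; check the actual script before relying on this.

```bash
# Hedged sketch of install.sh -- check the actual script in the repo.
conda install -y pytorch==1.11.0 torchaudio==0.11.0 -c pytorch
pip install asteroid-filterbanks==0.4.0 librosa==0.9.2
pip install git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27
```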
### Supported models
The repository supports the following models; the name listed after each model is the one used to select it:
* SpecRNet - `specrnet`,
* (Whisper) SpecRNet - `whisper_specrnet`,
* (Whisper + LFCC/MFCC) SpecRNet - `whisper_frontend_specrnet`,
* LCNN - `lcnn`,
* (Whisper) LCNN - `whisper_lcnn`,
* (Whisper + LFCC/MFCC) LCNN - `whisper_frontend_lcnn`,
* MesoNet - `mesonet`,
* (Whisper) MesoNet - `whisper_mesonet`,
* (Whisper + LFCC/MFCC) MesoNet - `whisper_frontend_mesonet`,
* RawNet3 - `rawnet3`.

To select the appropriate front-end, specify it in the config file, as in the sketch below.
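For illustration, a hedged config fragment selecting the MFCC front-end for a `whisper_frontend_*` model; the key names follow the sample config in the Configs section, and the values are illustrative.

```yaml
# Hedged fragment: choosing the MFCC front-end (key names as in the sample
# config below; values are illustrative).
model:
  name: "whisper_frontend_specrnet"
  parameters:
    frontend_algorithm: ["mfcc"]
```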
### Pretrained models
All models reported in the paper are available [here](https://drive.google.com/drive/folders/1YWMC64MW4HjGUX1fnBaMkMIGgAJde9Ch?usp=sharing).
### Configs
Both the training and evaluation scripts are configured via CLI arguments and `.yaml` configuration files,
e.g.:
```yaml
data:
  seed: 42

checkpoint:
  path: "trained_models/lcnn/ckpt.pth"

model:
  name: "lcnn"
  parameters:
    input_channels: 1
    frontend_algorithm: ["lfcc"]
  optimizer:
    lr: 0.0001
    weight_decay: 0.0001
```
Other example configs are available under `configs/training/`.
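A minimal sketch of reading such a config from Python, assuming PyYAML is available; the repository's scripts handle this themselves, so this is only to show the structure.

```python
# Minimal sketch of reading a config (assumes PyYAML; the repo's scripts
# do this for you -- this is only to show the structure).
import yaml

with open("configs/training/whisper_specrnet.yaml") as f:  # path from the repo layout
    config = yaml.safe_load(f)

print(config["model"]["name"])        # e.g. "whisper_specrnet"
print(config["model"]["parameters"])  # front-end and architecture settings
```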
## Full train and test pipeline
To run the full training-and-testing pipeline, use the `train_and_test.py` script.
```
usage: train_and_test.py [-h] [--asv_path ASV_PATH] [--in_the_wild_path IN_THE_WILD_PATH] [--config CONFIG] [--train_amount TRAIN_AMOUNT] [--valid_amount VALID_AMOUNT] [--test_amount TEST_AMOUNT] [--batch_size BATCH_SIZE] [--epochs EPOCHS] [--ckpt CKPT] [--cpu]

Arguments:
--asv_path Path to the ASVspoof2021 DF root dir
--in_the_wild_path Path to the In-The-Wild root dir
--config Path to the config file
--train_amount Number of samples to train on (default: 100000)
--valid_amount Number of samples to validate on (default: 25000)
--test_amount Number of samples to test on (default: None - all)
--batch_size Batch size (default: 8)
--epochs Number of epochs (default: 10)
--ckpt Path to saved models (default: 'trained_models')
--cpu Force using CPU
```
e.g.:
```bash
python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/training/whisper_specrnet.yaml --batch_size 8 --epochs 10 --train_amount 100000 --valid_amount 25000
```
## Finetune and test pipeline
To perform finetuning as presented in the paper, use the `train_and_test.py` script.
e.g.:
```bash
python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/finetuning/whisper_specrnet.yaml --batch_size 8 --epochs 5 --train_amount 100000 --valid_amount 25000
```
Remember to decrease the learning rate when finetuning; a sketch follows.
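For instance, a hypothetical optimizer section with a reduced learning rate; the exact value is an assumption, so see `configs/finetuning/` for the settings actually used.

```yaml
# Hypothetical finetuning optimizer section -- the exact lr is an assumption;
# see configs/finetuning/ for the settings actually used.
model:
  optimizer:
    lr: 0.00001        # an order of magnitude below the training lr above
    weight_decay: 0.0001
```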
## Other scripts
For separate training and evaluation, refer to `train_models.py` and `evaluate_models.py`, respectively.
## Acknowledgments
We base our codebase on [Attack Agnostic Dataset repo](https://github.com/piotrkawa/attack-agnostic-dataset).
Apart from the dependencies mentioned in the Attack Agnostic Dataset repository, we also include:
* [RawNet3 implementation](https://github.com/Jungjee/RawNet).

## Citation
If you use this code in your research, please use the following citation:
```bibtex
@inproceedings{kawa23b_interspeech,
author={Piotr Kawa and Marcin Plata and Michał Czuba and Piotr Szymański and Piotr Syga},
title={{Improved DeepFake Detection Using Whisper Features}},
year=2023,
booktitle={Proc. INTERSPEECH 2023},
pages={4009--4013},
doi={10.21437/Interspeech.2023-1537}
}
```