https://github.com/slrwndqhr18/vit-detecting-deepfake-audio


# Detecting deepfake audio using ViT

This project detects deepfake audio using an attention-based (Vision Transformer) model.

Idea source:
- Paper: https://ieeexplore.ieee.org/document/10197715
- Author: Guzin Ulutas

## Model architecture

[Fig 1. ML architecture]
In this project I designed a simple but effective structure. The main logic consists of three things:
- Converting the audio data to CQT images is done with parallel processing.
- The ViT model is combined with LoRA.
- The training process uses AMP (automatic mixed precision).
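The ViT + LoRA combination above can be illustrated with a low-rank adapter wrapped around a frozen linear layer; in the real model this would wrap the attention projections inside the ViT. This is a minimal PyTorch sketch, the class name and the hyperparameters `r` and `alpha` are illustrative assumptions, not taken from the repo.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with only A and B trained."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
        # B starts at zero, so the adapter initially leaves the output unchanged
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)
```

Because `B` is zero-initialized, the adapted layer starts out numerically identical to the frozen base layer, and only the small `A`/`B` matrices receive gradients during fine-tuning.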

------------------------------
## About Code

I used ViT as the base model and added LoRA to improve accuracy.
The important point here is that the audio data must be converted into CQT images; this is done in the preprocessing layer.

|File name|Description|
|------|---|
| /model | All code related to the ML model |
| /model/Component | Components used inside the ML model, such as the attention algorithm or the FN layer |
| handleConfig.py | Loads CONFIG.yaml and sets up all parameters |
| handleDataset.py | Uses the PyTorch DataLoader to reshape the preprocessed data into PyTorch data structures |
| handlePreprocess | Loads the raw dataset and executes preprocessing |
| makeGraph | Plots graphs to inspect the results; nothing important here |
| handleMultiPs | Builds the CQT files using multiprocessing |
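The multiprocessing fan-out that handleMultiPs performs can be sketched with the standard library's process pool. The worker below uses a placeholder FFT transform instead of a real CQT so the sketch stays dependency-free; all names are hypothetical.

```python
import numpy as np
from multiprocessing import Pool

def _to_cqt(args):
    """Worker: convert one clip to a time-frequency 'image'.
    A real version would call librosa.cqt and save the result to disk;
    a magnitude spectrogram stands in here as a placeholder."""
    name, y = args
    frames = y.reshape(-1, 512)                # non-overlapping 512-sample frames
    spec = np.abs(np.fft.rfft(frames, axis=1)) # placeholder for the CQT
    return name, spec.shape

def preprocess_all(clips, workers=4):
    """Fan the per-file conversion out over a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return dict(pool.map(_to_cqt, clips))
```

Because each clip is converted independently, this part of the pipeline parallelizes cleanly and the speedup is roughly linear in the number of workers until disk I/O dominates.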

- main.py is the entry point of the whole process; just run main.py.
- /model/Executor.py is the entry point of the ML model: a class that loads parameters and defines the model's train / test / run procedures.
- To reduce preprocessing time, parallel programming is used here.
- The AMP code reduces training time a lot.
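The AMP training step typically follows PyTorch's autocast + GradScaler pattern, sketched below. This is a generic illustration of the technique, not the repo's actual code, and the function name is mine.

```python
import torch
import torch.nn as nn

def train_step_amp(model, batch, labels, optimizer, scaler, device="cuda"):
    """One mixed-precision training step: forward pass under autocast,
    backward pass through a GradScaler to avoid fp16 gradient underflow."""
    use_cuda = device == "cuda" and torch.cuda.is_available()
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda" if use_cuda else "cpu",
                        enabled=use_cuda):  # autocast is a no-op on CPU here
        loss = nn.functional.cross_entropy(model(batch), labels)
    scaler.scale(loss).backward()  # scale the loss before backprop
    scaler.step(optimizer)         # unscale gradients, then optimizer step
    scaler.update()                # adjust the scale factor for the next step
    return loss.item()
```

On a GPU, running the forward pass in half precision roughly halves memory traffic, which is where the training-time reduction comes from; on a CPU the step simply runs in full precision.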

## Result


The model reaches about 95% accuracy, but there is a problem:
the learning-rate scheduler I used is not optimized. It works for a low number of epochs, but it causes errors when the epoch count is high. Reusing the same code is fine, but before doing so I should find a better scheduler.

The test was conducted with epoch=8, and the result was good.