https://github.com/awsaf49/tensorflow_extra
TensorFlow GPU & TPU compatible operations: MelSpectrogram, TimeFreqMask, CutMix, MixUp, ZScore, and more
audio-processing comuter-vision tensorflow
- Host: GitHub
- URL: https://github.com/awsaf49/tensorflow_extra
- Owner: awsaf49
- License: mit
- Created: 2022-04-06T13:21:07.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2023-09-29T05:49:52.000Z (over 2 years ago)
- Last Synced: 2025-11-28T03:40:21.130Z (7 months ago)
- Topics: audio-processing, comuter-vision, tensorflow
- Language: Python
- Homepage:
- Size: 895 KB
- Stars: 18
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Tensorflow Extra
> TensorFlow GPU & TPU compatible operations: MelSpectrogram, TimeFreqMask, CutMix, MixUp, ZScore, and more
# Installation
Install the stable version (in a notebook cell, prefix the command with `!`):
```shell
pip install tensorflow-extra
```
Or install the latest version directly from GitHub:
```shell
pip install git+https://github.com/awsaf49/tensorflow_extra
```
# Usage
For an end-to-end use case, see the [BirdCLEF23: Pretraining is All you Need](https://www.kaggle.com/code/awsaf49/birdclef23-pretraining-is-all-you-need-train) notebook. It uses this library together with **Multi Stage Transfer Learning** for a bird call identification task.
# Layers
## MelSpectrogram
Converts audio data to a mel-spectrogram on GPU/TPU.
```py
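import tensorflow_extra as tfe

# `audio` is a tensor of raw waveform samples prepared elsewhere
audio2spec = tfe.layers.MelSpectrogram()
spec = audio2spec(audio)  # mel-spectrogram computed on the accelerator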
```
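For readers unfamiliar with the transform, the following is a minimal NumPy sketch of a generic log-mel-spectrogram. It is not the library's implementation: the function names, the HTK-style mel formula, the dB output, and the defaults (`sample_rate`, `n_fft`, `hop`, `n_mels`) are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fb = np.zeros((n_mels, n_bins))
    for m in range(n_mels):
        lo, center, hi = hz_edges[m], hz_edges[m + 1], hz_edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(audio, sample_rate=16000, n_fft=512, hop=128, n_mels=64):
    # Slice the 1-D waveform into overlapping frames and apply a Hann window
    n_frames = 1 + (len(audio) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)
    # Power spectrum per frame, then projection onto the mel filters
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    # Return shape (n_mels, n_frames) in decibels
    return 10.0 * np.log10(mel + 1e-10).T
```

The layer in this library performs the conversion with TensorFlow ops so that it can run on GPU/TPU; the sketch only shows the underlying steps on the CPU.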

## Time Frequency Masking
Masks stripes along the time and frequency axes of a spectrogram. The number of stripes can also be controlled.
```py
time_freq_mask = tfe.layers.TimeFreqMask()
spec = time_freq_mask(spec)
```
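As an illustration of what stripe masking does, here is a small NumPy sketch. It is not the library's code: the function name, the parameter names (`num_freq_stripes`, `num_time_stripes`, `max_freq_width`, `max_time_width`), the zero fill value, and the `(freq, time)` layout are assumptions.

```python
import numpy as np

def time_freq_mask(spec, num_freq_stripes=2, num_time_stripes=2,
                   max_freq_width=8, max_time_width=16, rng=None):
    # spec is assumed to have shape (n_freq, n_time)
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    n_freq, n_time = out.shape
    for _ in range(num_freq_stripes):
        # Zero out a horizontal stripe of random width and position
        width = rng.integers(0, max_freq_width + 1)
        start = rng.integers(0, n_freq - width + 1)
        out[start:start + width, :] = 0.0
    for _ in range(num_time_stripes):
        # Zero out a vertical stripe of random width and position
        width = rng.integers(0, max_time_width + 1)
        start = rng.integers(0, n_time - width + 1)
        out[:, start:start + width] = 0.0
    return out
```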

## CutMix
Works with audio, spectrograms, and images. For spectrograms, set `full_height=True` to use the full frequency resolution.
```py
cutmix = tfe.layers.CutMix()
audio = cutmix(audio, training=True) # accepts both audio & spectrogram
```
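The idea behind CutMix can be sketched in NumPy as follows. This is a generic version under stated assumptions, not the library's implementation: the function signature, the `alpha` default, the batch layout `(batch, height, width[, channels])`, and the reading of `full_height` as "the pasted box spans the whole height (frequency) axis" are all illustrative guesses.

```python
import numpy as np

def cutmix(x, y, alpha=1.0, full_height=False, rng=None):
    # x: (batch, height, width[, channels]); y: one-hot labels (batch, classes)
    rng = np.random.default_rng() if rng is None else rng
    batch, height, width = x.shape[:3]
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(batch)
    # Choose a box covering roughly (1 - lam) of the area
    if full_height:
        cut_h, cut_w = height, int(round(width * (1.0 - lam)))
    else:
        ratio = np.sqrt(1.0 - lam)
        cut_h, cut_w = int(round(height * ratio)), int(round(width * ratio))
    top = rng.integers(0, height - cut_h + 1)
    left = rng.integers(0, width - cut_w + 1)
    # Paste the box from a shuffled copy of the batch
    mixed = x.copy()
    mixed[:, top:top + cut_h, left:left + cut_w] = \
        x[perm, top:top + cut_h, left:left + cut_w]
    # Recompute the label weight from the area actually pasted
    lam = 1.0 - (cut_h * cut_w) / (height * width)
    return mixed, lam * y + (1.0 - lam) * y[perm]
```

Unlike this sketch, the layer shown above is called with a `training` argument.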

## MixUp
Works with audio, spectrograms, and images. For spectrograms, set `full_height=True` to use the full frequency resolution.
```py
mixup = tfe.layers.MixUp()
audio = mixup(audio, training=True) # accepts both audio & spectrogram
```
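For reference, MixUp blends each sample with another sample from the same batch. The NumPy sketch below shows the generic technique; the function signature and the `alpha` default are assumptions, and the library's layer may differ in detail.

```python
import numpy as np

def mixup(x, y, alpha=0.2, rng=None):
    # x: batch of inputs (audio, spectrograms, or images); y: one-hot labels
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(x))        # partner sample for each item
    mixed_x = lam * x + (1.0 - lam) * x[perm]
    mixed_y = lam * y + (1.0 - lam) * y[perm]
    return mixed_x, mixed_y
```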

## Normalization
Applies standardization (z-score) and rescaling (min-max).
```py
norm = tfe.layers.ZScoreMinMax()
spec = norm(spec)
```
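A minimal NumPy sketch of z-score standardization followed by min-max rescaling is shown below. It assumes the statistics are computed over the whole input array and that the output range is [0, 1]; the library's layer may normalize per sample or use a different range. The function name and `eps` are illustrative.

```python
import numpy as np

def zscore_minmax(x, eps=1e-6):
    # Standardize to zero mean and unit variance
    z = (x - x.mean()) / (x.std() + eps)
    # Rescale the standardized values to the [0, 1] range
    return (z - z.min()) / (z.max() - z.min() + eps)
```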

# Activations
## SmeLU: Smooth ReLU
```py
import tensorflow as tf
import tensorflow_extra as tfe
a = tf.constant([-2.5, -1.0, 0.5, 1.0, 2.5])
b = tfe.activations.smelu(a) # array([0., 0.04166667, 0.6666667 , 1.0416666 , 2.5])
```
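The output above is consistent with the standard SmeLU definition: zero for `x <= -beta`, the identity for `x >= beta`, and the quadratic `(x + beta)^2 / (4 * beta)` in between. The NumPy sketch below is a reference version for checking values by hand; the `beta=1.5` default is inferred from the printed output, not read from the library.

```python
import numpy as np

def smelu(x, beta=1.5):
    # beta=1.5 is an assumption inferred from the example output
    x = np.asarray(x, dtype=np.float64)
    quad = (x + beta) ** 2 / (4.0 * beta)   # smooth transition region
    return np.where(x <= -beta, 0.0, np.where(x >= beta, x, quad))

values = smelu([-2.5, -1.0, 0.5, 1.0, 2.5])
```

For example, at `x = 0.5` the quadratic branch gives `(0.5 + 1.5)^2 / (4 * 1.5) = 4 / 6 ≈ 0.6667`, which matches the third entry.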