https://github.com/echocatzh/conv-stft

A STFT/iSTFT written up in PyTorch using 1D Convolutions

Host: GitHub
URL: https://github.com/echocatzh/conv-stft
Owner: echocatzh
License: mit
Created: 2020-08-17T18:38:08.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2024-07-09T17:55:12.000Z (over 1 year ago)
Last Synced: 2026-01-06T18:12:52.534Z (2 months ago)
Topics: conv-stft, padding, stft
Language: Python
Homepage:
Size: 188 KB
Stars: 32
Watchers: 2
Forks: 11
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# Conv-STFT/iSTFT in PyTorch

NOTE: This package is no longer maintained. The API in this repo is exactly the same as the one in [torch-mfcc](https://github.com/echocatzh/torch-mfcc).

```python
from conv_stft import STFT  # same as below
from torch_mfcc import STFT
```

Author: Shimin Zhang

The code draws on the following repos:

1. [remove modulation effects](https://github.com/pseeth/torch-stft) 

2. [enframe and conv-overlap-add](https://github.com/huyanxin/phasen/blob/master/model/conv_stft.py)

An STFT/iSTFT implemented in PyTorch (Python 3) using 1D convolutions. There are two window logics, `break` and `continue`.

- `break` - a Kaldi-like framing method.

When `win_len` and `fft_len` differ, each frame (`len(frame) = win_len`) is first multiplied element-wise by the window (`len(window) = win_len`) and then padded with `fft_len - win_len` zeros at its end.

- `continue` - a librosa-like framing method.

When `win_len` and `fft_len` differ, the signal is framed with a frame length of `fft_len`, and the window (`len(window) = win_len`) is zero-padded on both sides so that `len(center_pad(window)) = fft_len`.
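The difference between the two window logics can be sketched in plain NumPy. This is only an illustration of the descriptions above, not the package's code: the helper names `hann` and `first_frame_spectrum` are hypothetical, the periodic Hann window is an assumption, and only the first frame is computed.

```python
import numpy as np

def hann(n):
    # Periodic Hann window, written out to keep the sketch dependency-free (assumption)
    return 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)

def first_frame_spectrum(signal, win_len, fft_len, mode):
    """Spectrum of the first frame under the `break` or `continue` logic (illustrative)."""
    window = hann(win_len)
    pad = fft_len - win_len
    if mode == "break":
        # Take win_len samples, apply the window, then append zeros up to fft_len
        frame = signal[:win_len] * window
        frame = np.concatenate([frame, np.zeros(pad)])
    elif mode == "continue":
        # Take fft_len samples and apply the window zero-padded on both sides
        left = pad // 2
        padded_window = np.pad(window, (left, pad - left))
        frame = signal[:fft_len] * padded_window
    else:
        raise ValueError(f"unknown mode: {mode}")
    return np.fft.rfft(frame)

signal = np.ones(16)
spec_break = first_frame_spectrum(signal, win_len=4, fft_len=8, mode="break")
spec_cont = first_frame_spectrum(signal, win_len=4, fft_len=8, mode="continue")
print(spec_break.shape, spec_cont.shape)
```

When `win_len == fft_len` no padding is needed and the two branches produce the same frame. When they differ, the frames cover different samples of the signal, so the resulting spectra generally differ as well.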

## Installation

Install with pip: `pip install conv_stft`, or download this repo and run `python setup.py install`.

## Usage

```python3
import torch
from conv_stft import STFT
import numpy as np
import librosa
import matplotlib.pyplot as plt

# Load 10 s of example audio, starting 30 s in.
# Note: newer librosa releases dropped librosa.util.example_audio_file();
# librosa.example(...) is the replacement there.
audio = librosa.load(librosa.util.example_audio_file(), duration=10.0, offset=30)[0]

device = 'cpu'
fft_len = 1024
win_hop = 256
win_len = 1024
window = 'hann'

# Shape the audio as (batch, samples) and move it to the target device
audio = torch.FloatTensor(audio)
audio = audio.unsqueeze(0)
audio = audio.to(device)

stft = STFT(
    fft_len=fft_len,
    win_hop=win_hop,
    win_len=win_len,
    win_type=window,
).to(device)

# Forward and inverse transform; both accept 'magphase' or 'realimag'
magnitude, phase = stft.transform(audio, return_type='magphase')
output = stft.inverse(magnitude, phase, input_type='magphase')

# Compare the reconstruction with the original signal
output = output.cpu().data.numpy()[..., :]
audio = audio.cpu().data.numpy()[..., :]
print(np.mean((output - audio) ** 2))  # on order of 1e-15
```
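The two representations named in the comments, `'magphase'` and `'realimag'`, are related by the usual polar/rectangular conversion. The sketch below shows that relation in plain PyTorch; the helper names are hypothetical, and it is an assumption that the package follows exactly this convention.

```python
import torch

def magphase_to_realimag(magnitude, phase):
    # Polar -> rectangular: real = |X| cos(phi), imag = |X| sin(phi)
    return magnitude * torch.cos(phase), magnitude * torch.sin(phase)

def realimag_to_magphase(real, imag):
    # Rectangular -> polar: |X| = sqrt(re^2 + im^2), phi = atan2(im, re)
    return torch.sqrt(real ** 2 + imag ** 2), torch.atan2(imag, real)

real = torch.tensor([3.0, 0.0, -1.0])
imag = torch.tensor([4.0, 2.0, 0.0])
mag, phase = realimag_to_magphase(real, imag)
real2, imag2 = magphase_to_realimag(mag, phase)
print(torch.allclose(real, real2, atol=1e-6), torch.allclose(imag, imag2, atol=1e-6))
```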

Output of [`compare_stft.py`](compare_stft.py):

![images/stft.png](images/stft.png)
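To see how a round trip like the one above can be built from 1D convolutions, here is a minimal sketch. It is not this package's implementation: the function names `dft_basis`, `conv_stft` and `conv_istft` are hypothetical, it assumes `win_len == fft_len` with no signal padding, and it uses a pseudo-inverse of the basis for the inverse transform purely for simplicity.

```python
import math
import torch
import torch.nn.functional as F

def dft_basis(fft_len):
    # Stack cosine and negative-sine rows: (2 * bins, fft_len), bins = fft_len // 2 + 1
    n = torch.arange(fft_len, dtype=torch.float64)
    k = torch.arange(fft_len // 2 + 1, dtype=torch.float64).unsqueeze(1)
    angle = 2 * math.pi * k * n / fft_len
    return torch.cat([torch.cos(angle), -torch.sin(angle)], dim=0)

def conv_stft(x, window, hop):
    # x: (batch, samples) -> (batch, 2 * bins, frames); real rows first, then imaginary rows
    kernels = (dft_basis(window.numel()) * window).unsqueeze(1)
    return F.conv1d(x.unsqueeze(1), kernels, stride=hop)

def conv_istft(spec, window, hop):
    # The pseudo-inverse of the basis recovers each windowed frame from its spectrum
    inv = torch.linalg.pinv(dft_basis(window.numel())).T  # (2 * bins, fft_len)
    kernels = (inv * window).unsqueeze(1)
    # Transposed convolution performs the overlap-add of the re-windowed frames
    y = F.conv_transpose1d(spec, kernels, stride=hop)
    # Overlap-added squared window, used to undo the analysis/synthesis weighting
    ones = torch.ones(1, 1, spec.shape[-1], dtype=spec.dtype)
    envelope = F.conv_transpose1d(ones, (window ** 2).view(1, 1, -1), stride=hop)
    return (y / envelope.clamp_min(1e-8)).squeeze(1)

torch.manual_seed(0)
fft_len, hop = 64, 16
bins = fft_len // 2 + 1
window = torch.hann_window(fft_len, dtype=torch.float64)
x = torch.randn(2, fft_len + hop * 20, dtype=torch.float64)

spec = conv_stft(x, window, hop)
# Reference: the built-in STFT without centering
ref = torch.stft(x, fft_len, hop_length=hop, window=window,
                 center=False, return_complex=True)
print(torch.allclose(spec[:, :bins], ref.real), torch.allclose(spec[:, bins:], ref.imag))

y = conv_istft(spec, window, hop)
# Compare away from the edges, where the Hann window gives too little overlap
err = (y[:, fft_len:-fft_len] - x[:, fft_len:-fft_len]).abs().max()
print(err.item())
```

The forward transform is a strided `conv1d` whose kernels are windowed DFT basis functions, and the inverse is a `conv_transpose1d` followed by division by the overlap-added squared window. The first sample cannot be recovered in this sketch because the Hann window is zero there; padding the signal before framing is the usual way around that.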

## Tests

To run the tests, clone this repo and run:

```
pip install -r requirements.txt
python -m pytest .
```
