https://github.com/desh2608/pytorch-tdnn
PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training
- Host: GitHub
- URL: https://github.com/desh2608/pytorch-tdnn
- Owner: desh2608
- License: apache-2.0
- Created: 2020-12-12T23:44:07.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2020-12-18T22:29:00.000Z (over 4 years ago)
- Last Synced: 2025-03-18T09:21:25.788Z (2 months ago)
- Language: Python
- Size: 15.6 KB
- Stars: 39
- Watchers: 3
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# pytorch-tdnn
Implementation of Time Delay Neural Network (TDNN) and Factorized TDNN (TDNN-F)
in PyTorch, available as layers which can be used directly.

### Setup
To install for use (no development required):
```bash
pip install pytorch-tdnn
```

To install for development, clone the repository, and then run the following from
within the root directory.

```bash
pip install -e .
```

### Usage
#### Using the TDNN layer
```python
from pytorch_tdnn.tdnn import TDNN as TDNNLayer

tdnn = TDNNLayer(
512, # input dim
512, # output dim
[-3,0,3], # context
)

y = tdnn(x)
```

Here, `x` should have the shape `(batch_size, input_dim, sequence_length)`.
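For instance, a minimal sketch of a forward pass; the batch size of 8 and sequence length of 100 are arbitrary values chosen for illustration:

```python
import torch
from pytorch_tdnn.tdnn import TDNN as TDNNLayer

# Same configuration as the snippet above: 512-dim input/output, context [-3,0,3]
tdnn = TDNNLayer(512, 512, [-3, 0, 3])

x = torch.randn(8, 512, 100)  # (batch_size, input_dim, sequence_length)
y = tdnn(x)                   # forward pass; the time dimension may shrink
                              # depending on how the context is handled
```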
**Note:** The `context` list should follow these constraints:
* The length of the list should be 2 or an odd number.
* If the length is 2, it should be of the form `[-1,1]` or `[-3,3]`, but not
`[-1,3]`, for example.
* If the length is an odd number, the elements should be evenly spaced with a 0 in the
  middle. For example, `[-3,0,3]` is allowed, but `[-3,-1,0,1,3]` is not (see the sketch below).
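To make these constraints concrete, here is a short sketch reusing `TDNNLayer` from the snippet above; the variable names are illustrative:

```python
from pytorch_tdnn.tdnn import TDNN as TDNNLayer

# Contexts that satisfy the constraints:
tdnn_sym  = TDNNLayer(512, 512, [-1, 1])     # length 2, symmetric around 0
tdnn_wide = TDNNLayer(512, 512, [-3, 0, 3])  # odd length, evenly spaced, 0 in the middle

# Contexts that violate the constraints (avoid these):
#   [-1, 3]             # length 2 but not symmetric
#   [-3, -1, 0, 1, 3]   # odd length but not evenly spaced
```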
#### Using the TDNNF layer

```python
from pytorch_tdnn.tdnnf import TDNNF as TDNNFLayer

tdnnf = TDNNFLayer(
512, # input dim
512, # output dim
256, # bottleneck dim
1, # time stride
)

y = tdnnf(x, semi_ortho_step=True)
```

The argument `semi_ortho_step` determines whether to take the step towards
semi-orthogonality for the constrained convolutional layers in the 3-stage splicing.
If this call is made from within a `forward()` function of an
`nn.Module` class, it can be set as follows to approximate Kaldi-style training
where the step is taken once every 4 iterations:

```python
import random
semi_ortho_step = self.training and (random.uniform(0,1) < 0.25)
```

**Note:** Time stride should be greater than or equal to 0. For example, if
the time stride is 1, a context of `[-1,1]` is used for each stage of splicing.
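Putting the pieces together, here is a minimal sketch of wrapping the layer in an `nn.Module`; the `TDNNFModel` class and the batch/sequence sizes are illustrative choices, not part of the package:

```python
import random

import torch
import torch.nn as nn
from pytorch_tdnn.tdnnf import TDNNF as TDNNFLayer

class TDNNFModel(nn.Module):
    """Hypothetical wrapper around a single TDNN-F layer."""

    def __init__(self):
        super().__init__()
        self.tdnnf = TDNNFLayer(512, 512, 256, 1)  # input, output, bottleneck dims, time stride

    def forward(self, x):
        # Take the semi-orthogonality step on roughly 1 in 4 training iterations,
        # approximating Kaldi-style training as described above.
        semi_ortho_step = self.training and (random.uniform(0, 1) < 0.25)
        return self.tdnnf(x, semi_ortho_step=semi_ortho_step)

model = TDNNFModel()
y = model(torch.randn(8, 512, 100))  # (batch_size, input_dim, sequence_length)
```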
### Credits

* The TDNN implementation is based on: https://github.com/jonasvdd/TDNN and https://github.com/m-wiesner/nnet_pytorch.
* Semi-orthogonal convolutions used in TDNN-F are based on: https://github.com/cvqluu/Factorized-TDNN.
* Thanks to [Matthew Wiesner](https://github.com/m-wiesner) for helpful discussions
about the implementations.

This repository aims to wrap up these implementations in easily installable PyPI
packages, which can be used directly in PyTorch-based neural network training.

### Issues
If you find any bugs in the code, please raise an Issue, or email me at
`[email protected]`.