https://github.com/resemble-ai/monotonic_align
Monotonic Alignment Search
- Host: GitHub
- URL: https://github.com/resemble-ai/monotonic_align
- Owner: resemble-ai
- License: mit
- Created: 2021-08-16T08:25:48.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-09-06T08:55:11.000Z (over 2 years ago)
- Last Synced: 2025-04-19T20:58:55.129Z (about 1 month ago)
- Language: Cython
- Size: 8.79 KB
- Stars: 91
- Watchers: 5
- Forks: 14
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Adapted from the MAS in [Glow-TTS](https://github.com/jaywalnut310/glow-tts/tree/master/monotonic_align). I made it installable and added variants.
# Installation
```
pip install git+https://github.com/resemble-ai/monotonic_align.git
```
Installing `monotonic_align` doesn't require torch, but using `monotonic_align` will.
Please install PyTorch yourself, as its installation differs from system to system.
# How to Use
```python
# Suppose you have:
# 1. a probability matrix `similarity` of size (batch_size=B, symbol_len=S, mel_len=T)
#    NOTE: a similarity matrix (a higher score means better) or a negative cost
#    matrix will also do, but may have issues.
# 2. an array of symbol lengths `symbol_lens` of size (batch_size=B)
# 3. an array of mel-spectrogram lengths `mel_lens` of size (batch_size=B)

from monotonic_align import mask_from_lens, maximum_path

mask_ST = mask_from_lens(similarity, symbol_lens, mel_lens)
alignment = maximum_path(similarity, mask_ST)  # (B, S, T)

# NOTE:
# - If `mask` is not specified, the default mask is `True` for all elements.
# - You can specify `topology` if you want to use other variants of alignment algorithms.
```
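To make the idea behind `maximum_path` more concrete, here is a minimal NumPy sketch of monotonic alignment search on a single `(S, T)` score matrix, following the dynamic program described for Glow-TTS. It is not this package's Cython implementation: the function name `monotonic_alignment_sketch` is hypothetical, it handles no batching, masking, or `topology` variants, and it assumes the scores are log-probabilities (higher is better) with at least as many frames as symbols.

```python
import numpy as np


def monotonic_alignment_sketch(score):
    """Illustrative only (hypothetical helper, not part of monotonic_align).

    score: (S, T) array of log-probabilities, higher is better.
    Returns a 0/1 array of shape (S, T) with exactly one symbol per frame.
    """
    S, T = score.shape
    if S > T:
        raise ValueError("need at least as many frames (T) as symbols (S)")

    # Q[s, t] = best cumulative score of a monotonic path ending at (s, t).
    Q = np.full((S, T), -np.inf)
    Q[0, 0] = score[0, 0]  # the path must start at the first symbol
    for t in range(1, T):
        # Symbol s is reachable at frame t only if s <= t.
        for s in range(min(S, t + 1)):
            stay = Q[s, t - 1]
            advance = Q[s - 1, t - 1] if s > 0 else -np.inf
            Q[s, t] = score[s, t] + max(stay, advance)

    # Backtrack from the last symbol at the last frame.
    path = np.zeros((S, T), dtype=np.int64)
    s = S - 1
    for t in range(T - 1, -1, -1):
        path[s, t] = 1
        # Step back to the previous symbol if staying is impossible (s == t)
        # or if advancing from s - 1 scored strictly higher.
        if s > 0 and (s == t or Q[s - 1, t - 1] > Q[s, t - 1]):
            s -= 1
    return path


# Toy example: 2 symbols, 3 frames.
probs = np.array([[0.9, 0.6, 0.1],
                  [0.1, 0.4, 0.9]])
example_path = monotonic_alignment_sketch(np.log(probs))
print(example_path)
# [[1 1 0]
#  [0 0 1]]
```

In the toy example, the first two frames go to symbol 0 and the last frame to symbol 1, because that path has the highest product of probabilities (0.9 * 0.6 * 0.9) among the monotonic alternatives.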