https://github.com/maxrmorrison/pypar
Phoneme alignment representation compatible with multiple forced aligners
- Host: GitHub
- URL: https://github.com/maxrmorrison/pypar
- Owner: maxrmorrison
- License: mit
- Created: 2020-11-10T17:50:40.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2024-04-12T22:54:45.000Z (over 1 year ago)
- Last Synced: 2025-06-10T19:50:52.219Z (4 months ago)
- Topics: alignment, phoneme, speech
- Language: Python
- Homepage:
- Size: 446 KB
- Stars: 21
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
README
# Python phoneme alignment representation
`pip install pypar`

Word and phoneme alignment representation for speech tasks. This repo does
not perform forced word or phoneme alignment, but provides an interface
for working with the resulting alignment of a forced aligner, such as
[`pyfoal`](https://github.com/maxrmorrison/pyfoal), or a manual alignment.

## Table of contents

- [Usage](#usage)
    * [Creating alignments](#creating-alignments)
    * [Accessing words and phonemes](#accessing-words-and-phonemes)
    * [Saving alignments](#saving-alignments)
- [Application programming interface (API)](#application-programming-interface-api)
    * [`pypar.Alignment`](#pyparalignment)
        * [`pypar.Alignment.__init__`](#pyparalignment__init__)
        * [`pypar.Alignment.__add__`](#pyparalignment__add__)
        * [`pypar.Alignment.__eq__`](#pyparalignment__eq__)
        * [`pypar.Alignment.__getitem__`](#pyparalignment__getitem__)
        * [`pypar.Alignment.__len__`](#pyparalignment__len__)
        * [`pypar.Alignment.__str__`](#pyparalignment__str__)
        * [`pypar.Alignment.duration`](#pyparalignmentduration)
        * [`pypar.Alignment.end`](#pyparalignmentend)
        * [`pypar.Alignment.find`](#pyparalignmentfind)
        * [`pypar.Alignment.framewise_phoneme_indices`](#pyparalignmentframewise_phoneme_indices)
        * [`pypar.Alignment.phonemes`](#pyparalignmentphonemes)
        * [`pypar.Alignment.phoneme_at_time`](#pyparalignmentphoneme_at_time)
        * [`pypar.Alignment.phoneme_bounds`](#pyparalignmentphoneme_bounds)
        * [`pypar.Alignment.save`](#pyparalignmentsave)
        * [`pypar.Alignment.start`](#pyparalignmentstart)
        * [`pypar.Alignment.update`](#pyparalignmentupdate)
        * [`pypar.Alignment.words`](#pyparalignmentwords)
        * [`pypar.Alignment.word_bounds`](#pyparalignmentword_bounds)
    * [`pypar.Phoneme`](#pyparphoneme)
        * [`pypar.Phoneme.__init__`](#pyparphoneme__init__)
        * [`pypar.Phoneme.__eq__`](#pyparphoneme__eq__)
        * [`pypar.Phoneme.__str__`](#pyparphoneme__str__)
        * [`pypar.Phoneme.duration`](#pyparphonemeduration)
        * [`pypar.Phoneme.end`](#pyparphonemeend)
        * [`pypar.Phoneme.start`](#pyparphonemestart)
    * [`pypar.Word`](#pyparword)
        * [`pypar.Word.__init__`](#pyparword__init__)
        * [`pypar.Word.__eq__`](#pyparword__eq__)
        * [`pypar.Word.__getitem__`](#pyparword__getitem__)
        * [`pypar.Word.__len__`](#pyparword__len__)
        * [`pypar.Word.__str__`](#pyparword__str__)
        * [`pypar.Word.duration`](#pyparwordduration)
        * [`pypar.Word.end`](#pyparwordend)
        * [`pypar.Word.phoneme_at_time`](#pyparwordphoneme_at_time)
        * [`pypar.Word.start`](#pyparwordstart)
- [Tests](#tests)

## Usage

### Creating alignments
If you already have the alignment saved to a `json`, `mlf`, or `TextGrid`
file, pass the name of the file. Valid examples of each format can be found in
`test/assets/`.

```python
import pypar

# `file` is the path to a json, mlf, or TextGrid alignment
alignment = pypar.Alignment(file)
```

Alignments can be created manually from `Word` and `Phoneme` objects. Start and
end times are given in seconds.

```python
# Create a word from phonemes
word = pypar.Word(
    'THE',
    [pypar.Phoneme('DH', 0., .03), pypar.Phoneme('AH0', .03, .06)])

# Create a silence (phonemes are passed as a list, even for a single phoneme)
silence = pypar.Word(pypar.SILENCE, [pypar.Phoneme(pypar.SILENCE, .06, .16)])

# Make an alignment
alignment = pypar.Alignment([word, silence])
```

You can create a new alignment from existing alignments via slicing and
concatenation.

```python
# Slice
first_two_words = alignment[:2]

# Concatenate
alignment_with_repeat = first_two_words + alignment
```

### Accessing words and phonemes

To retrieve a list of words in the alignment, use `alignment.words()`.
To retrieve a list of phonemes, use `alignment.phonemes()`. The `Alignment`,
`Word`, and `Phoneme` objects all define `.start()`, `.end()`, and
`.duration()` methods, which return the start time, end time, and duration,
respectively. All times are given in units of seconds. These objects also
define equality checks via `==`, casting to string with `str()`, and iteration
as follows.

```python
# Iterate over words
for word in alignment:

    # Access start and end times
    assert word.duration() == word.end() - word.start()

    # Iterate over phonemes in word
    for phoneme in word:

        # Access string representation
        assert isinstance(str(phoneme), str)
```

To access a word or phoneme at a specific time, pass the time in seconds to
`alignment.word_at_time` or `alignment.phoneme_at_time`.

To retrieve the frame indices of the start and end of a word or phoneme, pass
the audio sampling rate and hopsize (in samples) to `alignment.word_bounds` or
`alignment.phoneme_bounds`.

### Saving alignments

To save an alignment to disk, use `alignment.save(file)`, where `file` is the
desired filename. `pypar` currently supports saving as a `json` or `TextGrid`
file.

## Application programming interface (API)

### `pypar.Alignment`
#### `pypar.Alignment.__init__`
```python
def __init__(
    self,
    alignment: Union[str, bytes, os.PathLike, List[pypar.Word], dict]
) -> None:
    """Create alignment

    Arguments
        alignment
            The filename, list of words, or json dict of the alignment
    """
```

#### `pypar.Alignment.__add__`

```python
def __add__(self, other):
    """Add alignments by concatenation

    Arguments
        other
            The alignment to concatenate to the end of this one

    Returns
        The concatenated alignment
    """
```

#### `pypar.Alignment.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for alignments

    Arguments
        other
            The alignment to compare to

    Returns
        Whether the alignments are equal
    """
```

#### `pypar.Alignment.__getitem__`

```python
def __getitem__(self, idx: Union[int, slice]) -> pypar.Word:
    """Retrieve the idxth word

    Arguments
        idx
            The index of the word to retrieve

    Returns
        The word at index idx
    """
```

#### `pypar.Alignment.__len__`

```python
def __len__(self) -> int:
    """Retrieve the number of words

    Returns
        The number of words in the alignment
    """
```

#### `pypar.Alignment.__str__`

```python
def __str__(self) -> str:
    """Retrieve the text

    Returns
        The words in the alignment separated by spaces
    """
```

#### `pypar.Alignment.duration`

```python
def duration(self) -> float:
    """Retrieve the duration of the alignment in seconds

    Returns
        The duration in seconds
    """
```

#### `pypar.Alignment.end`

```python
def end(self) -> float:
    """Retrieve the end time of the alignment in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Alignment.framewise_phoneme_indices`

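As a rough picture of what framewise sampling involves, the sketch below samples a list of labeled intervals every `hopsize` seconds and maps each label through `phoneme_map`. It does not use `pypar`. The function `framewise_indices`, the `(label, start, end)` tuples, the half-open interval convention, and the choice to raise on a time that falls outside every interval are all assumptions made for illustration, not the library's implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for phonemes: (label, start, end) in seconds
Interval = Tuple[str, float, float]


def framewise_indices(
    intervals: List[Interval],
    phoneme_map: Dict[str, int],
    hopsize: float,
    times: Optional[List[float]] = None
) -> List[int]:
    """Map the phoneme active at each sampled time to its integer index"""
    if times is None:
        # Sample every hopsize seconds from the first start to the last end
        begin, end = intervals[0][1], intervals[-1][2]
        times, frame = [], 0
        while begin + frame * hopsize < end:
            times.append(begin + frame * hopsize)
            frame += 1

    indices = []
    for time in times:
        for label, start, stop in intervals:
            # Assumption: intervals are half-open, [start, stop)
            if start <= time < stop:
                indices.append(phoneme_map[label])
                break
        else:
            raise ValueError(f'No phoneme at time {time}')
    return indices


intervals = [('DH', 0.0, 0.25), ('AH0', 0.25, 0.5)]
print(framewise_indices(intervals, {'DH': 0, 'AH0': 1}, hopsize=0.125))
```

Note that `hopsize` here is a duration in seconds, matching the signature documented for this method, whereas `phoneme_bounds` takes a hopsize in samples.
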
```python
def framewise_phoneme_indices(
    self,
    phoneme_map: Dict[str, int],
    hopsize: float,
    times: Optional[List[float]] = None
) -> List[int]:
    """Convert alignment to phoneme indices at regular temporal interval

    Arguments
        phoneme_map
            Mapping from phonemes to indices
        hopsize
            Temporal interval between frames in seconds
        times
            Specified times in seconds to sample phonemes
    """
```

#### `pypar.Alignment.find`

```python
def find(self, words: str) -> int:
    """Find the words in the alignment

    Arguments
        words
            The words to find

    Returns
        The index of the start of the words or -1 if not found
    """
```

#### `pypar.Alignment.phonemes`

```python
def phonemes(self) -> List[pypar.Phoneme]:
    """Retrieve the phonemes in the alignment

    Returns
        The phonemes in the alignment
    """
```

#### `pypar.Alignment.phoneme_at_time`

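A time lookup of this kind can be pictured as a binary search over sorted, non-overlapping intervals. The sketch below is independent of `pypar`: `label_at_time`, the `(label, start, end)` tuples, the `'SIL'` placeholder label, and the half-open `[start, end)` convention are assumptions for illustration only, and the library may resolve boundary times differently.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical stand-in for phonemes: (label, start, end) in seconds,
# sorted by start time and non-overlapping
Interval = Tuple[str, float, float]


def label_at_time(intervals: List[Interval], time: float) -> Optional[str]:
    """Return the label whose [start, end) span contains time, else None"""
    starts = [start for _, start, _ in intervals]

    # Index of the last interval that starts at or before `time`
    idx = bisect_right(starts, time) - 1
    if idx < 0:
        return None

    label, start, end = intervals[idx]
    return label if start <= time < end else None


# 'SIL' is a placeholder label, not necessarily the value of pypar.SILENCE
intervals = [('DH', 0.0, 0.03), ('AH0', 0.03, 0.06), ('SIL', 0.06, 0.16)]
print(label_at_time(intervals, 0.04))  # AH0
print(label_at_time(intervals, 0.5))   # None
```

Returning `None` for out-of-range times mirrors the `Optional` return type documented for this method.
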
```python
def phoneme_at_time(self, time: float) -> Optional[pypar.Phoneme]:
    """Retrieve the phoneme spoken at specified time

    Arguments
        time
            Time in seconds

    Returns
        The phoneme at the given time (or None if time is out of bounds)
    """
```

#### `pypar.Alignment.phoneme_bounds`

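The arithmetic behind converting times in seconds to frame indices can be sketched without `pypar`: a time `t` falls at sample `t * sample_rate`, and dividing by the hopsize in samples gives a frame index. The helper `seconds_to_frame_bounds` below is hypothetical, and truncating to an integer is an assumption; the library may round differently or treat silences specially.

```python
from typing import List, Tuple


def seconds_to_frame_bounds(
    spans: List[Tuple[float, float]],
    sample_rate: int,
    hopsize: int = 1
) -> List[Tuple[int, int]]:
    """Convert (start, end) times in seconds to (start, end) frame indices"""
    frames_per_second = sample_rate / hopsize

    # Assumption: truncate toward zero; rounding is another reasonable choice
    return [
        (int(start * frames_per_second), int(end * frames_per_second))
        for start, end in spans]


spans = [(0.0, 0.25), (0.25, 0.5)]
print(seconds_to_frame_bounds(spans, sample_rate=16000, hopsize=160))
# [(0, 25), (25, 50)]
```

With the default `hopsize=1`, the same function returns sample indices rather than frame indices.
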
```python
def phoneme_bounds(
    self,
    sample_rate: int,
    hopsize: int = 1
) -> List[Tuple[int, int]]:
    """Retrieve the start and end frame index of each phoneme

    Arguments
        sample_rate
            The audio sampling rate
        hopsize
            The number of samples between successive frames

    Returns
        The start and end indices of the phonemes
    """
```

#### `pypar.Alignment.save`

```python
def save(self, filename: Union[str, bytes, os.PathLike]) -> None:
    """Save alignment to json

    Arguments
        filename
            The location on disk to save the phoneme alignment json
    """
```

#### `pypar.Alignment.start`

```python
def start(self) -> float:
    """Retrieve the start time of the alignment in seconds

    Returns
        The start time in seconds
    """
```

#### `pypar.Alignment.update`

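One way to think about re-timing is to overwrite some phoneme durations and then lay all phonemes back to back from a start time. The sketch below does this on plain tuples and returns a new list. The name `retime`, the tuple representation, and the assumption that later phonemes shift to stay contiguous are illustrative guesses; `pypar.Alignment.update` returns `None`, so it presumably modifies the alignment in place, and its exact behavior may differ.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for phonemes: (label, start, end) in seconds
Interval = Tuple[str, float, float]


def retime(
    intervals: List[Interval],
    idx: int = 0,
    durations: Optional[List[float]] = None,
    start: Optional[float] = None
) -> List[Interval]:
    """Return a re-timed copy with new durations applied from index idx"""
    # Current duration of every phoneme
    lengths = [end - begin for _, begin, end in intervals]
    if durations is not None:
        # Overwrite durations beginning at idx
        lengths[idx:idx + len(durations)] = durations

    # Lay the phonemes back to back, beginning at the requested start time
    time = intervals[0][1] if start is None else start
    retimed = []
    for (label, _, _), length in zip(intervals, lengths):
        retimed.append((label, time, time + length))
        time += length
    return retimed


intervals = [('DH', 0.0, 0.25), ('AH0', 0.25, 0.5), ('SIL', 0.5, 1.0)]

# Stretch the second phoneme to 0.5 seconds; the third shifts later
print(retime(intervals, idx=1, durations=[0.5]))
```
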
```python
def update(
    self,
    idx: int = 0,
    durations: Optional[List[float]] = None,
    start: Optional[float] = None
) -> None:
    """Update alignment starting from phoneme index idx

    Arguments
        idx
            The index of the first phoneme whose duration is being updated
        durations
            The new phoneme durations, starting from idx
        start
            The start time of the alignment
    """
```

#### `pypar.Alignment.words`

```python
def words(self) -> List[pypar.Word]:
    """Retrieve the words in the alignment

    Returns
        The words in the alignment
    """
```

#### `pypar.Alignment.word_at_time`

```python
def word_at_time(self, time: float) -> Optional[pypar.Word]:
    """Retrieve the word spoken at specified time

    Arguments
        time
            Time in seconds

    Returns
        The word spoken at the specified time
    """
```

### `pypar.Phoneme`

#### `pypar.Phoneme.__init__`
```python
def __init__(self, phoneme: str, start: float, end: float) -> None:
    """Create phoneme

    Arguments
        phoneme
            The phoneme
        start
            The start time in seconds
        end
            The end time in seconds
    """
```

#### `pypar.Phoneme.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for phonemes

    Arguments
        other
            The phoneme to compare to

    Returns
        Whether the phonemes are equal
    """
```

#### `pypar.Phoneme.__str__`

```python
def __str__(self) -> str:
    """Retrieve the phoneme text

    Returns
        The phoneme
    """
```

#### `pypar.Phoneme.duration`

```python
def duration(self) -> float:
    """Retrieve the phoneme duration

    Returns
        The duration in seconds
    """
```

#### `pypar.Phoneme.end`

```python
def end(self) -> float:
    """Retrieve the end time of the phoneme in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Phoneme.start`

```python
def start(self) -> float:
    """Retrieve the start time of the phoneme in seconds

    Returns
        The start time in seconds
    """
```

### `pypar.Word`

#### `pypar.Word.__init__`
```python
def __init__(self, word: str, phonemes: List[pypar.Phoneme]) -> None:
    """Create word

    Arguments
        word
            The word
        phonemes
            The phonemes in the word
    """
```

#### `pypar.Word.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for words

    Arguments
        other
            The word to compare to

    Returns
        Whether the words are the same
    """
```

#### `pypar.Word.__getitem__`

```python
def __getitem__(self, idx: int) -> pypar.Phoneme:
    """Retrieve the idxth phoneme

    Arguments
        idx
            The index of the phoneme to retrieve

    Returns
        The phoneme at index idx
    """
```

#### `pypar.Word.__len__`

```python
def __len__(self) -> int:
    """Retrieve the number of phonemes

    Returns
        The number of phonemes
    """
```

#### `pypar.Word.__str__`

```python
def __str__(self) -> str:
    """Retrieve the word text

    Returns
        The word text
    """
```

#### `pypar.Word.duration`

```python
def duration(self) -> float:
    """Retrieve the word duration in seconds

    Returns
        The duration in seconds
    """
```

#### `pypar.Word.end`

```python
def end(self) -> float:
    """Retrieve the end time of the word in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Word.phoneme_at_time`

```python
def phoneme_at_time(self, time: float) -> Optional[pypar.Phoneme]:
    """Retrieve the phoneme at the specified time

    Arguments
        time
            Time in seconds

    Returns
        The phoneme at the given time (or None if time is out of bounds)
    """
```

#### `pypar.Word.start`

```python
def start(self) -> float:
    """Retrieve the start time of the word in seconds

    Returns
        The start time in seconds
    """
```

## Tests

Tests can be run as follows.
```
pip install pytest
pytest
```