https://github.com/maxrmorrison/pypar

Phoneme alignment representation compatible with multiple forced aligners
Host: GitHub
URL: https://github.com/maxrmorrison/pypar
Owner: maxrmorrison
License: mit
Created: 2020-11-10T17:50:40.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2024-04-12T22:54:45.000Z (over 1 year ago)
Last Synced: 2025-06-10T19:50:52.219Z (4 months ago)
Topics: alignment, phoneme, speech
Language: Python
Homepage:
Size: 446 KB
Stars: 21
Watchers: 4
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
          
# Python phoneme alignment representation



[![PyPI](https://img.shields.io/pypi/v/pypar.svg)](https://pypi.python.org/pypi/pypar)

[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

[![Downloads](https://static.pepy.tech/badge/pypar)](https://pepy.tech/project/pypar)

`pip install pypar`



Word and phoneme alignment representation for speech tasks. This repo does
not perform forced word or phoneme alignment, but provides an interface
for working with the resulting alignment of a forced aligner, such as
[`pyfoal`](https://github.com/maxrmorrison/pyfoal), or a manual alignment.

## Table of contents

- [Usage](#usage)

    * [Creating alignments](#creating-alignments)

    * [Accessing words and phonemes](#accessing-words-and-phonemes)

    * [Saving alignments](#saving-alignments)

- [Application programming interface (API)](#application-programming-interface-api)

    * [`pypar.Alignment`](#pyparalignment)

        * [`pypar.Alignment.__init__`](#pyparalignment__init__)

        * [`pypar.Alignment.__add__`](#pyparalignment__add__)

        * [`pypar.Alignment.__eq__`](#pyparalignment__eq__)

        * [`pypar.Alignment.__getitem__`](#pyparalignment__getitem__)

        * [`pypar.Alignment.__len__`](#pyparalignment__len__)

        * [`pypar.Alignment.__str__`](#pyparalignment__str__)

        * [`pypar.Alignment.duration`](#pyparalignmentduration)

        * [`pypar.Alignment.end`](#pyparalignmentend)

        * [`pypar.Alignment.find`](#pyparalignmentfind)

        * [`pypar.Alignment.framewise_phoneme_indices`](#pyparalignmentframewise_phoneme_indices)

        * [`pypar.Alignment.phonemes`](#pyparalignmentphonemes)

        * [`pypar.Alignment.phoneme_at_time`](#pyparalignmentphoneme_at_time)

        * [`pypar.Alignment.phoneme_bounds`](#pyparalignmentphoneme_bounds)

        * [`pypar.Alignment.save`](#pyparalignmentsave)

        * [`pypar.Alignment.start`](#pyparalignmentstart)

        * [`pypar.Alignment.update`](#pyparalignmentupdate)

        * [`pypar.Alignment.words`](#pyparalignmentwords)

        * [`pypar.Alignment.word_bounds`](#pyparalignmentword_bounds)

    * [`pypar.Phoneme`](#pyparphoneme)

        * [`pypar.Phoneme.__init__`](#pyparphoneme__init__)

        * [`pypar.Phoneme.__eq__`](#pyparphoneme__eq__)

        * [`pypar.Phoneme.__str__`](#pyparphoneme__str__)

        * [`pypar.Phoneme.duration`](#pyparphonemeduration)

        * [`pypar.Phoneme.end`](#pyparphonemeend)

        * [`pypar.Phoneme.start`](#pyparphonemestart)

    * [`pypar.Word`](#pyparword)

        * [`pypar.Word.__init__`](#pyparword__init__)

        * [`pypar.Word.__eq__`](#pyparword__eq__)

        * [`pypar.Word.__getitem__`](#pyparword__getitem__)

        * [`pypar.Word.__len__`](#pyparword__len__)

        * [`pypar.Word.__str__`](#pyparword__str__)

        * [`pypar.Word.duration`](#pyparwordduration)

        * [`pypar.Word.end`](#pyparwordend)

        * [`pypar.Word.phoneme_at_time`](#pyparwordphoneme_at_time)

        * [`pypar.Word.start`](#pyparwordstart)

- [Tests](#tests)

## Usage

### Creating alignments

If you already have the alignment saved to a `json`, `mlf`, or `TextGrid`
file, pass the name of the file. Valid examples of each format can be found in
`test/assets/`.

```python
import pypar

# `file` is the path to a json, mlf, or TextGrid alignment file
alignment = pypar.Alignment(file)
```

Alignments can be created manually from `Word` and `Phoneme` objects. Start and
end times are given in seconds.

```python
import pypar

# Create a word from phonemes
word = pypar.Word(
    'THE',
    [pypar.Phoneme('DH', 0., .03), pypar.Phoneme('AH0', .03, .06)])

# Create a silence (the phonemes argument is a list, even for one phoneme)
silence = pypar.Word(
    pypar.SILENCE,
    [pypar.Phoneme(pypar.SILENCE, .06, .16)])

# Make an alignment
alignment = pypar.Alignment([word, silence])
```

You can create a new alignment from existing alignments via slicing and
concatenation.

```python
# Slice
first_two_words = alignment[:2]

# Concatenate
alignment_with_repeat = first_two_words + alignment
```

### Accessing words and phonemes

To retrieve a list of words in the alignment, use `alignment.words()`.
To retrieve a list of phonemes, use `alignment.phonemes()`. The `Alignment`,
`Word`, and `Phoneme` objects all define `.start()`, `.end()`, and
`.duration()` methods, which return the start time, end time, and duration,
respectively. All times are given in units of seconds. These objects also
define equality checks via `==`, casting to string with `str()`, and iteration
as follows.

```python
# Iterate over words
for word in alignment:

    # Access start and end times
    assert word.duration() == word.end() - word.start()

    # Iterate over phonemes in word
    for phoneme in word:

        # Access string representation
        assert isinstance(str(phoneme), str)
```

To access a word or phoneme at a specific time, pass the time in seconds to
`alignment.word_at_time` or `alignment.phoneme_at_time`.

To retrieve the frame indices of the start and end of a word or phoneme, pass
the audio sampling rate and hopsize (in samples) to `alignment.word_bounds` or
`alignment.phoneme_bounds`.
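The conversion from seconds to frame indices can be sketched as follows. This
is a minimal illustration rather than `pypar`'s implementation: the helper
`seconds_to_frame_bounds` is hypothetical, and it assumes each time is scaled by
`sample_rate / hopsize` and rounded to the nearest frame. The library's own
rounding may differ.

```python
from typing import List, Tuple


def seconds_to_frame_bounds(
    intervals: List[Tuple[float, float]],
    sample_rate: int,
    hopsize: int = 1
) -> List[Tuple[int, int]]:
    """Convert (start, end) times in seconds to (start, end) frame indices"""
    # Number of frames per second of audio
    frames_per_second = sample_rate / hopsize

    # Assumption: round to the nearest frame
    return [
        (round(start * frames_per_second), round(end * frames_per_second))
        for start, end in intervals]


# The two phonemes of 'THE' and the silence from the example above
intervals = [(0., .03), (.03, .06), (.06, .16)]
bounds = seconds_to_frame_bounds(intervals, sample_rate=16000, hopsize=160)
print(bounds)  # [(0, 3), (3, 6), (6, 16)]
```

With a 16 kHz sampling rate and a hopsize of 160 samples, each frame spans 10
milliseconds, so the silence from .06 to .16 seconds covers frames 6 to 16.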

### Saving alignments

To save an alignment to disk, use `alignment.save(file)`, where `file` is the
desired filename. `pypar` currently supports saving as a `json` or `TextGrid`
file.

## Application programming interface (API)

### `pypar.Alignment`

#### `pypar.Alignment.__init__`

```python
def __init__(
    self,
    alignment: Union[str, bytes, os.PathLike, List[pypar.Word], dict]
) -> None:
    """Create alignment

    Arguments
        alignment
            The filename, list of words, or json dict of the alignment
    """
```

#### `pypar.Alignment.__add__`

```python
def __add__(self, other):
    """Add alignments by concatenation

    Arguments
        other
            The alignment to concatenate

    Returns
        The concatenated alignment
    """
```

#### `pypar.Alignment.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for alignments

    Arguments
        other
            The alignment to compare to

    Returns
        Whether the alignments are equal
    """
```

#### `pypar.Alignment.__getitem__`

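```python
def __getitem__(self, idx: Union[int, slice]) -> pypar.Word:
    """Retrieve the idxth word

    Arguments
        idx
            The index of the word to retrieve, or a slice of word indices

    Returns
        The word at index idx. As shown under Usage, slicing
        (e.g., alignment[:2]) instead creates a new alignment.
    """
```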

#### `pypar.Alignment.__len__`

```python
def __len__(self) -> int:
    """Retrieve the number of words

    Returns
        The number of words in the alignment
    """
```

#### `pypar.Alignment.__str__`

```python
def __str__(self) -> str:
    """Retrieve the text

    Returns
        The words in the alignment separated by spaces
    """
```

#### `pypar.Alignment.duration`

```python
def duration(self) -> float:
    """Retrieve the duration of the alignment in seconds

    Returns
        The duration in seconds
    """
```

#### `pypar.Alignment.end`

```python
def end(self) -> float:
    """Retrieve the end time of the alignment in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Alignment.framewise_phoneme_indices`

```python
def framewise_phoneme_indices(
    self,
    phoneme_map: Dict[str, int],
    hopsize: float,
    times: Optional[List[float]] = None
) -> List[int]:
    """Convert alignment to phoneme indices at regular temporal interval

    Arguments
        phoneme_map
            Mapping from phonemes to indices
        hopsize
            Temporal interval between frames in seconds
        times
            Specified times in seconds to sample phonemes
    """
```
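The idea of sampling phonemes at a regular interval can be sketched without
`pypar`. This is an illustration under stated assumptions, not the library's
implementation: phonemes are represented as hypothetical `(label, start, end)`
tuples, `framewise_indices` is a made-up helper, each phoneme is assumed to
cover the half-open interval `[start, end)`, and frames are sampled at
`start + i * hopsize`. The label `'SIL'` is a placeholder for silence.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical phoneme representation: (label, start, end) in seconds
Interval = Tuple[str, float, float]


def framewise_indices(
    phonemes: List[Interval],
    phoneme_map: Dict[str, int],
    hopsize: float,
    times: Optional[List[float]] = None
) -> List[int]:
    """Look up the index of the phoneme active at each sampled time"""
    if times is None:
        # Sample every hopsize seconds from the start up to the end
        start, end = phonemes[0][1], phonemes[-1][2]
        num_frames = int((end - start) / hopsize)
        times = [start + i * hopsize for i in range(num_frames)]

    indices = []
    for time in times:
        for label, phoneme_start, phoneme_end in phonemes:
            # Assumption: intervals are half-open, [start, end)
            if phoneme_start <= time < phoneme_end:
                indices.append(phoneme_map[label])
                break
        else:
            raise ValueError(f'No phoneme at time {time}')
    return indices


phonemes = [('DH', 0., .25), ('AH0', .25, .5), ('SIL', .5, 1.)]
phoneme_map = {'DH': 0, 'AH0': 1, 'SIL': 2}
print(framewise_indices(phonemes, phoneme_map, hopsize=.125))
# [0, 0, 1, 1, 2, 2, 2, 2]
```

Passing `times` explicitly replaces the regular grid, e.g.
`framewise_indices(phonemes, phoneme_map, .125, times=[.3, .9])` looks up only
those two times.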

#### `pypar.Alignment.find`

```python
def find(self, words: str) -> int:
    """Find the words in the alignment

    Arguments
        words
            The words to find

    Returns
        The index of the start of the words or -1 if not found
    """
```
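A search with this contract (index of the first match, or -1) might look like
the sketch below. It is not `pypar`'s implementation: `find_words` is a
hypothetical helper operating on a plain list of word strings, and it assumes
the query is a space-separated sequence that must match whole words exactly.

```python
from typing import List


def find_words(alignment_words: List[str], words: str) -> int:
    """Return the index of the first word of the match, or -1 if not found"""
    # Assumption: the query is a space-separated sequence of words
    query = words.split()
    if not query:
        return -1

    # Compare the query against every window of the same length
    for i in range(len(alignment_words) - len(query) + 1):
        if alignment_words[i:i + len(query)] == query:
            return i
    return -1


text = ['THE', 'QUICK', 'BROWN', 'FOX']
print(find_words(text, 'QUICK BROWN'))  # 1
print(find_words(text, 'BROWN THE'))    # -1
```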

#### `pypar.Alignment.phonemes`

```python
def phonemes(self) -> List[pypar.Phoneme]:
    """Retrieve the phonemes in the alignment

    Returns
        The phonemes in the alignment
    """
```

#### `pypar.Alignment.phoneme_at_time`

```python
def phoneme_at_time(self, time: float) -> Optional[pypar.Phoneme]:
    """Retrieve the phoneme spoken at specified time

    Arguments
        time
            Time in seconds

    Returns
        The phoneme at the given time (or None if time is out of bounds)
    """
```
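A time lookup of this kind can be sketched with a binary search over phoneme
start times. This is an illustration only, not `pypar`'s implementation:
phonemes are hypothetical `(label, start, end)` tuples, the helper
`phoneme_at` is made up, and the sketch assumes the phonemes are contiguous,
sorted, and cover half-open intervals `[start, end)`.

```python
import bisect
from typing import List, Optional, Tuple

# Hypothetical phoneme representation: (label, start, end) in seconds
Interval = Tuple[str, float, float]


def phoneme_at(phonemes: List[Interval], time: float) -> Optional[str]:
    """Return the label of the phoneme at time, or None if out of bounds"""
    if not phonemes or time < phonemes[0][1] or time >= phonemes[-1][2]:
        return None

    # Index of the last phoneme that starts at or before the given time
    starts = [start for _, start, _ in phonemes]
    index = bisect.bisect_right(starts, time) - 1
    return phonemes[index][0]


phonemes = [('DH', 0., .03), ('AH0', .03, .06), ('SIL', .06, .16)]
print(phoneme_at(phonemes, .01))  # DH
print(phoneme_at(phonemes, .03))  # AH0
print(phoneme_at(phonemes, .2))   # None
```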

#### `pypar.Alignment.phoneme_bounds`

```python
def phoneme_bounds(
    self,
    sample_rate: int,
    hopsize: int = 1
) -> List[Tuple[int, int]]:
    """Retrieve the start and end frame index of each phoneme

    Arguments
        sample_rate
            The audio sampling rate
        hopsize
            The number of samples between successive frames

    Returns
        The start and end indices of the phonemes
    """
```

#### `pypar.Alignment.save`

```python
def save(self, filename: Union[str, bytes, os.PathLike]) -> None:
    """Save alignment to disk as a json or TextGrid file

    Arguments
        filename
            The location on disk to save the phoneme alignment
    """
```

#### `pypar.Alignment.start`

```python
def start(self) -> float:
    """Retrieve the start time of the alignment in seconds

    Returns
        The start time in seconds
    """
```

#### `pypar.Alignment.update`

```python
def update(
    self,
    idx: int = 0,
    durations: Optional[List[float]] = None,
    start: Optional[float] = None
) -> None:
    """Update alignment starting from phoneme index idx

    Arguments
        idx
            The index of the first phoneme whose duration is being updated
        durations
            The new phoneme durations, starting from idx
        start
            The start time of the alignment
    """
```
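One way to read these arguments is that new durations replace the old ones from
`idx` onward and all start and end times are then recomputed cumulatively. The
sketch below illustrates that reading only. It is not `pypar`'s implementation:
`update_times` is a hypothetical helper that returns new
`(label, start, end)` tuples instead of modifying an alignment in place.

```python
from typing import List, Optional, Tuple

# Hypothetical phoneme representation: (label, start, end) in seconds
Interval = Tuple[str, float, float]


def update_times(
    phonemes: List[Interval],
    idx: int = 0,
    durations: Optional[List[float]] = None,
    start: Optional[float] = None
) -> List[Interval]:
    """Recompute phoneme times after changing durations or the start time"""
    # Current duration of every phoneme
    current = [end - begin for _, begin, end in phonemes]

    # Overwrite durations starting from phoneme index idx
    if durations is not None:
        current[idx:idx + len(durations)] = durations

    # Accumulate durations from the (possibly new) start time
    time = phonemes[0][1] if start is None else start
    updated = []
    for (label, _, _), duration in zip(phonemes, current):
        updated.append((label, time, time + duration))
        time += duration
    return updated


phonemes = [('DH', 0., .25), ('AH0', .25, .5), ('SIL', .5, 1.)]
print(update_times(phonemes, idx=1, durations=[.5, .25]))
# [('DH', 0.0, 0.25), ('AH0', 0.25, 0.75), ('SIL', 0.75, 1.0)]
```

Passing only `start`, e.g. `update_times(phonemes, start=1.)`, shifts every
phoneme by the same offset while keeping the durations.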

#### `pypar.Alignment.words`

```python
def words(self) -> List[pypar.Word]:
    """Retrieve the words in the alignment

    Returns
        The words in the alignment
    """
```

#### `pypar.Alignment.word_at_time`

```python
def word_at_time(self, time: float) -> Optional[pypar.Word]:
    """Retrieve the word spoken at specified time

    Arguments
        time
            Time in seconds

    Returns
        The word spoken at the specified time (or None if time is out of bounds)
    """
```

### `pypar.Phoneme`

#### `pypar.Phoneme.__init__`

```python
def __init__(self, phoneme: str, start: float, end: float) -> None:
    """Create phoneme

    Arguments
        phoneme
            The phoneme
        start
            The start time in seconds
        end
            The end time in seconds
    """
```

#### `pypar.Phoneme.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for phonemes

    Arguments
        other
            The phoneme to compare to

    Returns
        Whether the phonemes are equal
    """
```

#### `pypar.Phoneme.__str__`

```python
def __str__(self) -> str:
    """Retrieve the phoneme text

    Returns
        The phoneme
    """
```

#### `pypar.Phoneme.duration`

```python
def duration(self) -> float:
    """Retrieve the phoneme duration

    Returns
        The duration in seconds
    """
```

#### `pypar.Phoneme.end`

```python
def end(self) -> float:
    """Retrieve the end time of the phoneme in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Phoneme.start`

```python
def start(self) -> float:
    """Retrieve the start time of the phoneme in seconds

    Returns
        The start time in seconds
    """
```

### `pypar.Word`

#### `pypar.Word.__init__`

```python
def __init__(self, word: str, phonemes: List[pypar.Phoneme]) -> None:
    """Create word

    Arguments
        word
            The word
        phonemes
            The phonemes in the word
    """
```

#### `pypar.Word.__eq__`

```python
def __eq__(self, other) -> bool:
    """Equality comparison for words

    Arguments
        other
            The word to compare to

    Returns
        Whether the words are the same
    """
```

#### `pypar.Word.__getitem__`

```python
def __getitem__(self, idx: int) -> pypar.Phoneme:
    """Retrieve the idxth phoneme

    Arguments
        idx
            The index of the phoneme to retrieve

    Returns
        The phoneme at index idx
    """
```

#### `pypar.Word.__len__`

```python
def __len__(self) -> int:
    """Retrieve the number of phonemes

    Returns
        The number of phonemes
    """
```

#### `pypar.Word.__str__`

```python
def __str__(self) -> str:
    """Retrieve the word text

    Returns
        The word text
    """
```

#### `pypar.Word.duration`

```python
def duration(self) -> float:
    """Retrieve the word duration in seconds

    Returns
        The duration in seconds
    """
```

#### `pypar.Word.end`

```python
def end(self) -> float:
    """Retrieve the end time of the word in seconds

    Returns
        The end time in seconds
    """
```

#### `pypar.Word.phoneme_at_time`

```python
def phoneme_at_time(self, time: float) -> Optional[pypar.Phoneme]:
    """Retrieve the phoneme at the specified time

    Arguments
        time
            Time in seconds

    Returns
        The phoneme at the given time (or None if time is out of bounds)
    """
```

#### `pypar.Word.start`

```python
def start(self) -> float:
    """Retrieve the start time of the word in seconds

    Returns
        The start time in seconds
    """
```

## Tests

Tests can be run as follows.

```
pip install pytest
pytest
```