Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/tjmahr/alignphone

Aligning sentences at the phonetic level
https://github.com/tjmahr/alignphone

Last synced: 2 months ago
JSON representation

Aligning sentences at the phonetic level

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# alignphone

The goal of alignphone is to compute phoneme-by-phoneme alignments of two
strings of words.

## Installation

Install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("tjmahr/alignphone")
```

## Example

We want to align the phonemes from the sentence "the little boy went home" with
the sentence "the boy went home".

First, we create `alignment_utterance()` for each sentence. These provide an
orthographic version of the sentence and the constituent phonemes. Our lab
used/uses an ASCII substitute for IPA, so I am going to convert the provide the
phonemes in "wiscbet" and convert them to IPA before creating the alignment
utterance.

```{r example}
library(alignphone)

a <- wiscbet_to_ipa(
"dh", "4", " ",
"l", "I", ".", "t", "4", "l", " ",
"b", "cI", " ",
"r", "ae", "n", " ",
"h", "oU", "m"
) |>
alignment_utterance("the little boy went home")
a

b <- wiscbet_to_ipa(
"dh", "4", " ",
"b", "cI", " ",
"w", "E", "n", "t", " ",
"h", "oU", "m"
)|>
alignment_utterance("the boy went home")
b
```

Now, we can align the two utterances.

```{r}
ab1 <- align_phones(a, b)
ab1
```

The goal of the alignment algorithm is to create the best-scoring pairing of
phonemes from each utterance. Matching phonemes get positive points, and
mismatching phonemes get negative points. Here is what we see in the above
example:

- The initial /ð ə/ phonemes are exact matches and pair off perfectly. These
full-score pairings are indicated with `|` character.

- The space ` ` after the word "boy" in each sentence is also aligned.

- The word "little" is completely missing in one of the sentences but the word
"boy" is in both sentences. So, the alignment inserts gap characters (`-`)
as needed until the /b ɔɪ/ characters are aligned.

- "ran" and "went" are aligned so that the shared /n/ is fully scored.

- The phonemes /h o m/ are also aligned.

By default, the alignment only rewards exact matches (`|`), and gap insertion is
penalized. The `as.data.frame()` method for the alignment shows the alignment
with scores.

```{r}
as.data.frame(ab1)
```

A custom comparison function can be used for alignment. Note that the pairing of
the word boundary character (` `) or the syllable boundary (`.`) with a gap
(`-`) is still penalized. This package provides `phone_match_partial()` which
assigns partial credit based similar phonetic features, such as `w` and `r`.
Partial matches appear in the alignment as `:`.

```{r}
ab2 <- align_phones(a, b, fun_match = phone_match_partial)
ab2

as.data.frame(ab2)
```

Here /r/ and /w/ as well as the two aligned mismatching vowels are given partial
credit.

### indel penalties

Above, we saw that changing the phoneme-similarity scoring system yielded a
different alignment. The other contributor for alignment scoring are "indel"
penalties. These are penalties for creating a gap (`indel_create` with a default
penalty of -2) and extending a gap (`indel_extend` with a default of -1).

Consider another example. A child speaker said "baby sock" and a listener
transcribed it as "we restock". With the default scoring rules
(`phone_match_exact`), we get the following alignment:

```{r}
a <- wiscbet_to_ipa("b", "eI", "b", "i", "s", "@", "k") |>
alignment_utterance("baby sock")
b <- wiscbet_to_ipa("w", "i", "r", "I", "s", "t", "@", "k") |>
alignment_utterance("we restock")
align_phones(a, b, phone_match_exact)
```

Which only inserts a single gap for `t`. If we lower the penalty for creating a
gap, we see that it tries to align the two /i/ vowels.

```{r}
align_phones(a, b, phone_match_exact, indel_create = -.5)
```

Things become pathological when we remove the penalty on gap extension.

```{r}
align_phones(a, b, phone_match_exact, indel_extend = 0)
```

To get the most phonetically plausible alignment, we can assign partial
credit.

```{r}
align_phones(a, b, phone_match_partial)
```

I am not sure what to do with boundaries. We can word and syllable boundaries
so that alignments will get credit for following matching the prosodic structure
of the words, but for this example, this adjustment doesn't matter.

```{r}
a <- wiscbet_to_ipa("b", "eI", ".", "b", "i", " ", "s", "@", "k") |>
alignment_utterance("baby sock")
b <- wiscbet_to_ipa("w", "i", " ", "r", "I", ".", "s", "t", "@", "k") |>
alignment_utterance("we restock")
align_phones(a, b, phone_match_exact)
align_phones(a, b, phone_match_partial)
```

## More tests

### pretty-buddy

This is an important benchmark. Given "pretty" for *buddy*, what sound
should align at the /b/"? The default approach pairs [b] and [r]. A partial
credit scheme that assigns -.6 to mismatched vowels and -.4 to mismatched
consonants will pair [b] and [r]. Our partial credit function assigns -.2 to
mismatches that are very similar. In this case, it aligns [b] and [p].

```{r}
buddy <- str_split_at_hyphens("b-^-d-i") |>
wiscbet_to_ipa() |>
alignment_utterance("buddy")
pretty <- str_split_at_hyphens("p-r-I-t-i") |>
wiscbet_to_ipa() |>
alignment_utterance("pretty")

align_phones(buddy, pretty)

align_phones(buddy, pretty, phone_match_partial)
align_phones(buddy, pretty, phone_match_partial) |> as.data.frame()
```

### Deleted words

This is okay.

```{r}
point_to_teddy <- clean_old_alignment_result("p-cI-n-t t-u t-E-d-i") |>
wiscbet_to_ipa() |>
alignment_utterance("point to teddy")
point_teddy <- clean_old_alignment_result("p-cI-n-t ----t-E-d-i") |>
wiscbet_to_ipa() |>
alignment_utterance("point teddy")

align_phones(point_to_teddy, point_teddy, phone_match_partial)
```

This treats *hotdogs* and *hot dogs* as a mismatch on the space. It shouldn't.

```{r}
a <- "dh-oU-z i-t dh-oU-z h-@-t-d-c-g-z s-u-n" |>
clean_old_alignment_result() |>
wiscbet_to_ipa() |>
alignment_utterance("those eat those hotdogs soon")
a

b <- "--------------------h-@-t d-c-g-z------" |>
clean_old_alignment_result() |>
wiscbet_to_ipa() |>
alignment_utterance("hot dogs")
b

r1 <- align_phones(a, b, phone_match_partial)
r1
```

This one works when there are word gap characters.

```{r}
he_wants_somebody_to_push_him <- clean_old_alignment_result(
"h-i w-^-n-t-s s-^-m-b-^-d-i t-u p-U-sh h-I-m"
) |>
wiscbet_to_ipa() |>
alignment_utterance("he wants somebody to push him")

he_wants_to_push_him <- clean_old_alignment_result(
"h-i w-^-n-t-s --------------t-u p-U-sh h-I-m"
) |>
wiscbet_to_ipa() |>
alignment_utterance("he wants to push him")

r1 <- align_phones(
he_wants_somebody_to_push_him,
he_wants_to_push_him,
phone_match_partial
)

r1
```

But when the gaps are removed, it does not work.

```{r}
remove_gaps <- function(x) {
gaps <- x$phones %in% c(" ", ".")
x$phones <- x$phones[!gaps]
x
}

r1 <- align_phones(
he_wants_somebody_to_push_him |> remove_gaps(),
he_wants_to_push_him |> remove_gaps(),
phone_match_partial
)
r1
```

### the swift benchmark

I really want this one to match `b`-`v` and `li`-`lI`, but I have to ignore word
boundaries entirely. I need to figure out how to favor the kind of alignment.

```{r}
x <- wiscbet_to_ipa(
"l", "@", "ng",
"l", "I", "s", "t",
"^", "v",
"E", "k", "s",
"l", "^", "v", "4^", "z"
)

y <- wiscbet_to_ipa(
"l", "oU", "n", "l", "i",
"s", "t", "@", "r", "b", "^", "k", "s",
"l", "^", "v", "4^", "z"
)

z <- align_phones(x, y, phone_match_partial) |>
set_utterance_labels(
"long list of ex-lovers",
"lonely starbucks lovers"
) |>
print()

x2 <- wiscbet_to_ipa(
"l", "@", "ng", " ",
"l", "I", "s", "t", " ",
"^", "v", " ",
"E", "k", "s", " ",
"l", "^", ".", "v", "4^", "z"
)

y2 <- wiscbet_to_ipa(
"l", "oU", "n", ".", "l", "i", " ",
"s", "t", "@", "r", ".", "b", "^", "k", "s", " ",
"l", "^", ".", "v", "4^", "z"
)

z <- align_phones(x2, y2, phone_match_partial) |>
set_utterance_labels(
"long list of ex-lovers",
"lonely starbucks lovers"
) |>
print()

```