Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/tjmahr/alignphone

Aligning sentences at the phonetic level
https://github.com/tjmahr/alignphone
Last synced: 2 months ago
JSON representation
Aligning sentences at the phonetic level
Host: GitHub
URL: https://github.com/tjmahr/alignphone
Owner: tjmahr
License: gpl-3.0
Created: 2018-10-01T13:18:28.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-10-31T14:06:19.000Z (2 months ago)
Last Synced: 2024-10-31T15:19:25.440Z (2 months ago)
Language: R
Homepage:
Size: 81.1 KB
Stars: 3
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project

README

        ---

output: github_document

---

```{r, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE,

  comment = "#>",

  fig.path = "man/figures/README-",

  out.width = "100%"

)

```

# alignphone

The goal of alignphone is to compute phoneme-by-phoneme alignments of two

strings of words.

## Installation

Install the development version from [GitHub](https://github.com/) with:

``` r

# install.packages("devtools")

devtools::install_github("tjmahr/alignphone")

```

## Example

We want to align the phonemes from the sentence "the little boy went home" with

the sentence "the boy went home".

First, we create `alignment_utterance()` for each sentence. These provide an

orthographic version of the sentence and the constituent phonemes. Our lab

used/uses an ASCII substitute for IPA, so I am going to convert the provide the

phonemes in "wiscbet" and convert them to IPA before creating the alignment 

utterance.

```{r example}

library(alignphone)

a <- wiscbet_to_ipa(

  "dh", "4", " ", 

  "l", "I", ".", "t", "4", "l", " ", 

  "b", "cI", " ", 

  "r", "ae", "n", " ",  

  "h", "oU", "m"

) |> 

  alignment_utterance("the little boy went home")

a

b <- wiscbet_to_ipa(

  "dh", "4", " ", 

  "b", "cI", " ", 

  "w", "E", "n", "t", " ", 

  "h", "oU", "m"

)|> 

  alignment_utterance("the boy went home")

b

```

Now, we can align the two utterances.

```{r}

ab1 <- align_phones(a, b)

ab1

```

The goal of the alignment algorithm is to create the best-scoring pairing of

phonemes from each utterance. Matching phonemes get positive points, and

mismatching phonemes get negative points. Here is what we see in the above

example:

  - The initial /ð ə/ phonemes are exact matches and pair off perfectly. These

    full-score pairings are indicated with `|` character.

    

  - The space ` ` after the word "boy" in each sentence is also aligned.

  

  - The word "little" is completely missing in one of the sentences but the word

    "boy" is in both sentences. So, the alignment inserts gap characters (`-`)

    as needed until the /b ɔɪ/ characters are aligned.

    

  - "ran" and "went" are aligned so that the shared /n/ is fully scored.

  

  - The phonemes /h o m/ are also aligned.

By default, the alignment only rewards exact matches (`|`), and gap insertion is

penalized. The `as.data.frame()` method for the alignment shows the alignment 

with scores.

```{r}

as.data.frame(ab1)

```

A custom comparison function can be used for alignment. Note that the pairing of

the word boundary character (`   `) or the syllable boundary (`.`) with a gap

(`-`) is still penalized. This package provides `phone_match_partial()` which

assigns partial credit based similar phonetic features, such as `w` and `r`.

Partial matches appear in the alignment as `:`. 

```{r}

ab2 <- align_phones(a, b, fun_match = phone_match_partial)

ab2

as.data.frame(ab2)

```

Here /r/ and /w/ as well as the two aligned mismatching vowels are given partial

credit.

### indel penalties

Above, we saw that changing the phoneme-similarity scoring system yielded a

different alignment. The other contributor for alignment scoring are "indel"

penalties. These are penalties for creating a gap (`indel_create` with a default

penalty of -2) and extending a gap (`indel_extend` with a default of -1).

Consider another example. A child speaker said "baby sock" and a listener

transcribed it as "we restock". With the default scoring rules

(`phone_match_exact`), we get the following alignment:

```{r}

a <- wiscbet_to_ipa("b", "eI", "b", "i", "s", "@", "k") |> 

  alignment_utterance("baby sock")

b <- wiscbet_to_ipa("w", "i", "r", "I", "s", "t", "@", "k") |> 

  alignment_utterance("we restock")

align_phones(a, b, phone_match_exact)

```

Which only inserts a single gap for `t`. If we lower the penalty for creating a 

gap, we see that it tries to align the two /i/ vowels.

```{r}

align_phones(a, b, phone_match_exact, indel_create = -.5)

```

Things become pathological when we remove the penalty on gap extension.

```{r}

align_phones(a, b, phone_match_exact, indel_extend = 0)

```

To get the most phonetically plausible alignment, we can assign partial

credit.

```{r}

align_phones(a, b, phone_match_partial)

```

I am not sure what to do with boundaries. We can word and syllable boundaries 

so that alignments will get credit for following matching the prosodic structure

of the words, but for this example, this adjustment doesn't matter.

```{r}

a <- wiscbet_to_ipa("b", "eI", ".", "b", "i", " ", "s", "@", "k") |> 

  alignment_utterance("baby sock")

b <- wiscbet_to_ipa("w", "i", " ", "r", "I", ".", "s", "t", "@", "k") |> 

  alignment_utterance("we restock")

align_phones(a, b, phone_match_exact)

align_phones(a, b, phone_match_partial)

```

## More tests

### pretty-buddy

This is an important benchmark. Given "pretty" for *buddy*, what sound

should align at the /b/"? The default approach pairs [b] and [r]. A partial

credit scheme that assigns -.6 to mismatched vowels and -.4 to mismatched

consonants will pair [b] and [r]. Our partial credit function assigns -.2 to

mismatches that are very similar. In this case, it aligns [b] and [p].

```{r}

buddy <- str_split_at_hyphens("b-^-d-i") |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("buddy")

pretty <- str_split_at_hyphens("p-r-I-t-i") |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("pretty")

align_phones(buddy, pretty)

align_phones(buddy, pretty, phone_match_partial)

align_phones(buddy, pretty, phone_match_partial) |> as.data.frame()

```

### Deleted words

This is okay. 

```{r}

point_to_teddy <- clean_old_alignment_result("p-cI-n-t t-u t-E-d-i") |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("point to teddy")

point_teddy <- clean_old_alignment_result("p-cI-n-t ----t-E-d-i") |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("point teddy")

align_phones(point_to_teddy, point_teddy, phone_match_partial)

```

This treats *hotdogs* and *hot dogs* as a mismatch on the space. It shouldn't.

```{r}

a <- "dh-oU-z i-t dh-oU-z h-@-t-d-c-g-z s-u-n" |> 

  clean_old_alignment_result() |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("those eat those hotdogs soon")

a

b <- "--------------------h-@-t  d-c-g-z------" |> 

  clean_old_alignment_result() |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("hot dogs")

b

r1 <- align_phones(a, b, phone_match_partial)

r1

```

This one works when there are word gap characters.

```{r}

he_wants_somebody_to_push_him <- clean_old_alignment_result(

  "h-i w-^-n-t-s s-^-m-b-^-d-i t-u p-U-sh h-I-m"

) |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("he wants somebody to push him")  

he_wants_to_push_him <- clean_old_alignment_result(

  "h-i w-^-n-t-s --------------t-u p-U-sh h-I-m"

) |> 

  wiscbet_to_ipa() |> 

  alignment_utterance("he wants to push him")  

r1 <- align_phones(

  he_wants_somebody_to_push_him, 

  he_wants_to_push_him, 

  phone_match_partial

)

r1

```

But when the gaps are removed, it does not work.

```{r}

remove_gaps <- function(x) {

  gaps <- x$phones %in% c(" ", ".")

  x$phones <- x$phones[!gaps]

  x

}

r1 <- align_phones(

  he_wants_somebody_to_push_him |> remove_gaps(), 

  he_wants_to_push_him |> remove_gaps(), 

  phone_match_partial

)

r1

```

### the swift benchmark

I really want this one to match `b`-`v` and `li`-`lI`, but I have to ignore word

boundaries entirely. I need to figure out how to favor the kind of alignment.

```{r}

x <- wiscbet_to_ipa(

    "l", "@", "ng", 

    "l", "I", "s", "t", 

    "^", "v", 

    "E", "k", "s", 

    "l", "^", "v", "4^", "z"

  )

y <- wiscbet_to_ipa(

    "l", "oU", "n", "l", "i", 

    "s", "t", "@", "r", "b", "^", "k", "s", 

    "l", "^", "v", "4^", "z"

  )

z <- align_phones(x, y, phone_match_partial) |> 

  set_utterance_labels(

    "long list of ex-lovers",

    "lonely starbucks lovers"

  ) |> 

  print() 

x2 <- wiscbet_to_ipa(

    "l", "@", "ng", " ",

    "l", "I", "s", "t", " ",

    "^", "v", " ",

    "E", "k", "s", " ", 

    "l", "^", ".", "v", "4^", "z"

  )

y2 <- wiscbet_to_ipa(

    "l", "oU", "n", ".", "l", "i", " ",

    "s", "t", "@", "r", ".", "b", "^", "k", "s", " ",

    "l", "^", ".", "v", "4^", "z"

  )

z <- align_phones(x2, y2, phone_match_partial) |> 

  set_utterance_labels(

    "long list of ex-lovers",

    "lonely starbucks lovers"

  ) |> 

  print() 

```