https://github.com/jcvasquezc/phonet

Keras-based python framework to compute phonological posterior probabilities from audio files
https://github.com/jcvasquezc/phonet

deep-learning deep-neural-networks linguistic-analysis linguistics phonetics speech-processing

Last synced: 6 months ago
JSON representation

Keras-based python framework to compute phonological posterior probabilities from audio files

Host: GitHub
URL: https://github.com/jcvasquezc/phonet
Owner: jcvasquezc
License: mit
Created: 2019-03-09T12:18:17.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2022-12-27T15:37:24.000Z (over 3 years ago)
Last Synced: 2025-09-25T00:54:53.660Z (10 months ago)
Topics: deep-learning, deep-neural-networks, linguistic-analysis, linguistics, phonetics, speech-processing
Language: Python
Homepage:
Size: 23 MB
Stars: 43
Watchers: 2
Forks: 20
Open Issues: 5
Metadata Files:
- Readme: README.rst
- License: LICENSE

Awesome Lists containing this project

README

          ======================

Phonet

======================

.. image:: https://readthedocs.org/projects/phonet/badge/?version=latest

:target: https://phonet.readthedocs.io/en/latest/?badge=latest

:alt: Documentation Status

.. image:: https://travis-ci.org/jcvasquezc/phonet.svg?branch=master

    :target: https://travis-ci.org/jcvasquezc/phonet

This toolkit compute posteriors probabilities of phonological classes from audio files for several groups of phonemes according to the mode and manner of articulation.

If you are not sure about what phonological classes are, have a look at this

`Phonological classes tutorial `_

`Project Documentation `_

`Paper `_

The list of the phonological classes available and the phonemes that are activated for each phonological class are observed in the following Table

The list of the phonological classes available and the phonemes that are activated for each phonological class are observed in the following Table

==================    ================================================================================

Phonological class    Phonemes

==================    ================================================================================

vocalic               /a/, /e/, /i/, /o/, /u/

consonantal           /b/, /tS/, /d/, /f/, /g/, /x/, /k/, /l/, /ʎ/, /m/, /n/, /p/, /ɾ/, /r/, /s/, /t/

back                  /a/, /o/, /u/

anterior              /e/, /i/

open                  /a/, /e/, /o/

close                 /i/, /u/

nasal                 /m/, /n/

stop                  /p/, /b/, /t/, /k/, /g/, /tS/, /d/

continuant            /f/, /b/, /tS/, /d/, /s/, /g/, /ʎ/, /x/

lateral               /l/

flap                  /ɾ/

trill                 /r/

voiced                /a/, /e/, /i/, /o/, /u/, /b/, /d/, /l/, /m/, /n/, /r/, /g/, /ʎ/

strident              /f/, /s/, /tS/

labial                /m/, /p/, /b/, /f/

dental                /t/, /d/

velar                 /k/, /g/, /x/

pause                 /sil/

==================    ================================================================================

Installation

============

From this repository::

    git clone https://github.com/jcvasquezc/phonet

    cd phonet

    python setup.py

Usage

=====

Supported features:

- Estimate probabilities of phonological classes for an audio file

`Example use `_

Estimation of phonological classes

====================================

Estimate the phonological classes using the BGRU models for an audio file or for a folder that contains audio files inside::

      python

      phon=Phonet([phonclass])

      get_phon_wav(self, audio_file, feat_file, plot_flag=True)

============= ===========

Parameters     Description

============= ===========

audio_file    file audio (.wav) sampled at 16 kHz

feat_file     file (.csv) to save the posteriors for the phonological classes

phonclass     list of phonological classes to be evaluated

              The list of phonological classes include:

              "consonantal", "back", "anterior", "open", "close", "nasal", "stop",

              "continuant",  "lateral", "flap", "trill", "voice", "strident",

              "labial", "dental", "velar", "pause", "vocalic" or "all"

plot_flag     True or False, whether you want plots of phonological classes or not

returns			  It crates the feat_file with the estimation of the phonological classes for each time-frame of the audio file.

============= ===========

Training

====================================

If you want to train Phonet in your own language, or specific phonological classes that are not defined here, please refer to the folder `train `_ and follow the instructions there.

If you experienced problems with the Training process, please send me an email ``

Reference

==================================

Phonet is available for research purposes

If you use Phonet, please cite the following paper.

@inproceedings{Vasquez-Correa2019,

  author={J. C. Vásquez-Correa and P. Klumpp and J. R. Orozco-Arroyave and E. N\"oth},

  title={{Phonet: A Tool Based on Gated Recurrent Neural Networks to Extract Phonological Posteriors from Speech}},

  year=2019,

  booktitle={Proc. Interspeech 2019},

  pages={549--553},

  doi={10.21437/Interspeech.2019-1405},

  url={http://dx.doi.org/10.21437/Interspeech.2019-1405}

}

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/jcvasquezc/phonet

Awesome Lists containing this project

README