https://github.com/catalystcode/autocustomvocab
Automatically Generate Custom Vocabulary List for Microsoft Cognitive Services Custom Speech Service
- Host: GitHub
- URL: https://github.com/catalystcode/autocustomvocab
- Owner: CatalystCode
- License: apache-2.0
- Created: 2017-04-03T19:02:01.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2017-04-04T17:18:49.000Z (almost 9 years ago)
- Last Synced: 2025-01-22T15:47:59.885Z (about 1 year ago)
- Language: R
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Automatically Generate Custom Vocabulary List for Microsoft Cognitive Services Custom Speech Service
Automatically generate a custom vocabulary list for Cognitive Services Custom Speech Service from your text corpus.
This R script parses a text file of target text into single words and ngrams, compares them against lists of common words and ngrams, and outputs the less frequent ones. You can then review these less frequent words and ngrams and build your custom vocabulary list for training the Custom Speech Service.
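The workflow described above can be sketched in a few lines of base R. This is only an illustration and is not taken from the repository's script: the function names (`tokenize`, `make_ngrams`, `uncommon_terms`), the sample corpus, and the tiny lists of common words and bigrams are all assumptions made for the example.

```r
# Hypothetical helpers; names and sample data are assumptions, not the repo's code.

# Split text into lowercase word tokens, dropping empty strings.
tokenize <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z0-9']+"))
  words[nzchar(words)]
}

# Build ngrams by joining each run of n consecutive tokens with a space.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Keep the distinct terms that are absent from a list of common terms.
uncommon_terms <- function(terms, common) {
  sort(unique(setdiff(terms, tolower(common))))
}

# Toy inputs standing in for the target text and the common-term lists.
corpus <- c("The patient was given acetaminophen.",
            "Acetaminophen reduced the fever.")
common_words <- c("the", "was", "given", "patient", "reduced", "fever")
common_bigrams <- c("the patient", "patient was", "was given",
                    "reduced the", "the fever")

# Tokenize line by line so ngrams never span two lines.
tokens_per_line <- lapply(corpus, tokenize)
words <- unlist(tokens_per_line)
bigrams <- unlist(lapply(tokens_per_line, make_ngrams, n = 2))

# Candidate vocabulary: less frequent words followed by less frequent bigrams.
candidates <- c(uncommon_terms(words, common_words),
                uncommon_terms(bigrams, common_bigrams))
print(candidates)
```

The printed candidates are meant for manual review; only the terms you accept would go into the custom vocabulary list. A real run would load the target text with something like `readLines()` and use much larger common-term lists.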