https://github.com/cloudkj/ngram-syllables
Syllable counting and detection using an n-gram language model.
clojure language-model lisp ngrams syllable-count
- Host: GitHub
- URL: https://github.com/cloudkj/ngram-syllables
- Owner: cloudkj
- License: MIT
- Created: 2016-12-10T07:53:41.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-02-18T07:25:56.000Z (over 8 years ago)
- Last Synced: 2025-03-29T03:51:19.015Z (about 2 months ago)
- Topics: clojure, language-model, lisp, ngrams, syllable-count
- Language: Clojure
- Size: 604 KB
- Stars: 7
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ngram-syllables
Syllable counting and detection using an n-gram language model.
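
The predictions mark syllable boundaries with a delimiter (see the Example section), so a syllable count can be read off as the number of delimited pieces in each word. A minimal shell sketch, assuming one delimited word per line on stdin and a single-character delimiter such as `·`; `count_syllables` is a hypothetical helper for illustration and is not part of this repository:

```shell
# Hypothetical helper (not shipped with ngram-syllables): read delimited
# words from stdin, one per line, and print each word followed by a tab
# and its number of syllables (the number of delimiter-separated fields).
# The optional first argument is the delimiter; it defaults to "·".
count_syllables() {
  delim="${1:-·}"
  # NF is the field count after splitting on the delimiter;
  # the bare NF pattern skips empty lines.
  awk -F"$delim" 'NF { print $0 "\t" NF }'
}

# Sample input copied from the Example output in this README.
printf '%s\n' 'char·mel·e·on' 'squirt·le' 'bulb·a·saur' | count_syllables
```

For the three sample words this prints counts of 4, 2, and 3. In practice the helper could sit at the end of a `predict.sh` pipeline, provided the same delimiter is passed to both commands.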
## Usage
Training
```
Usage: lein run -m ngram-syllables.train [options] corpus

Options:
  -h, --help
  -n, --n GRAMS      1                 Number of grams
  -o, --output FILE  target/model.edn  Path to desired output location of model
```
Predictions
```
Usage: lein run -m ngram-syllables.predict [options] weight_1 ... weight_n

Options:
  -d, --delim DELIM  Empty space       Output syllable delimiter
  -h, --help
  -m, --model FILE   target/model.edn  Path to location of model
```
## Example
Generate syllable boundaries for some words not in the training corpus.
```
% ./train.sh
Training model with n = 3
17490 1-gram sequences
17489 2-gram sequences
7434 3-gram sequences
Output: target/model.edn
% head -n 20 resources/pokemon_names.txt | ./predict.sh --delim · 0.1 0.1 0.8
bulb·a·saur
i·vy·saur
ven·u·saur
char·man·der
char·mel·e·on
char·i·zard
squirt·le
war·tor·tle
blast·o·ise
ca·ter·pie
met·a·pod
but·ter·free
weed·le
ka·ku·na
bee·drill
pid·gey
pid·ge·ot·to
pid·ge·ot
rat·ta·ta
ra·ti·cate
```
## License
Copyright © 2016-2017