https://github.com/syoyo/tacotron-tts-cpp
Tacotron text to speech in C++(synthesize only)
- Host: GitHub
- URL: https://github.com/syoyo/tacotron-tts-cpp
- Owner: syoyo
- License: mit
- Created: 2018-10-04T16:01:11.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-10-17T05:20:01.000Z (about 5 years ago)
- Last Synced: 2023-04-11T17:06:28.730Z (over 1 year ago)
- Language: C++
- Size: 25.3 MB
- Stars: 71
- Watchers: 9
- Forks: 24
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Text-to-speech in (partially) C++ using Tacotron model + TensorFlow
This project runs the Tacotron model through the TensorFlow C++ API.
This is useful for running TTS on mobile or embedded devices.
The code is based on keithito's Tacotron implementation: https://github.com/keithito/tacotron
## Status
Experimental.
Python preprocessing is required to generate sequence data from text.
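To give a sense of what a C++ port of that preprocessing might involve, here is a minimal sketch of two of the simpler cleaning steps: converting to lower case and removing extra whitespace. The function names (`ToLowerAscii`, `CollapseWhitespace`, `CleanText`) are hypothetical and are not part of this repo. The sketch assumes ASCII input, trims leading and trailing whitespace as an extra assumption, and does not cover abbreviation expansion, number normalization, or the mapping from characters to sequence IDs.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of this repo): lower-case ASCII letters.
// Bytes outside the ASCII range are left unchanged in the default "C" locale.
std::string ToLowerAscii(const std::string &text) {
  std::string out;
  out.reserve(text.size());
  for (unsigned char c : text) {
    out.push_back(static_cast<char>(std::tolower(c)));
  }
  return out;
}

// Hypothetical helper: collapse each run of whitespace into a single space.
// Assumption of this sketch: leading and trailing whitespace is dropped too.
std::string CollapseWhitespace(const std::string &text) {
  std::string out;
  out.reserve(text.size());
  bool pending_space = false;
  for (unsigned char c : text) {
    if (std::isspace(c)) {
      // Remember the gap, but only if something has already been emitted,
      // so that leading whitespace is skipped.
      pending_space = !out.empty();
    } else {
      if (pending_space) out.push_back(' ');
      pending_space = false;
      out.push_back(static_cast<char>(c));
    }
  }
  // A trailing gap is never flushed, so trailing whitespace is dropped.
  return out;
}

// Hypothetical entry point chaining the two steps above.
std::string CleanText(const std::string &text) {
  return CollapseWhitespace(ToLowerAscii(text));
}
```

With these definitions, `CleanText("  Hello   World \n")` yields `hello world`. The result would still need to be converted to a sequence of symbol IDs before it could be passed to the model, so this does not replace the Python step.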
## Requirements
* TensorFlow r1.8+
* Ubuntu 16.04 or later
* C++ compiler + cmake
## Dump graph
In keithito's tacotron repo, add a `tf.train.write_graph` call at the end of `Synthesizer.load` to save the TensorFlow graph.
```
class Synthesizer:
  def load(self, checkpoint_path, model_name='tacotron'):
    ...
    # write graph
    tf.train.write_graph(self.session.graph.as_graph_def(), "models/", "graph.pb")
```
## Freeze graph
Freeze the graph, for example:
```
freeze_graph \
--input_graph=models/graph.pb \
--input_checkpoint=./tacotron-20180906/model.ckpt \
--output_graph=models/tacotron_frozen.pb \
--output_node_names=model/griffinlim/Squeeze
```
An example frozen graph file is included in this repo.
## Build
Edit the path to `libtensorflow_cc.so` in `bootstrap.sh` (this assumes you built TensorFlow from source), then run:
```
$ ./bootstrap.sh
$ cd build
$ make
```
### Note on libtensorflow_cc
Make sure to build libtensorflow_cc with `--config=monolithic`. Otherwise you will hit undefined symbol errors at the linking stage.
https://www.tensorflow.org/install/source#preconfigured_configurations
## Run
Prepare a sequence JSON file.
The sequence can be generated with the `text_to_sequence()` function in keithito's tacotron repo. See `sample/sequence01.json` for a generated example.
Then run:
```
$ ./tts -i ../sample/sequence01.json -g ../tacotron_frozen.pb output.wav
```
Example files `output01.wav` and `processed01.wav` are included in `sample/`.
### Optional parameters
You can specify hyperparameter settings (JSON format) with the `-h` option.
See `sample/hparams.json` for an example.
```
$ ./tts -i ../sample/sequence01.json -h ../sample/hparams.json -g ../tacotron_frozen.pb output.wav
```
## Performance
Currently the TensorFlow C++ code path uses only a single CPU core, so it is slow.
Synthesis takes roughly 10x the length of the synthesized audio on a 2018-era CPU (e.g. 60 secs for 6 secs of audio).
## TODO
* Write the whole TTS pipeline in C++
  * [ ] Text to sequence (Issue #1)
    * [ ] Convert to lower case
    * [ ] Expand abbreviations
    * [ ] Normalize numbers (`number_to_words`, equivalent of Python's inflect)
    * [ ] Remove extra whitespace
  * [ ] Use CPU implementation of Griffin-Lim
## License
MIT license.
The pretrained model used for freezing the graph was obtained from keithito's repo.
### Third party licenses
- json.hpp : MIT license
- cxxopts.hpp : MIT license
- dr_wav : Public domain