https://github.com/x-lance/storytts

[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

Host: GitHub
URL: https://github.com/x-lance/storytts
Owner: X-LANCE
License: other
Created: 2023-09-07T11:04:15.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-04-27T14:49:36.000Z (over 1 year ago)
Last Synced: 2025-01-11T21:11:02.921Z (9 months ago)
Language: HTML
Homepage: https://goarsenal.github.io/StoryTTS/
Size: 25.7 MB
Stars: 137
Watchers: 17
Forks: 4
Open Issues: 2
Metadata Files:
- Readme: README.md

# StoryTTS

> [STORYTTS: A HIGHLY EXPRESSIVE TEXT-TO-SPEECH DATASET WITH RICH TEXTUAL EXPRESSIVENESS ANNOTATIONS](https://ieeexplore.ieee.org/document/10446023)

StoryTTS is a highly expressive text-to-speech dataset that is rich in expressiveness from both acoustic and textual perspectives. It is built from recordings of a Mandarin storytelling show (评书) delivered by a female artist, Lian Liru (连丽如). It contains 61 hours of consecutive, highly prosodic speech with accurate text transcriptions and rich textual expressiveness annotations.

[Demos](https://goarsenal.github.io/StoryTTS/)

## Dataset Statistics





## Download

* Please download the speech data from [Hugging Face](https://huggingface.co/datasets/Arsenal/StoryTTS) or [ModelScope](https://modelscope.cn/api/v1/datasets/CantabileKwok/StoryTTS/repo?Revision=master&FilePath=StoryTTS.zip).

  ### Note

  * The dataset is **ONLY** for research purposes.

  * The ownership of the speech data remains with the original owner. By downloading this dataset, you agree to our [licensing agreement](storytts_license_agreement.pdf). Note that these materials may be removed at any time at the request of the original owner.

## File Description

* `dataset/transcript`: The transcripts of StoryTTS in simplified Chinese with punctuation.

* `dataset/utt2dur`: The duration (in seconds) of each utterance.

* `dataset/utt2spk`: The speaker name of each utterance, i.e. the name of the only speaker in StoryTTS.

* `dataset/label` : The annotation labels of StoryTTS. The format of this file is as follows:

  ```
  utt-ID 句式(Sentence Pattern)|修辞手法(Rhetoric Device)|场景(Scene)|情感色彩(Emotional colors)|模仿人物(Imitated Characters)
  ```

* `dataset/prompt_claude2`: Prompt and instruction for Claude 2.

* `dataset/prompt_gpt4`: Prompt and instruction for GPT-4.

* `dataset/wav.scp`: Paths of the wav files. Note: these paths may need to be updated to match where you store the speech data.
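
The files above can be read with a few lines of Python. The following is a minimal sketch, not part of the repository: the function and class names are hypothetical, the sample values in the usage note are made up for illustration, and it assumes that each line of `dataset/label` follows the format shown above (an utterance ID, whitespace, then five `|`-separated fields) and that `dataset/utt2dur` uses one whitespace-separated `utt-ID duration` pair per line.

```python
from typing import NamedTuple, Tuple


class ExpressivenessLabel(NamedTuple):
    """The five annotation fields, in the order given by the label format."""
    sentence_pattern: str
    rhetoric_device: str
    scene: str
    emotional_colors: str
    imitated_characters: str


def parse_label_line(line: str) -> Tuple[str, ExpressivenessLabel]:
    """Split one line of dataset/label into (utt_id, label).

    Assumes the utterance ID is separated from the fields by whitespace
    and that exactly five '|'-separated fields follow.
    """
    utt_id, rest = line.strip().split(maxsplit=1)
    fields = [field.strip() for field in rest.split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields for {utt_id}, got {len(fields)}")
    return utt_id, ExpressivenessLabel(*fields)


def parse_utt2dur_line(line: str) -> Tuple[str, float]:
    """Split one line of dataset/utt2dur into (utt_id, seconds).

    Assumes a whitespace-separated 'utt-ID duration' layout.
    """
    utt_id, duration = line.split(maxsplit=1)
    return utt_id, float(duration)
```

For example, `parse_label_line("utt_0001 陈述句|比喻|对话|愤怒|无")` returns the ID `utt_0001` together with a label whose `scene` field is `对话`. The same two-column assumption would need to be checked against the actual files before relying on it for `wav.scp` or `utt2spk`.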

## Citation

```
@inproceedings{storytts,
  author={Sen Liu and Yiwei Guo and Xie Chen and Kai Yu},
  title={{StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations}},
  year={2024},
  booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={11521-11525},
  doi={10.1109/ICASSP48485.2024.10446023}
}
```
