https://github.com/zabir-nabil/bangla-tts

Bangla text to speech, Multilingual (Bangla, English) real-time speech synthesis library
https://github.com/zabir-nabil/bangla-tts

bangla bangla-speech bangla-speech-synthesis bangla-text-to-speech bangla-tts text-to-speech

Last synced: 7 months ago
JSON representation

Bangla text to speech, Multilingual (Bangla, English) real-time speech synthesis library

Host: GitHub
URL: https://github.com/zabir-nabil/bangla-tts
Owner: zabir-nabil
Created: 2019-11-28T18:56:04.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2024-10-17T09:23:26.000Z (about 1 year ago)
Last Synced: 2025-03-30T05:09:34.152Z (7 months ago)
Topics: bangla, bangla-speech, bangla-speech-synthesis, bangla-text-to-speech, bangla-tts, text-to-speech
Language: Python
Homepage: https://github.com/zabir-nabil/bangla-tts
Size: 3.51 MB
Stars: 88
Watchers: 5
Forks: 23
Open Issues: 7
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # Byakto TTS [bangla text to speech]

Multilingual (Bangla, English) real-time speech synthesis library.

### Installation

 * Install Anaconda

 * `conda create -n new_virtual_env python==3.6.8`

 * `conda activate new_virtual_env`

 * `pip install -r requirements.txt`

 * While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (> 500 MB)

 * For fast inference, you must install tensorflow-gpu and have an NVIDIA GPU (CUDA).

### Usage

```python

'''

function: generate(text_arr = [""], save_path = None)

arguments: 

text_arr (array) : an array of strings

save_path (string, optional) : location where generated wav files will be stored if save_path is not None, if the path is not valid, the wav files will be saved in current directory

returns:

if save_path is None, instead of saving an array of tuples containing geenrated speech signals and the sampling rate will be returned

if save_path is not None, then a list containing the file paths (relative) will be returned

'''

from bangla_tts import generate

# usage 1 (saving to path)

file_names = generate(["আমার সোনার বাংলা আমি তোমাকে ভালোবাসি"], save_path = "static") # will be saved to static folder

print(file_names)

# usage 2 (getting numpy arrays for the signals)

gen_wavs = generate(["আমার সোনার বাংলা আমি তোমাকে ভালোবাসি"]) # will return an array containing the speech and sampling rate

print(gen_wavs[0])

print(f"signal length: {gen_wavs[0][0].shape}")

print(f"samplign rate: {gen_wavs[0][1]}")

```

### Generated unseen speech samples

[Sample 1 (আমার সোনার বাংলা আমি তোমাকে ভালোবাসি)](static/0_56258.wav)

[Sample 2 (I am still not a great speaker)](static/2_235924.wav)

[Sample 3 (This is just a test)](static/3_256189.wav)

### Update (18th April)

- [x] Synthesize longer sentences

- [x] Phonetic representation for English, Bangla numeric segments

 * Added a simple parser which will translate numeric keys to corresponding phonetic representation.

 Example: *১৯৯৭ সালের ২১ জানুয়ারী তে আমার জন্ম হয়* will be converted to *['ঊনিশশ সাতানব্বই সালের একুশ জানুয়ারী তে আমার জন্ম হয় ']* by the parser.

 * Added a simple batch mechanism for translating longer sentences. As the attention window was fixed during training, the model previously failed to generate long sentences (n_characters > 200). So, added a simple segmenting step to break the sentences into multiple parts, synthesize in batch, and finally merge them into a single audio file.

 **New examples:**

 [১৯৯৭ সালের ২১ জানুয়ারী তে আমার জন্ম হয়](birthdate.wav)

 [আমার ফোন নাম্বার ০১৭১৩৩৫৩৪৩, তবে আমাকে সকাল ১০ টার আগে পাবেন না](phone_number.wav)

 [বাংলাদেশে গত ২৪ ঘণ্টায় ৩০৬ জন কোভিড-১৯ আক্রান্ত হয়েছেন। এই সময়ের মধ্যে মৃত্যু হয়েছে ৯ জনের। এ নিয়ে দেশটিতে মোট আক্রান্ত হলেন ২১৪৪। আর করোনা ভাইরাসে আক্রান্ত হয়ে মৃত্যু হয়েছে ৮৪ জনের। নতুন করে ৮ জনের পরীক্ষা করার পর করোনা ভাইরাসের উপস্থিতি পাওয়া যায়নি। এনিয়ে মোট ৬৬ জন সুস্থ হলেন।](covid19.wav) - BBC Bangla

### Test Dataset

I have designed a test dataset to compare new bangla TTS models with the benchmark model (bangla-tts).

[Bakta dataset](test_dataset/bakta_bang.json)

[Bakta dataset (multi-lingual, bangla + english)](test_dataset/bakta_bang_eng.json)

### Wiki Pages

 * [Sourceforge project page](https://sourceforge.net/projects/bangla-tts/)

 * [Wiki Homepage](https://sourceforge.net/p/bangla-tts/wiki/Home/)

### To-dos

- [ ] PyPI

- [ ] More training

- [ ] Light model

- [ ] Publish the restful API

- [ ] Publish the flask app

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/zabir-nabil/bangla-tts

Awesome Lists containing this project

README