https://github.com/flogy/gatsby-mdx-tts

🗣 Adds speech output to your Gatsby site using Amazon Polly.
https://github.com/flogy/gatsby-mdx-tts

a11y aws gatsby gatsby-plugin markdown mdx polly remark text-to-speech tts


Host: GitHub
URL: https://github.com/flogy/gatsby-mdx-tts
Owner: flogy
License: mit
Created: 2020-02-18T19:03:43.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2023-04-18T16:56:45.000Z (about 2 years ago)
Last Synced: 2025-04-16T17:55:11.034Z (3 months ago)
Topics: a11y, aws, gatsby, gatsby-plugin, markdown, mdx, polly, remark, text-to-speech, tts
Language: TypeScript
Homepage: https://gatsby-mdx-tts.netlify.com
Size: 1.22 MB
Stars: 9
Watchers: 2
Forks: 2
Open Issues: 14
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE


![Logo](./img/gatsby-mdx-tts.svg)

> 🗣 Easy **text-to-speech** for your [Gatsby](https://www.gatsbyjs.org/) site, powered by [Amazon Polly](https://aws.amazon.com/de/polly/).

# gatsby-mdx-tts

[![Pull requests are welcome!](https://img.shields.io/badge/PRs-welcome-brightgreen)](#contribute-)
[![npm](https://img.shields.io/npm/v/gatsby-mdx-tts)](https://www.npmjs.com/package/gatsby-mdx-tts)
[![GitHub license](https://img.shields.io/github/license/flogy/gatsby-mdx-tts)](https://github.com/flogy/gatsby-mdx-tts/blob/main/LICENSE)

## Demo

> Check out the [▶️ LIVE DEMO](https://gatsby-mdx-tts.netlify.com)!

![Demo Screencast](./img/demo.gif)

Also check out the [example project repository](https://github.com/flogy/gatsby-mdx-tts-example)!

## Installation

`npm install --save gatsby-mdx-tts`

## How to use

### Prerequisites

1. To use this plugin you need an [AWS account](https://portal.aws.amazon.com/billing/signup). The text-to-speech service ([Amazon Polly](https://aws.amazon.com/de/polly/)) is free for the first 12 months as long as you stay within the free tier's monthly quota. Polly bills per character, not per word; check the Polly pricing page for the current limits.

**Attention:** If you exceed the limits or use it after your initial free tier, using this plugin will generate costs in your AWS account! Read how you can [save money by using an external cache](#save-money-using-an-external-cache-).

2. As this is a plugin for [gatsby-plugin-mdx](https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby-plugin-mdx) it will only work if you have that plugin installed and configured properly as well.

### Mandatory configurations

#### gatsby-config.js

To include the plugin, add it to the `gatsbyRemarkPlugins` section of your `gatsby-plugin-mdx` configuration. If you have multiple `gatsbyRemarkPlugins` configured, it is very important that you put the `gatsby-mdx-tts` plugin in **first position**!

Also, you need to include a couple of mandatory configurations:

```javascript
// In your gatsby-config.js
plugins: [
  {
    resolve: `gatsby-plugin-mdx`,
    options: {
      gatsbyRemarkPlugins: [
        {
          resolve: "gatsby-mdx-tts",
          options: {
            awsRegion: "us-east-1",
            defaultVoiceId: "Justin",
          },
        },
      ],
    },
  },
],
```

#### AWS credentials

The plugin requires your AWS credentials in order to generate the text-to-speech files.

**Important:** For security reasons, do not keep access keys with administrator permissions on your local machine, at least not without MFA. Better still, restrict the AWS user's permissions to `AmazonPollyReadOnlyAccess`, which is all this plugin needs.

There are various ways to provide your AWS credentials to the plugin. For example:

- [Create a shared credentials file](https://docs.aws.amazon.com/ses/latest/DeveloperGuide/create-shared-credentials-file.html) and add a profile for your AWS user that will use AWS Polly. You can either configure it as your default profile or use the `awsProfile` plugin option or `AWS_PROFILE` environment variable to pass the custom profile name to the plugin.

```javascript
// In your gatsby-config.js
{
  resolve: "gatsby-mdx-tts",
  options: {
    awsProfile: "gatsby-mdx-tts",
  },
},
```

- Use environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` to directly configure your user's access key (e.g. to build in a CI environment).
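
As a rough sketch of how both approaches can coexist, the helper below builds the plugin entry from environment variables and only sets `awsProfile` when no access key is present. The helper name `buildTtsPlugin` and the use of `AWS_REGION` as a region override are assumptions made for illustration; they are not part of the plugin.

```javascript
// Hypothetical helper (not part of gatsby-mdx-tts): builds the plugin entry
// for gatsby-config.js from the given environment variables.
function buildTtsPlugin(env) {
  const options = {
    // Assumption: fall back to the region used in the examples above.
    awsRegion: env.AWS_REGION || "us-east-1",
    defaultVoiceId: "Justin",
  };

  // In CI the access key usually comes from AWS_ACCESS_KEY_ID and
  // AWS_SECRET_ACCESS_KEY, so a named profile is only set otherwise.
  const hasAccessKey = Boolean(
    env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY
  );
  if (!hasAccessKey) {
    options.awsProfile = env.AWS_PROFILE || "gatsby-mdx-tts";
  }

  return { resolve: "gatsby-mdx-tts", options };
}

// Usage inside gatsby-config.js:
//   gatsbyRemarkPlugins: [buildTtsPlugin(process.env)]
const localEntry = buildTtsPlugin({});
const ciEntry = buildTtsPlugin({
  AWS_ACCESS_KEY_ID: "example-key-id",
  AWS_SECRET_ACCESS_KEY: "example-secret",
});
console.log(localEntry.options.awsProfile); // gatsby-mdx-tts
console.log(ciEntry.options.awsProfile); // undefined
```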

### All configurations

| Option | Required | Example |
| ------------------------------------------ | -------- | ----------------------------------------------------- |
| `awsRegion` | Yes | `"us-east-1"` |
| `defaultVoiceId` | Yes | `"Justin"` |
| `awsProfile` | No | `"gatsby-mdx-tts"` |
| `defaultSsmlTags` | No | `"$SPEECH_OUTPUT_TEXT"` |
| `defaultLexiconNames` | No | `["LexA", "LexB"]` |
| `ignoredCharactersRegex` | No | `/·/` |
| `speechOutputComponentNames` | No | `["CustomComponent"]` |
| `skipRegeneratingIfExistingInPublicFolder` | No | `true` |

##### About `defaultSsmlTags`:

- For an overview of all supported SSML tags check out the [supported SSML tags list](https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html) in the AWS docs.
- The surrounding `<speak>` tag is added automatically.
- The variable `$SPEECH_OUTPUT_TEXT` will be replaced with the speech output text.
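
To make the substitution concrete, here is a minimal sketch of what the rules above amount to. It only illustrates the described behaviour and is not the plugin's actual code; the function name `buildSsml` is hypothetical, and `<prosody>` is used merely as one example of an SSML tag that Polly supports.

```javascript
// Hypothetical illustration of how an SSML template is expanded.
// This is NOT the plugin's implementation, only the behaviour described above.
function buildSsml(ssmlTags, speechOutputText) {
  // Replace the placeholder with the text extracted from the MDX content.
  // A replacer function is used so "$" sequences in the text stay literal.
  const body = ssmlTags.replace("$SPEECH_OUTPUT_TEXT", () => speechOutputText);
  // The surrounding <speak> element is added automatically.
  return `<speak>${body}</speak>`;
}

const template = "<prosody rate='slow'>$SPEECH_OUTPUT_TEXT</prosody>";
console.log(buildSsml(template, "Hello world"));
// <speak><prosody rate='slow'>Hello world</prosody></speak>
```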

##### About `ignoredCharactersRegex`:

If your text contains special characters that should not be vocalized (e.g. `fear·ful` should simply be read as `fearful`), you can use the `ignoredCharactersRegex` option to define the characters to be ignored.

You might also want those words not to be split up during word marking. Therefore also check out [Ignore word splitting characters](#ignore-word-splitting-characters).
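
The effect of such a regex can be sketched in a few lines. The function below is a hypothetical illustration of "ignoring" characters, assuming every match is simply removed from the text before synthesis; how the plugin applies the regex internally is not documented here.

```javascript
// Hypothetical illustration: remove every match of the ignored-characters
// regex from a text (assumed behaviour, not the plugin's actual code).
function stripIgnoredCharacters(text, ignoredCharactersRegex) {
  // Add the global flag so all occurrences are removed, not only the first.
  const flags = ignoredCharactersRegex.flags.includes("g")
    ? ignoredCharactersRegex.flags
    : ignoredCharactersRegex.flags + "g";
  const globalRegex = new RegExp(ignoredCharactersRegex.source, flags);
  return text.replace(globalRegex, "");
}

console.log(stripIgnoredCharacters("A fear·ful, won·der·ful day", /·/));
// A fearful, wonderful day
```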

##### About `speechOutputComponentNames`:

If you want to use your own component to handle the generated speech output, you can specify its name using the `speechOutputComponentNames` option. The plugin will then use this component instead of `SpeechOutput` to extract the text to be used for TTS generation. You can also define multiple component names. This lets you customize the way speech output is handled. Find more information about this in the [customization chapter](#customize).

##### About `skipRegeneratingIfExistingInPublicFolder`:

Regenerating all the files on every cache invalidation can quickly get costly. To avoid regenerating files that already exist in the public folder, activate this flag. **Be aware that this will lead to outdated files if you change the MDX files!**
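
Putting the options from the table together, a fully configured plugin entry might look like the sketch below. All values are the example values from the table, not recommendations, and the variable names `ttsPlugin` and `mdxPlugin` are only used for illustration.

```javascript
// Sketch of a plugin entry that sets every option from the table above.
// The values are the table's example values, not recommendations.
const ttsPlugin = {
  resolve: "gatsby-mdx-tts",
  options: {
    awsRegion: "us-east-1", // required
    defaultVoiceId: "Justin", // required
    awsProfile: "gatsby-mdx-tts",
    defaultSsmlTags: "$SPEECH_OUTPUT_TEXT",
    defaultLexiconNames: ["LexA", "LexB"],
    ignoredCharactersRegex: /·/,
    speechOutputComponentNames: ["CustomComponent"],
    skipRegeneratingIfExistingInPublicFolder: true,
  },
};

// In gatsby-config.js this entry goes first in the gatsbyRemarkPlugins
// array of gatsby-plugin-mdx.
const mdxPlugin = {
  resolve: "gatsby-plugin-mdx",
  options: { gatsbyRemarkPlugins: [ttsPlugin] },
};
```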

### Embed speech output in your MDX

After configuring the plugin you can add the `<SpeechOutput>` component to your MDX files. The surrounded content will then be playable. You can add multiple speech output blocks to your content, but make sure the `id` is always set to a **unique value across all occurrences**. It is also important to leave an empty line between the `SpeechOutput` tags and the content, otherwise it will not work.

```markdown
import SpeechOutput from "gatsby-mdx-tts/SpeechOutput.js"

This text will be outside the speech output.

<SpeechOutput id="intro">

But this text will be playable. Please consider that:

- The play button is added automatically.
- The words in this text are marked one by one during text output.

</SpeechOutput>
```

## Customize

### Define individual speech output parameters

To define speech output parameters for individual `<SpeechOutput>` components you can pass them as props. They override any configured default parameters.

| Prop | Required | Example |
| -------------- | -------- | ----------------------------------------------------- |
| `id` | Yes | `"my-individual-speech-output"` |
| `lexiconNames` | No | `['LexA', 'LexB']` |
| `ssmlTags` | No | `"$SPEECH_OUTPUT_TEXT"` |
| `voiceId` | No | `"Hans"` |

#### Example

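```markdown
<SpeechOutput
  id="my-individual-speech-output"
  voiceId="Hans"
  lexiconNames={["LexA", "LexB"]}
  ssmlTags="$SPEECH_OUTPUT_TEXT"
>

This text will be playable with the parameters defined above.

</SpeechOutput>
```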

As you can see, the order of the props does not matter. However, it is important to pass the props with the correct types (e.g. pass `lexiconNames` as an array, not as a string).

### Play button

To customize the play button you can use the optional `SpeechOutput` component prop `customPlayButton`. Just pass in your custom play button component.

If you choose to use a custom play button component, make sure it uses the `PlayButtonProps` exported from this plugin.

### Speech output handling

You can replace the whole speech output handling by using your own React component instead of the default `SpeechOutput` component. That way, the TTS files are still generated during the build phase, but you can do whatever you want with those files inside your component at runtime. To do so, use the `speechOutputComponentNames` configuration option (see [About `speechOutputComponentNames`](#about-speechoutputcomponentnames)).

If you choose to use your own component, make sure it uses the `SpeechOutputProps` exported from this plugin.

### Custom `useSound` hook

If you would like to manage sound playback yourself, you can pass an optional hook to the `useCustomSoundHook` prop of the `SpeechOutput` component. It has to follow the `UseSoundHookSignature` type exported from `UseSound.ts` (which contains the default sound hook).

### Ignore word splitting characters

You might use characters that split a word into two parts, e.g. `fear·ful`. By default, those word parts are marked individually. To avoid this, you can define the characters to ignore with the `ignoredWordSplittingCharactersRegex` prop.

You probably also don't want this character to be vocalized during speech output. Therefore make sure you also configure the `ignoredCharactersRegex` in the [plugin options](#all-configurations).

## Event listeners

To be able to react to certain events you can register the following event listeners:

### `onWordMarked`

When a speech output is played the spoken words are highlighted in the text simultaneously. The `onWordMarked` listener is called as soon as a new word is highlighted and delivers the currently highlighted word as a string. When no word is highlighted (anymore) the string is empty.
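
A handler for this listener only needs to accept a string. The sketch below keeps track of the currently highlighted word and of all words marked so far. The factory name `createWordTracker` is hypothetical, and passing the handler to the component as an `onWordMarked` prop is an assumption based on the listener's name.

```javascript
// Hypothetical helper: creates an onWordMarked handler plus two accessors.
function createWordTracker() {
  let currentWord = "";
  const history = [];

  return {
    // Receives the highlighted word, or "" when no word is highlighted.
    onWordMarked(word) {
      currentWord = word;
      if (word !== "") {
        history.push(word);
      }
    },
    // True while a word is currently highlighted.
    hasMarkedWord: () => currentWord !== "",
    // Copy of all words that have been highlighted so far.
    markedWords: () => [...history],
  };
}

const tracker = createWordTracker();
tracker.onWordMarked("Hello");
tracker.onWordMarked("world");
tracker.onWordMarked(""); // nothing is highlighted anymore
console.log(tracker.markedWords().join(" ")); // Hello world
console.log(tracker.hasMarkedWord()); // false
```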

## Save money using an external cache 💸

Once you have exceeded the free tier, AWS bills you every time the internal Gatsby cache is cleared and your TTS files are regenerated. This can get quite expensive, especially for large projects.

To prevent this from happening too often you can use external caches. Here is a list of plugins you can use for this purpose:

- https://github.com/axe312ger/gatsby-plugin-netlify-cache
- https://github.com/axe312ger/gatsby-plugin-sftp-cache
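
As a sketch, adding one of these cache plugins could look like the following `plugins` array. It assumes the cache plugin is installed from npm as `gatsby-plugin-netlify-cache` and runs with its default options; check that plugin's README for its actual options and any ordering requirements.

```javascript
// Sketch of the plugins array in gatsby-config.js with an external cache.
const plugins = [
  {
    resolve: "gatsby-plugin-mdx",
    options: {
      gatsbyRemarkPlugins: [
        {
          resolve: "gatsby-mdx-tts",
          options: { awsRegion: "us-east-1", defaultVoiceId: "Justin" },
        },
      ],
    },
  },
  // Assumption: the cache plugin is referenced by its npm package name
  // and used with its default options.
  "gatsby-plugin-netlify-cache",
];
```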

## Contribute 🦸

Contributions are more than welcome! I would love to see text-to-speech becoming a thing in the already very accessible Gatsby ecosystem. If you agree with this and would like to join me on this mission it would be awesome to get in touch! 😊

Please feel free to create, comment and of course solve some of the issues. To get started you can also go for the easier issues marked with the `good first issue` label if you like.

## License

The [MIT License](LICENSE)

## Credits

The _gatsby-mdx-tts_ library is maintained and sponsored by the Swiss web and mobile app development company [Florian Gyger Software](https://floriangyger.ch).

If this library saved you some time and money please consider [sponsoring me](https://github.com/sponsors/flogy), so I can build more libraries for free and actively maintain them for you. Thank you 🙏

### Current sponsors

A big thank you goes to the current sponsors of this library:

Andrin Meier
