# react-dictate-button
[![npm version](https://badge.fury.io/js/react-dictate-button.svg)](https://badge.fury.io/js/react-dictate-button) [![Build Status](https://travis-ci.org/compulim/react-dictate-button.svg)](https://travis-ci.org/compulim/react-dictate-button)
A button to start speech recognition using the [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API), with an easy-to-understand event lifecycle.
# Breaking changes
## [2.0.0] - 2021-05-15
- Requires [`react@>=16.8.0`](https://npmjs.com/package/react) and [`core-js@3`](https://npmjs.com/package/core-js)
- Modifying props while recognition is ongoing will no longer abort it immediately; updated props take effect on the next recognition
- `SpeechGrammarList` is only constructed when the `grammar` prop is present
- If the `speechRecognition` prop is not present, capability detection is now done through `window.navigator.mediaDevices.getUserMedia`

# Demo
Try out this component at [compulim.github.io/react-dictate-button](https://compulim.github.io/react-dictate-button/).
# Background
Reasons why we built our own component instead of using [existing packages](https://www.npmjs.com/search?q=react%20speech) on NPM:
- Most browsers require speech recognition (or WebRTC) to be triggered by a user event, such as a button click
- Bring your own engine for [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API)
- Enable speech recognition on unsupported browsers by bridging it with a [cloud-based service](https://npmjs.com/package/web-speech-cognitive-services)
- Support grammar list thru [JSpeech Grammar Format](https://www.w3.org/TR/jsgf/)
- Ability to interrupt recognition
- Ability to [morph into other elements](#customization-thru-morphing)

# How to use
First, install our production version with `npm install react-dictate-button`, or our development version with `npm install react-dictate-button@master`.
```jsx
import { DictateButton } from 'react-dictate-button';

export default () => (
  <DictateButton className="my-dictate-button" lang="en-US">
    Start/stop
  </DictateButton>
);
```

## Props
| Name | Type | Default | Description |
| ------------------- | ------------------------ | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `className` | `string` | `undefined` | Class name to apply to the button |
| `continuous` | `boolean` | `false` | `true` to set the Web Speech API to continuous mode, in which recognition continues until explicitly stopped; otherwise, `false` |
| `disabled` | `boolean` | `false` | `true` to abort ongoing recognition and disable the button, otherwise, `false` |
| `extra` | `{ [key: string]: any }` | `{}` | Additional properties to set to [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) before `start`, useful when bringing your own [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |
| `grammar` | `string` | `undefined` | Grammar list in [JSGF format](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList/addFromString) |
| `lang` | `string` | `undefined` | Language to recognize, for example, `'en-US'` or [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorLanguage/language) |
| `speechGrammarList` | `any` | `window.SpeechGrammarList` (or vendor-prefixed) | Bring your own [`SpeechGrammarList`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList) |
| `speechRecognition` | `any` | `window.SpeechRecognition` (or vendor-prefixed) | Bring your own [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |

> Note: changes to `extra`, `grammar`, `lang`, `speechGrammarList`, and `speechRecognition` will not take effect until the next speech recognition is started.
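For instance, here is a minimal sketch (the `extra` value and the grammar are illustrative, not from the original README) that sets the standard `maxAlternatives` property through `extra` and passes a JSGF grammar:

```jsx
import { DictateButton } from 'react-dictate-button';

export default () => (
  <DictateButton
    // `extra` properties are assigned to SpeechRecognition before start().
    extra={{ maxAlternatives: 3 }}
    // Grammar list in JSGF format, as described in the props table above.
    grammar="#JSGF V1.0; grammar colors; public <color> = red | green | blue;"
    lang="en-US"
  >
    Start/stop
  </DictateButton>
);
```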
## Events

| Name | Signature | Description |
| ------------ | --------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- |
| `onClick` | `(event: MouseEvent) => void` | Emit when the user clicks the button; calling `preventDefault` will stop recognition from starting |
| `onDictate` | `({ result: { confidence: number, transcript: string }, type: 'dictate' }) => void` | Emit when recognition is completed |
| `onError` | `(event: SpeechRecognitionErrorEvent) => void` | Emit when an error has occurred or recognition is interrupted, see below |
| `onProgress` | `({ abortable: boolean, results: [{ confidence: number, transcript: string }], type: 'progress' }) => void` | Emit for interim results; the array contains every segment of recognized text |
| `onRawEvent` | `(event: SpeechRecognitionEvent) => void` | Emit for handling raw events from [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |
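As a quick illustration, the event props can be wired up like this (a sketch, not from the original README; the handler bodies are illustrative):

```jsx
import { DictateButton } from 'react-dictate-button';

export default () => (
  <DictateButton
    // `result` is absent when nothing was recognized.
    onDictate={({ result }) => console.log(result ? result.transcript : '(nothing recognized)')}
    onError={event => console.error(event.error)}
    // Interim results: one entry per recognized segment of text.
    onProgress={({ results = [] }) => console.log(results.map(({ transcript }) => transcript).join(' '))}
  >
    Start/stop
  </DictateButton>
);
```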
## Hooks
> Although previous versions exported a React Context, it is recommended to use the hooks interface.
| Name | Signature | Description |
| --------------- | ----------- | --------------------------------------------------------------------------------------------------- |
| `useAbortable` | `[boolean]` | If the ongoing speech recognition has an `abort()` function and can be aborted, `true`, otherwise, `false` |
| `useReadyState` | `[number]` | Returns the current state of recognition, refer to [this section](#function-as-a-child) |
| `useSupported` | `[boolean]` | If speech recognition is supported, `true`, otherwise, `false` |
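For example, a minimal sketch (assuming the hooks are exported from the package root; note they must be called from a component rendered inside `<DictateButton>` so its context is available):

```jsx
import { DictateButton, useSupported } from 'react-dictate-button';

// Rendered as a child of <DictateButton> so the hook can read its context.
const Caption = () => {
  const [supported] = useSupported();

  return supported ? 'Start/stop' : 'Not supported';
};

export default () => (
  <DictateButton>
    <Caption />
  </DictateButton>
);
```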
### Checks if speech recognition is supported

To determine whether speech recognition is supported in the browser:

- If the `speechRecognition` prop is `undefined`:
  - If both [`window.navigator.mediaDevices`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices) and [`window.navigator.mediaDevices.getUserMedia`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia) are falsy, it is not supported
    - Probably the browser is not on a secure HTTP connection
  - If both `window.SpeechRecognition` and its vendor-prefixed counterparts are falsy, it is not supported
  - If recognition failed once with the `not-allowed` error code, it is not supported
- Otherwise, it is supported

> Even when the browser is on an insecure HTTP connection, `window.SpeechRecognition` (or its vendor-prefixed counterpart) will continue to be truthy. Instead, `mediaDevices.getUserMedia` is used for capability detection.
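In code, the check looks roughly like this (a sketch of the logic above, not the library's actual source; the `not-allowed` bookkeeping is omitted):

```jsx
function isSpeechRecognitionSupported(speechRecognition) {
  // A bring-your-own engine is assumed to be supported.
  if (speechRecognition) {
    return true;
  }

  const { mediaDevices } = window.navigator;

  // On insecure connections, SpeechRecognition stays truthy but getUserMedia is unavailable.
  if (!mediaDevices || !mediaDevices.getUserMedia) {
    return false;
  }

  return !!(window.SpeechRecognition || window.webkitSpeechRecognition);
}
```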
### Event lifecycle
One of the design aspects is to make sure events are easy to understand and deterministic. The first rule of thumb is to make sure `onProgress` will lead to either `onDictate` or `onError`. Here are some samples of event firing sequences (tested on Chrome 67):
- Happy path: speech is recognized
1. `onStart`
1. `onProgress({})` (just started, therefore, no `results`)
1. `onProgress({ results: [] })`
1. `onDictate({ result: ... })`
1. `onEnd`
- Happy path: speech is recognized with continuous mode
1. `onStart`
1. `onProgress({})` (just started, therefore, no `results`)
1. `onProgress({ results: [] })`
1. `onDictate({ result: ... })`
1. `onProgress({ results: [] })`
1. `onDictate({ result: ... })`
1. `onEnd`
- Heard some sound, but nothing can be recognized
1. `onStart`
1. `onProgress({})`
1. `onDictate({})` (nothing is recognized, therefore, no `result`)
1. `onEnd`
- Nothing is heard (audio device available but muted)
1. `onStart`
1. `onProgress({})`
1. `onError({ error: 'no-speech' })`
1. `onEnd`
- Recognition aborted
1. `onStart`
1. `onProgress({})`
1. `onProgress({ results: [] })`
   1. While speech is getting recognized, set `props.disabled` to `true` to abort recognition (see the sketch after this list)
1. `onError({ error: 'aborted' })`
1. `onEnd`
- Not authorized to use speech recognition, or no audio device is available
1. `onStart`
1. `onError({ error: 'not-allowed' })`
   1. `onEnd`
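The abort path above can be exercised like this (a sketch, not from the original README; component names are illustrative):

```jsx
import { useState } from 'react';
import { DictateButton } from 'react-dictate-button';

export default () => {
  const [disabled, setDisabled] = useState(false);

  return (
    <div>
      <DictateButton disabled={disabled} onError={({ error }) => console.log(error)}>
        Start/stop
      </DictateButton>
      {/* Disabling while recognizing aborts: onError({ error: 'aborted' }), then onEnd. */}
      <button onClick={() => setDisabled(true)}>Abort</button>
    </div>
  );
};
```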
## Function as a child

Instead of passing child elements, you can pass a function to render different content based on the ready state. This is called [function as a child](https://reactjs.org/docs/render-props.html#using-props-other-than-render).
| Ready state | Description |
| ----------- | -------------------------------------------------------------------------- |
| `0` | Not started |
| `1` | Starting recognition engine; recognition is not ready until it turns to `2` |
| `2` | Recognizing |
| `3` | Stopping |

For example,
```jsx
<DictateButton>
  {({ readyState }) =>
    readyState === 0 ? 'Start' : readyState === 1 ? 'Starting...' : readyState === 2 ? 'Listening...' : 'Stopping...'
  }
</DictateButton>
```
# Customization thru morphing
You can build your own component by copying our layout code, without messing with the [logic code behind the scenes](packages/component/src/Composer.js). For details, please refer to [`DictateButton.js`](packages/component/src/DictateButton.js), [`DictateCheckbox.js`](packages/component/src/DictateCheckbox.js), and [`DictationTextBox.js`](packages/pages/src/DictationTextBox.js).
## Checkbox version
In addition to `<DictateButton>`, we also ship `<DictateCheckbox>` out of the box. The checkbox version is better suited for toggle button scenarios and web accessibility. You can use the following code for the checkbox version.
```jsx
import { DictateCheckbox } from 'react-dictate-button';

export default () => (
  <DictateCheckbox className="my-dictate-checkbox" lang="en-US">
    Start/stop
  </DictateCheckbox>
);
```

## Text box with dictate button
We also provide a "text box with dictate button" version. Instead of shipping a full-fledged control, we keep it minimally styled so you can copy the code and customize it in your own project. The sample code can be found at [DictationTextBox.js](packages/pages/src/DictationTextBox.js).
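As a starting point, such a control can combine `<DictateButton>` with a plain text box (a sketch using only the documented props, not the linked `DictationTextBox.js` itself):

```jsx
import { useState } from 'react';
import { DictateButton } from 'react-dictate-button';

export default () => {
  const [value, setValue] = useState('');

  return (
    <div>
      <input onChange={({ target: { value } }) => setValue(value)} type="text" value={value} />
      {/* Overwrite the text box with the final transcript, if any. */}
      <DictateButton onDictate={({ result }) => result && setValue(result.transcript)}>
        Dictate
      </DictateButton>
    </div>
  );
};
```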
# Design considerations
- Hide the complexity of Web Speech events because we only want to focus on recognition experience
- Complexity in lifecycle events: `onstart`, `onaudiostart`, `onsoundstart`, `onspeechstart`
  - `onresult` may not fire in some cases; `onnomatch` is not fired in Chrome
- To reduce complexity, we want to make sure events fire in one of two patterns:
- Happy path: `onProgress`, then either `onDictate` or `onError`
- Otherwise: `onError`
- "Web Speech" could means speech synthesis, which is out of scope for this package
- "Speech Recognition" could means we will expose Web Speech API as-is, which we want to hide details and make it straightforward for recognition scenario# Roadmap
Please feel free to [file](https://github.com/compulim/react-dictate-button/issues) suggestions.
- While `readyState` is `1` or `3` (transitioning), the underlying speech engine cannot be started or stopped until the state transition completes
- The state management needs rework
- Instead of putting all logic inside [`Composer.js`](packages/component/src/Composer.js), how about:
1. Write an adapter to convert `SpeechRecognition` into another object with simpler event model and `readyState`
2. Rewrite `Composer.js` to bridge the new `SimpleSpeechRecognition` model and React Context
3. Expose `SimpleSpeechRecognition` so people not on React can still benefit from the simpler event model

# Contributions
Like us? [Star](https://github.com/compulim/react-dictate-button/stargazers) us.
Want to make it better? [File](https://github.com/compulim/react-dictate-button/issues) us an issue.
Don't like something you see? [Submit](https://github.com/compulim/react-dictate-button/pulls) a pull request.