https://github.com/bbc/subtitles-generator
A node module to generate subtitles by segmenting a list of time-coded text - BBC News Labs
- Host: GitHub
- URL: https://github.com/bbc/subtitles-generator
- Owner: bbc
- Created: 2019-05-30T11:30:57.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2023-07-08T13:36:25.000Z (almost 2 years ago)
- Last Synced: 2025-03-22T06:51:17.698Z (3 months ago)
- Topics: captions, digital-paper-edit, itt, json, news-labs, newslabs, premiere, srt, stt, subtitles, transcript-editor, ttml, vtt
- Language: JavaScript
- Homepage:
- Size: 618 KB
- Stars: 49
- Watchers: 20
- Forks: 7
- Open Issues: 9
Metadata Files:
- Readme: README.md
README
# Subtitles Generator - draft
A node module to generate subtitles by segmenting a list of time-coded text.
Exports to
- [x] TTML for Premiere as `.xml`
- [x] TTML
- [x] iTT - for Apple
- [x] srt
- [x] vtt
- [x] csv
- [x] txt - pre-segmented text

It can also provide pre-segmented lines if the input is plain text.
## Setup
Clone the repository, `cd` into the folder, then run `npm install`.
## Usage
```js
const subtitlesComposer = require('./src/index.js');
// const sampleWords = // some word json
const subtitlesJson = subtitlesComposer({words: sampleWords, type: 'json'})
const ttmlPremiere = subtitlesComposer({words: sampleWords, type: 'premiere'})
const ittData = subtitlesComposer({words: sampleWords, type: 'itt'})
const ttmlData = subtitlesComposer({words: sampleWords, type: 'ttml'})
const srtData = subtitlesComposer({words: sampleWords, type: 'srt'})
const vttData = subtitlesComposer({words: sampleWords, type: 'vtt'})
```
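As a rough illustration of using the calls above in a script, the sketch below writes one file per output type. `writeAllFormats` and the `EXTENSIONS` map are hypothetical names, not part of the module. It also assumes that every non-`json` type returns a string, and only the `.xml` extension for `premiere` comes from the export list above; the other extensions are guesses from the format names.

```js
const fs = require('fs');
const path = require('path');

// Assumed mapping from the `type` option to a file extension.
// Only `premiere` -> `.xml` is stated in the README; the rest are guesses.
const EXTENSIONS = {
  premiere: 'xml',
  itt: 'itt',
  ttml: 'ttml',
  srt: 'srt',
  vtt: 'vtt'
};

// Hypothetical helper: calls `composer` once per type and writes each result
// to `<outDir>/<baseName>.<ext>`. Returns the list of written file paths.
function writeAllFormats(composer, words, outDir, baseName) {
  fs.mkdirSync(outDir, { recursive: true });

  return Object.keys(EXTENSIONS).map((type) => {
    const filePath = path.join(outDir, `${baseName}.${EXTENSIONS[type]}`);
    // Assumes the composer returns a string for these types.
    fs.writeFileSync(filePath, String(composer({ words, type })), 'utf8');
    return filePath;
  });
}
```

With the module loaded as in the snippet above, this could be called as `writeAllFormats(subtitlesComposer, sampleWords, './out', 'subtitles')`.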
See [`example-usage.js`](./example-usage.js) for a more comprehensive example.

To try locally
```
npx babel-node example-usage.js
```

### `words` Input
- either a list of word objects

Example
```js
const sampleWords = [
  {
    "id": 0,
    "start": 13.02,
    "end": 13.17,
    "text": "There"
  },
  {
    "id": 1,
    "start": 13.17,
    "end": 13.38,
    "text": "is"
  },
  {
    "id": 2,
    "start": 13.38,
    "end": 13.44,
    "text": "a"
  },
  {
    "id": 3,
    "start": 13.44,
    "end": 13.86,
    "text": "day."
  },
  // ...
];
```
- or a string of text
Example
```js
const sampleWords = "There is a day. ..."
```

If the input `words` is plain text only (and not a list of words with timecodes), then only the `pre-segment-txt` option can be used (see [`test-presegment.txt`](./example-output/test-presegment.txt) for an example).
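To make the two accepted input shapes concrete, here is a minimal sketch of how a caller might tell them apart and flatten a word list to plain text. `isTimecodedWords` and `wordsToText` are hypothetical helper names used for illustration only; the module's own checks may differ.

```js
// Hypothetical helper: true when `words` is a list of time-coded word objects
// shaped like the example above ({ start, end, text }).
function isTimecodedWords(words) {
  return Array.isArray(words) &&
    words.every((w) =>
      w !== null &&
      typeof w === 'object' &&
      typeof w.text === 'string' &&
      typeof w.start === 'number' &&
      typeof w.end === 'number');
}

// Hypothetical helper: flattens either input shape to one plain-text string.
function wordsToText(words) {
  if (typeof words === 'string') {
    return words.trim();
  }
  if (isTimecodedWords(words)) {
    return words.map((w) => w.text).join(' ');
  }
  throw new TypeError('words must be a string or a list of { start, end, text } objects');
}
```

A caller could use `isTimecodedWords(input)` to decide whether the time-coded formats are available or only pre-segmented text.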
## Output:
See the [`example-output`](./example-output) folder for examples.

## System Architecture
In pseudo code, at a high level
```
// expecting array list of words OR plain text string

// if array list of words, convert text into string

// presegment the text
using pre-segmentation algorithm to break into lines of x chars - default 35

// generate subtitles
use subtitles generators for various formats to convert pre-segmented json into subtitles

// return result
```

The segmentation algorithm was refactored from [`pietrop/subtitlesComposer`](https://github.com/pietrop/subtitlesComposer), originally by [@polizoto](https://github.com/polizoto).
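The pre-segmentation and generation steps in the pseudo code above can be sketched roughly as follows. This is a simplified greedy line breaker standing in for the actual segmentation algorithm, not a copy of it, and the srt formatter is likewise only an illustration. `segmentWords`, `toSrtTimecode` and `toSrt` are hypothetical names; only the 35 character default comes from the pseudo code.

```js
// Greedy line breaking (a simplified stand-in, not the module's algorithm):
// keep appending words to the current cue until adding one more would exceed
// `maxChars`, then start a new cue. Each cue spans from the start of its
// first word to the end of its last word.
function segmentWords(words, maxChars = 35) {
  const cues = [];
  let current = null;

  for (const word of words) {
    if (current && `${current.text} ${word.text}`.length <= maxChars) {
      current.text += ` ${word.text}`;
      current.end = word.end;
    } else {
      if (current) cues.push(current);
      current = { start: word.start, end: word.end, text: word.text };
    }
  }
  if (current) cues.push(current);
  return cues;
}

// Formats a time in seconds as an srt timecode, HH:MM:SS,mmm.
function toSrtTimecode(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, '0');
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Renders cues as srt: a 1-based index, a timecode line, then the text,
// with a blank line between cues.
function toSrt(cues) {
  return cues
    .map((cue, i) =>
      `${i + 1}\n${toSrtTimecode(cue.start)} --> ${toSrtTimecode(cue.end)}\n${cue.text}`)
    .join('\n\n') + '\n';
}
```

A greedy breaker ignores punctuation and sentence boundaries, so its line breaks will generally be less readable than those of a dedicated segmentation algorithm.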
Subtitles generation in the various formats was originally written by [`@laurian`](https://github.com/laurian) and [`@maboa`](https://github.com/maboa) as part of the BBC Subtitlelizer project.

## Development env
- npm > `6.1.0`
- [Node 10 - dubnium](https://scotch.io/tutorials/whats-new-in-node-10-dubnium)
- [Eslint](https://eslint.org/)
- [Babel](https://babeljs.io/)

The Node version is set in the node version manager file [`.nvmrc`](https://github.com/creationix/nvm#nvmrc)
## Build
```
npm run build
```

Uses [babel-cli](https://babeljs.io/docs/en/babel-cli) to transpile ES6 into the `./build` folder.
## Tests
```
npm test
```

To run tests during development
```
npm run test:watch
```

## Linting

To run the linter

```
npm run lint
```

To run and fix
```
npm run lint:fix
```

## Deployment
_coming soon, deploying to npm registry as [@bbc/subtitles-composer]()_
```
npm run publish:public
```

---
# TODO
- [ ] Open source
- [x] use import/export in modules
- [x] add babel