https://github.com/bbc/subtitles-generator

A node module to generate subtitles by segmenting a list of time-coded text - BBC News Labs

captions digital-paper-edit itt json news-labs newslabs premiere srt stt subtitles transcript-editor ttml vtt

Host: GitHub
URL: https://github.com/bbc/subtitles-generator
Owner: bbc
Created: 2019-05-30T11:30:57.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2023-07-08T13:36:25.000Z (almost 2 years ago)
Last Synced: 2025-03-22T06:51:17.698Z (3 months ago)
Topics: captions, digital-paper-edit, itt, json, news-labs, newslabs, premiere, srt, stt, subtitles, transcript-editor, ttml, vtt
Language: JavaScript
Homepage:
Size: 618 KB
Stars: 49
Watchers: 20
Forks: 7
Open Issues: 9
Metadata Files:
- Readme: README.md

# Subtitles Generator - draft

A node module to generate subtitles by segmenting a list of time-coded text.

Exports to:

- [x] TTML for Premiere as `.xml`
- [x] TTML
- [x] iTT - for Apple
- [x] srt
- [x] vtt
- [x] csv
- [x] txt - pre-segmented text

It can also provide pre-segmented lines if the input is plain text.

## Setup

Clone the repository, `cd` into the folder, and run `npm install`.

## Usage

```js
const subtitlesComposer = require('./src/index.js');

// const sampleWords = // some word json (see the `words` Input section)

// Each call returns the subtitles in the requested format.
const subtitlesJson = subtitlesComposer({ words: sampleWords, type: 'json' });
const ttmlPremiere = subtitlesComposer({ words: sampleWords, type: 'premiere' });
const ittData = subtitlesComposer({ words: sampleWords, type: 'itt' });
const ttmlData = subtitlesComposer({ words: sampleWords, type: 'ttml' });
const srtData = subtitlesComposer({ words: sampleWords, type: 'srt' });
const vttData = subtitlesComposer({ words: sampleWords, type: 'vtt' });
```

See [`example-usage.js`](./example-usage.js) for a more comprehensive example.

To try it locally:

```

npx babel-node example-usage.js

```

### `words` Input 

- either an array of word objects, for example:

```js
const sampleWords = [
  {
    "id": 0,
    "start": 13.02,
    "end": 13.17,
    "text": "There"
  },
  {
    "id": 1,
    "start": 13.17,
    "end": 13.38,
    "text": "is"
  },
  {
    "id": 2,
    "start": 13.38,
    "end": 13.44,
    "text": "a"
  },
  {
    "id": 3,
    "start": 13.44,
    "end": 13.86,
    "text": "day."
  },
  // ...
];
```

- or a string of text, for example:

```js

const sampleWords = "There is a day. ..."

```

If the `words` input is plain text only (and not a list of words with timecodes), then only the `pre-segment-txt` option can be used (see [`test-presegment.txt`](./example-output/test-presegment.txt) for an example).
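The distinction between the two input shapes can be sketched as a small check. This is an illustrative sketch only: `describeWordsInput` is a hypothetical helper, not part of this module, and the returned `kind` labels are made up for the example.

```js
// Hypothetical helper (not part of this module): classify a `words` input
// as plain text or as a list of time-coded word objects.
function describeWordsInput(words) {
  if (typeof words === 'string') {
    // Plain text carries no timecodes, so only pre-segmentation applies.
    return { kind: 'plain-text', text: words };
  }
  if (Array.isArray(words)) {
    const allTimed = words.every(
      (w) =>
        typeof w.start === 'number' &&
        typeof w.end === 'number' &&
        typeof w.text === 'string'
    );
    if (!allTimed) {
      throw new TypeError('every word needs numeric start/end and a text string');
    }
    // Join the word text into one string, keeping the original word order.
    return { kind: 'timed-words', text: words.map((w) => w.text).join(' ') };
  }
  throw new TypeError('words must be an array of word objects or a string');
}
```

A caller could use the returned `kind` to decide whether timed formats such as srt or vtt are available, or only pre-segmented text.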

## Output

See the [`example-output`](./example-output) folder for examples.
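To give a feel for what a timed output looks like, here is a minimal sketch that renders segmented lines as SRT cues. It is not the module's own generator: `toSrtTimecode` and `toSrt` are hypothetical names, and the `{ start, end, text }` line shape (times in seconds) is an assumption based on the word objects shown above.

```js
// Hypothetical helpers (not this module's implementation).
// Convert seconds to the SRT timecode form HH:MM:SS,mmm.
function toSrtTimecode(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render an array of { start, end, text } lines as SRT cues:
// a 1-based index, a timing line, the text, and a blank separator line.
function toSrt(lines) {
  return lines
    .map(
      (line, i) =>
        `${i + 1}\n${toSrtTimecode(line.start)} --> ${toSrtTimecode(line.end)}\n${line.text}\n`
    )
    .join('\n');
}
```

For example, `toSrt([{ start: 13.02, end: 13.86, text: 'There is a day.' }])` produces a single cue numbered `1` with the timing line `00:00:13,020 --> 00:00:13,860`.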

## System Architecture

In pseudocode, at a high level:

```
// expects an array of word objects OR a plain text string
  // if given an array of words, convert the word text into a string
  // pre-segment the text
     using the pre-segmentation algorithm to break it into lines of x characters - default 35
// generate subtitles
   use the subtitles generators for the various formats to convert the pre-segmented JSON into subtitles
// return the result
```

The segmentation algorithm was refactored from [`pietrop/subtitlesComposer`](https://github.com/pietrop/subtitlesComposer), originally by [@polizoto](https://github.com/polizoto).

Subtitles generation in the various formats was originally written by [`@laurian`](https://github.com/laurian) and [`@maboa`](https://github.com/maboa) as part of the BBC Subtitlelizer project.
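As a rough illustration of the pre-segmentation step in the pseudocode, the sketch below groups timed words into lines of at most 35 characters with a simple greedy wrap. It is a simplified stand-in, not the algorithm credited above: `segmentWords`, its `maxChars` parameter, and the `{ start, end, text }` line shape are all assumptions made for this example.

```js
// Hypothetical greedy line breaker (not the module's actual algorithm).
// Groups time-coded words into lines of at most `maxChars` characters,
// keeping the start time of the first word and the end time of the last.
function segmentWords(words, maxChars = 35) {
  const lines = [];
  let current = null;

  for (const word of words) {
    if (current === null) {
      // First word of a new line.
      current = { start: word.start, end: word.end, text: word.text };
      continue;
    }
    const candidate = `${current.text} ${word.text}`;
    if (candidate.length > maxChars) {
      // The word does not fit: close the current line and start a new one.
      lines.push(current);
      current = { start: word.start, end: word.end, text: word.text };
    } else {
      // The word fits: extend the line and move its end time forward.
      current.text = candidate;
      current.end = word.end;
    }
  }

  if (current !== null) {
    lines.push(current);
  }
  return lines;
}
```

With the four sample words from the `words` Input section and the default limit, this returns a single line, "There is a day.", spanning 13.02 to 13.86 seconds. A single word longer than `maxChars` is kept whole on its own line rather than split.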

## Development env

 

- npm > `6.1.0`

- [Node 10 - dubnium](https://scotch.io/tutorials/whats-new-in-node-10-dubnium)

- [Eslint](https://eslint.org/)

- [Babel](https://babeljs.io/)

The Node version is set in the node version manager file [`.nvmrc`](https://github.com/creationix/nvm#nvmrc).

## Build

```

npm run build

```

Uses [babel-cli](https://babeljs.io/docs/en/babel-cli) to transpile ES6 into the `./build` folder.

## Tests

```

npm test

```

To run tests during development

```

npm run test:watch

```

## Linting

To run linter

```

npm run lint

```

To run and fix

```

npm run lint:fix

```

## Deployment

_coming soon, deploying to npm registry as [@bbc/subtitles-composer]()_

```

npm run publish:public

```

---

# TODO

- [ ] Open source

- [x] use import/export in modules 

- [x] add babel
