{"id":8034458,"url":"https://github.com/vidstack/captions","last_synced_at":"2025-10-05T00:26:18.789Z","repository":{"id":148790519,"uuid":"619134008","full_name":"vidstack/captions","owner":"vidstack","description":"Modern media captions parser and renderer (~5kB). Supports VTT, SRT, and SSA. Works server side, supports text streams, rollup captions via VTT regions, customization via CSS, and more.","archived":false,"fork":false,"pushed_at":"2024-07-29T02:16:53.000Z","size":774,"stargazers_count":130,"open_issues_count":3,"forks_count":15,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-09-17T08:55:31.825Z","etag":null,"topics":["captions","javascript","parser","srt","ssa","ssr","subtitles","typescript","vtt","webvtt"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vidstack.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-26T11:24:46.000Z","updated_at":"2025-09-14T02:56:14.000Z","dependencies_parsed_at":"2024-04-16T08:41:47.454Z","dependency_job_id":"d9bb7639-132e-4c25-9008-5c10a8940e55","html_url":"https://github.com/vidstack/captions","commit_stats":null,"previous_names":["vidstack/captions","vidstack/media-captions"],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/vidstack/captions","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vidstack%2Fcaptions","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vidstack%2Fcaptions/tags","releases_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vidstack%2Fcaptions/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vidstack%2Fcaptions/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vidstack","download_url":"https://codeload.github.com/vidstack/captions/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vidstack%2Fcaptions/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278395198,"owners_count":25979684,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captions","javascript","parser","srt","ssa","ssr","subtitles","typescript","vtt","webvtt"],"created_at":"2024-04-16T08:05:54.227Z","updated_at":"2025-10-05T00:26:18.758Z","avatar_url":"https://github.com/vidstack.png","language":"TypeScript","funding_links":[],"categories":["HarmonyOS","TypeScript"],"sub_categories":["Windows Manager"],"readme":"# Media Captions\n\n[![package-badge]][package]\n[![discord-badge]][discord]\n\nCaptions parsing and rendering library built for the modern web.\n\n**Features**\n\n- 🚯 0 dependencies.\n- 💪 Built with TypeScript (TS 5 bundle mode ready).\n- 🪶 5kB total + modular (parser/renderer split) + tree-shaking support.\n- 💤 Parsers are lazy loaded 
on-demand.\n- 🚄 Efficiently load and apply styles in parallel via CSS files.\n- 🗂️ Supports VTT, SRT, and SSA/ASS.\n- ⬆️ Roll-up captions via VTT regions.\n- 🧰 Modern `fetch` and `ReadableStream` APIs.\n- 📡 Chunked text and response streaming support.\n- 📝 WebVTT spec-compliant settings and rendering.\n- 🎤 Timed text-tracks for karaoke-style captions.\n- 🛠️ Supports custom captions parser and cue renderer.\n- 💥 Collision detection to avoid overlapping or out-of-bounds cues.\n- 🏗️ Fixed and in-order cue rendering (including on font or overlay size changes).\n- 🛑 Adjustable parsing error-tolerance with strict and non-strict modes.\n- 🖥️ Works in the browser and server-side (string renderer).\n- 🎨 Easy customization via CSS.\n\n➕ Planning to also add a TTML, CEA-608, and CEA-708 parser that will map to VTT and render\ncorrectly. In addition, custom font loading and text codes support is planned for SSA/ASS captions.\nWe don't have an exact date but most likely after the [Vidstack Player][vidstack-player] 1.0. If\nurgent and you're willing to sponsor, feel free to email me at rahim.alwer@gmail.com.\n\n🔗 **Quicklinks**\n\n- **[Installation](#installation)**\n- **[Demo](#demo)**\n- **[Motivation](#motivation)**\n- **[API](#api)**\n\n## Demo\n\nThe StackBlitz link below showcases a simple example of how captions are fetched, parsed, and\nrendered over a native video element. 
We plan on adding more examples for additional captions\nformats and scenarios.\n\n[![Open in StackBlitz](https://developer.stackblitz.com/img/open_in_stackblitz.svg)][stackblitz-demo]\n\n## Motivation\n\n❓ **Are native captions not good enough?**\n\nSimply put, no.\n\n- Positioning, styling, and rendering of cues are interpreted differently across browsers (i.e.,\n  not consistent).\n- Styling customization via pseudo `::cue` selector is inconsistent across browsers and severely\n  limited with respect to even basic movement and styles.\n- Cues cannot be easily or accurately moved which means they'll become hidden when custom controls\n  are active.\n- Multiple active cues are not rendered in the correct order. This can also occur randomly on font\n  and overlay size changes (e.g., entering fullscreen).\n- Failure in positioning and customizing styles correctly results in failing accessibility\n  guidelines.\n- Text tracks cannot be removed in most browsers as there's no native API to do so. Captions need\n  to be managed externally and mapped to a generic text track.\n- Karaoke-style captions are not supported out of the box in most browsers.\n- Only VTT is natively supported by browsers.\n- VTT Regions and roll-up captions are not fully supported in all browsers.\n- Custom rendering of cues is not supported.\n- Large caption files cannot be streamed and aborted when no longer required.\n- Obviously cannot be used server-side.\n\nDid you know closed-captions are governed by the Federal Communications Commission (FCC) under the\n21st Century Communications and Video Accessibility Act (CVAA)? Not providing captions and adequate\ncustomization options on the web for content that was shown on TV doesn't meet guidelines 😱 Filed\nlawsuits have dramatically increased in recent years! 
See the amazing\n[Caption Me If You Can][caption-me-talk] talk at [Demuxed][demuxed] by Dan Sparacio to learn more.\n\n❓ **What about [mozilla/vtt][mozilla-vtt]?**\n\nThe library is old, outdated, and unmaintained.\n\n- Not packaged correctly by modern standards using Node exports with ES6, TS types, CJS/ESM,\n  and server bundles.\n- Doesn't lazy load the parser.\n- Doesn't cleanly work server-side out of the box.\n- Doesn't split parser and renderer so they can be imported separately when needed.\n- Not built with TypeScript so no types are shipped.\n- Doesn't support a wide variety of features that we support, including streaming via modern APIs,\n  VTT regions, roll-up captions, custom renderers, alternative captions formats (SRT/SSA),\n  timed-text and more.\n- Inlines all styles which means they can't be loaded in parallel with JS, makes it harder to\n  customize, and slower with respect to DOM updates.\n- Doesn't include flexible error tolerance settings.\n- Doesn't expose styling attributes for selecting nodes such as cues, voice nodes, and timed-text nodes.\n\n## Installation\n\nFirst, install the NPM package:\n\n```bash\nnpm i media-captions\n```\n\nNext, include styles if you plan on rendering captions using the [`CaptionsRenderer`](#captionsrenderer):\n\n```js\nimport 'media-captions/styles/captions.css';\n// Optional - include if rendering VTT regions.\nimport 'media-captions/styles/regions.css';\n```\n\nOptionally, you can load the styles directly from a CDN using [JSDelivr](https://www.jsdelivr.com)\nlike so:\n\n```html\n\u003clink rel=\"stylesheet\" href=\"https://cdn.jsdelivr.net/npm/media-captions/styles/captions.min.css\" /\u003e\n\u003c!-- Optional - include if rendering VTT regions. 
--\u003e\n\u003clink rel=\"stylesheet\" href=\"https://cdn.jsdelivr.net/npm/media-captions/styles/regions.min.css\" /\u003e\n```\n\n## API\n\n- **Parsing**\n  - [Parse Options](#parse-options)\n  - [Parse Result](#parse-result)\n  - [Parse Errors](#parse-errors)\n  - [`parseText`](#parsetext)\n  - [`parseTextStream`](#parsetextstream)\n  - [`parseResponse`](#parseresponse)\n  - [`parseByteStream`](#parsebytestream)\n  - [`CaptionsParser`](#captionsparser)\n- **Rendering**\n  - [`createVTTCueTemplate`](#createvttcuetemplate)\n  - [`renderVTTCueString`](#rendervttcuestring)\n  - [`tokenizeVTTCue`](#tokenizevttcue)\n  - [`renderVTTTokensString`](#rendervtttokensstring)\n  - [`updateTimedVTTCueNodes`](#updatetimedvttcuenodes)\n  - [`CaptionsRenderer`](#captionsrenderer)\n  - [Styling](#styling)\n- **Formats**\n  - [VTT](#vtt)\n  - [SRT](#srt)\n  - [SSA/ASS](#ssaass)\n- [Streaming](#streaming)\n- [Types](#types)\n\n## Parse Options\n\nAll parsing functions exported from this package accept the following options:\n\n- `strict`: Whether strict mode is enabled. In strict mode parsing errors will throw and cancel\n  the parsing process.\n- `errors`: Whether errors should be collected and reported in the final\n  [parser result](#parse-result). By default, this value will be true in dev mode or if `strict`\n  mode is true. If set to true and `strict` mode is false, the `onError` callback will be invoked.\n  Do note, setting this to true will dynamically load error builders which will slightly increase\n  bundle size (~1kB).\n- `type`: The type of the captions file format so the correct parser is loaded. Options\n  include `vtt`, `srt`, `ssa`, `ass`, or a custom [`CaptionsParser`](#captionsparser) object.\n- `onHeaderMetadata`: Callback that is invoked when the metadata from the header block has been\n  parsed.\n- `onCue`: Invoked when parsing a VTT cue block has finished parsing and a `VTTCue` has\n  been created. 
Do note, regardless of which captions file format is provided a `VTTCue` will\n  be created.\n- `onRegion`: Invoked when parsing a VTT region block has finished and a `VTTRegion` has been\n  created.\n- `onError`: Invoked when a loading or parser error is encountered. Do note, this is only invoked\n  in development, if the `strict` parsing option is true, or if the `errors` parsing option is\n  true.\n\nOptions can be provided to any parsing function like so:\n\n```ts\nimport { parseText } from 'media-captions';\n\nparseText('...', {\n  strict: false,\n  type: 'vtt',\n  onCue(cue) {\n    // ...\n  },\n  onError(error) {\n    // ...\n  },\n});\n```\n\n## Parse Result\n\nAll parsing functions exported from this package return a `Promise` which will resolve a\n`ParsedCaptionsResult` object with the following properties:\n\n- `metadata`: An object containing all metadata that was parsed from the header block.\n- `regions`: An array containing `VTTRegion` objects that were parsed and created during the\n  parsing process.\n- `cues`: An array containing `VTTCue` objects that were parsed and created during the parsing\n  process.\n- `errors`: An array containing `ParseError` objects. Do note, errors will only be collected if\n  in development mode, if `strict` parsing option is set to true, or the `errors` parsing option is\n  set to true.\n\n```ts\nimport { parseText } from 'media-captions';\n\n// `ParsedCaptionsResult`\nconst { metadata, regions, cues, errors } = await parseText('...');\n\nfor (const cue of cues) {\n  // ...\n}\n```\n\n## Parse Errors\n\nBy default, parsing is error tolerant and will always try to recover. You can set strict mode\nto ensure errors are not tolerated and are instead thrown. 
The text stream and parsing process\nwill also be cancelled.\n\n```ts\nimport { parseText, type ParseError } from 'media-captions';\n\ntry {\n  // Any error will now throw and cancel parsing.\n  await parseText('...', { strict: true });\n} catch (error) {\n  // TypeScript only allows `any` or `unknown` on catch variables, so cast instead.\n  const { code, message, line } = error as ParseError;\n  console.log(code, message, line);\n}\n```\n\nA more tolerant error collection option is to set the `errors` parsing option to true. This\nwill ensure the `onError` callback is invoked and that errors are also reported in the final\nresult (this will add ~1kB to the bundle size):\n\n```ts\nimport { parseText } from 'media-captions';\n\nconst { errors } = await parseText('...', {\n  errors: true, // Not required if you only want errors in dev mode.\n  onError(error) {\n    error; // `ParseError`\n  },\n});\n\nfor (const error of errors) {\n  // ...\n}\n```\n\nThe `ParseError` contains a numeric error `code` that matches the following values:\n\n```ts\nconst ParseErrorCode = {\n  LoadFail: 0,\n  BadSignature: 1,\n  BadTimestamp: 2,\n  BadSettingValue: 3,\n  BadFormat: 4,\n  UnknownSetting: 5,\n};\n```\n\nThe `ParseErrorCode` object can be imported from the package.\n\n## `parseText`\n\nThis function accepts a text string as input to be parsed:\n\n```ts\nimport { parseText } from 'media-captions';\n\nconst { cues } = await parseText('...');\n```\n\n## `parseTextStream`\n\nThis function accepts a text stream [`ReadableStream\u003cstring\u003e`](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream) as input to be parsed:\n\n```ts\nimport { parseTextStream } from 'media-captions';\n\nconst stream = new ReadableStream\u003cstring\u003e({\n  start(controller) {\n    controller.enqueue('...');\n    controller.enqueue('...');\n    controller.enqueue('...');\n    // ...\n    controller.close();\n  },\n});\n\n// `ParsedCaptionsResult`\nconst result = await parseTextStream(stream, {\n  onCue(cue) {\n    // ...\n  },\n});\n```\n\n## `parseResponse`\n\nThe `parseResponse` function accepts a 
[`Response`](https://developer.mozilla.org/en-US/docs/Web/API/Response) or `Promise\u003cResponse\u003e` object. It can be seamlessly used with\n[`fetch`](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API) to parse a response body stream like so:\n\n```ts\nimport { ParseErrorCode, parseResponse } from 'media-captions';\n\n// `ParsedCaptionsResult`\nconst result = await parseResponse(fetch('/media/subs/english.vtt'), {\n  onCue(cue) {\n    // ...\n  },\n  onError(error) {\n    if (error.code === ParseErrorCode.LoadFail) {\n      console.log(error.message);\n    }\n  },\n});\n```\n\nThe captions type will be inferred from the `content-type` response header. You can specify\nthe captions format explicitly like so:\n\n```ts\nparseResponse(..., { type: 'vtt' });\n```\n\nThe text encoding will be inferred from the response header and forwarded to the underlying\n[`TextDecoder`](https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/TextDecoder). You\ncan specify an encoding explicitly like so:\n\n```ts\nparseResponse(..., { encoding: 'utf8' });\n```\n\n## `parseByteStream`\n\nThis function is used to parse byte streams `ReadableStream\u003cUint8Array\u003e`. It's used by the\n`parseResponse` function to parse response body streams. 
It can be used like so:\n\n```ts\nimport { parseByteStream } from 'media-captions';\n\nconst byteStream = new ReadableStream\u003cUint8Array\u003e({\n  // ...\n});\n\nconst result = await parseByteStream(byteStream, {\n  encoding: 'utf8',\n  onCue(cue) {\n    // ...\n  },\n});\n```\n\n## `CaptionsParser`\n\nYou can create a custom caption parser and provide it to the `type` option on any parse function.\nThe parser can be created and provided like so:\n\n```ts\nimport {\n  type CaptionsParser,\n  type CaptionsParserInit,\n  type ParsedCaptionsResult,\n} from 'media-captions';\n\nclass CustomCaptionsParser implements CaptionsParser {\n  /**\n   * Called when initializing the parser before the\n   * parsing process begins.\n   */\n  init(init: CaptionsParserInit): void | Promise\u003cvoid\u003e {\n    // ...\n  }\n  /**\n   * Called when a new line of text has been read and\n   * requires parsing. This includes empty lines which\n   * can be used to separate caption blocks.\n   */\n  parse(line: string, lineCount: number): void {\n    // ...\n  }\n  /**\n   * Called when parsing has been cancelled, or has\n   * naturally ended as there are no more lines of\n   * text to be parsed.\n   */\n  done(cancelled: boolean): ParsedCaptionsResult {\n    // ...\n  }\n}\n\n// Custom parser can be provided to any parse function.\nparseText('...', {\n  type: () =\u003e new CustomCaptionsParser(),\n});\n```\n\n## `createVTTCueTemplate`\n\nThis function takes a `VTTCue` and renders the cue text string into a HTML template element\nand returns a `VTTCueTemplate`. 
The template can be used to efficiently store and clone\nthe rendered cue HTML like so:\n\n```ts\nimport { createVTTCueTemplate, VTTCue } from 'media-captions';\n\nconst cue = new VTTCue(0, 10, '\u003cv Joe\u003eHello world!');\nconst template = createVTTCueTemplate(cue);\n\ntemplate.cue; // original `VTTCue`\ntemplate.content; // `DocumentFragment`\n\n// \u003cspan title=\"Joe\" data-part=\"voice\"\u003eHello world!\u003c/span\u003e\nconst cueHTML = template.content.cloneNode(true);\n```\n\n## `renderVTTCueString`\n\nThis function takes a `VTTCue` and renders the cue text string into a HTML string. This\nfunction can be used server-side to render cue content like so:\n\n```ts\nimport { renderVTTCueString, VTTCue } from 'media-captions';\n\nconst cue = new VTTCue(0, 10, '\u003cv Joe\u003eHello world!');\n\n// Output: \u003cspan title=\"Joe\" data-part=\"voice\"\u003eHello world!\u003c/span\u003e\nconst content = renderVTTCueString(cue);\n```\n\nThe second argument accepts the current playback time to add the correct `data-past` and\n`data-future` attributes to timed text (i.e., karaoke-style captions):\n\n```ts\nconst cue = new VTTCue(0, 320, 'Hello my name is \u003c5:20\u003eJoe!');\n\n// Output: Hello my name is \u003cspan data-part=\"timed\" data-time=\"80\" data-future\u003eJoe!\u003c/span\u003e\nrenderVTTCueString(cue, 310);\n\n// Output: Hello my name is \u003cspan data-part=\"timed\" data-time=\"80\" data-past\u003eJoe!\u003c/span\u003e\nrenderVTTCueString(cue, 321);\n```\n\n## `tokenizeVTTCue`\n\nThis function takes a `VTTCue` and returns a collection of VTT tokens based on the cue\ntext. 
Tokens represent the render nodes for a cue:\n\n```ts\nimport { tokenizeVTTCue, VTTCue } from 'media-captions';\n\nconst cue = new VTTCue(0, 10, '\u003cb.foo.bar\u003e\u003cv Joe\u003eHello world!');\n\nconst tokens = tokenizeVTTCue(cue);\n\n// `tokens` output:\n[\n  {\n    tagName: 'b',\n    type: 'b',\n    class: 'foo bar',\n    children: [\n      {\n        tagName: 'span',\n        type: 'v',\n        voice: 'Joe',\n        children: [{ type: 'text', data: 'Hello world!' }],\n      },\n    ],\n  },\n];\n```\n\nNodes can be a `VTTBlockNode` which can have children (i.e., class, italic, bold, underline,\nruby, ruby text, voice, lang, timestamp) or a `VTTLeafNode` (i.e., text nodes). The tokens\ncan be used for custom rendering like so:\n\n```ts\nimport type { VTTNode } from 'media-captions';\n\nfunction renderTokens(tokens: VTTNode[]) {\n  for (const token of tokens) {\n    if (token.type === 'text') {\n      // Process text nodes here...\n      token.data;\n    } else {\n      // Process block nodes here...\n      token.tagName;\n      token.class;\n      token.type === 'v' \u0026\u0026 token.voice;\n      token.type === 'lang' \u0026\u0026 token.lang;\n      token.type === 'timestamp' \u0026\u0026 token.time;\n      token.color;\n      token.bgColor;\n      // Recurse into the children of the current block node.\n      renderTokens(token.children);\n    }\n  }\n}\n```\n\nAll token types are listed below for use in TypeScript:\n\n```ts\nimport type {\n  VTTBlock,\n  VTTBlockNode,\n  VTTBlockType,\n  VTTBoldNode,\n  VTTClassNode,\n  VTTextNode,\n  VTTItalicNode,\n  VTTLangNode,\n  VTTLeafNode,\n  VTTNode,\n  VTTRubyNode,\n  VTTRubyTextNode,\n  VTTTimestampNode,\n  VTTUnderlineNode,\n  VTTVoiceNode,\n} from 'media-captions';\n```\n\n## `renderVTTTokensString`\n\nThis function takes an array of `VTToken` objects and renders them into a string:\n\n```ts\nimport { renderVTTTokensString, tokenizeVTTCue, VTTCue } from 'media-captions';\n\nconst cue = new VTTCue(0, 10, '\u003cv Joe\u003eHello world!');\nconst tokens = tokenizeVTTCue(cue);\n\n// Output: \u003cspan title=\"Joe\" 
data-part=\"voice\"\u003eHello world!\u003c/span\u003e\nconst result = renderVTTTokensString(tokens);\n```\n\n## `updateTimedVTTCueNodes`\n\nThis function accepts a root DOM node to update all timed text nodes by setting the correct\n`data-future` and `data-past` attributes.\n\n```ts\nimport { updateTimedVTTCueNodes } from 'media-captions';\n\nconst video = document.querySelector('video')!,\n  captions = document.querySelector('#captions')!;\n\nvideo.addEventListener('timeupdate', () =\u003e {\n  updateTimedVTTCueNodes(captions, video.currentTime);\n});\n```\n\nThis can be used when working with karaoke-style captions:\n\n```ts\nconst cue = new VTTCue(300, 308, '\u003c05:00\u003eTimed...\u003c05:05\u003eText!');\n\n// Timed text nodes that would be updated at 303 seconds:\n// \u003cspan data-part=\"timed\" data-time=\"300\" data-past\u003eTimed...\u003c/span\u003e\n// \u003cspan data-part=\"timed\" data-time=\"305\" data-future\u003eText!\u003c/span\u003e\n```\n\n## `CaptionsRenderer`\n\nThe captions overlay renderer is used to render captions over a video player. It follows the\n[WebVTT rendering specification](https://www.w3.org/TR/webvtt1/#rendering) on how regions\nand cues should be visually rendered. 
It includes:\n\n- Correctly aligning and positioning regions and cues.\n- Processing and applying all region and cue settings.\n- Rendering captions top-down in-order (Cue 1, Cue 2, Cue 3).\n- Rendering roll up captions in regions.\n- Collision detection to avoid overlapping cues.\n- Updating timed text nodes with `data-past` and `data-future` attributes.\n- Updating when the overlay is resized.\n- Applying SSA/ASS styles.\n- Accepts native `VTTCue` objects.\n\n\u003e **Warning**\n\u003e The [styles files](#installation) need to be included for the overlay renderer to work correctly!\n\n```html\n\u003cdiv\u003e\n  \u003cvideo src=\"...\"\u003e\u003c/video\u003e\n  \u003cdiv id=\"captions\"\u003e\u003c/div\u003e\n\u003c/div\u003e\n```\n\n```ts\nimport 'media-captions/styles/captions.css';\nimport 'media-captions/styles/regions.css';\n\nimport { CaptionsRenderer, parseResponse } from 'media-captions';\n\nconst video = document.querySelector('video')!,\n  captions = document.querySelector('#captions')!,\n  renderer = new CaptionsRenderer(captions);\n\nparseResponse(fetch('/media/subs/english.vtt')).then((result) =\u003e {\n  renderer.changeTrack(result);\n});\n\nvideo.addEventListener('timeupdate', () =\u003e {\n  renderer.currentTime = video.currentTime;\n});\n```\n\n**Props**\n\n- `dir`: Sets the text direction (i.e., `ltr` or `rtl`).\n- `currentTime`: Updates the current playback time and schedules a re-render.\n\n**Methods**\n\n- `changeTrack(track: CaptionsRendererTrack)`: Resets the renderer and prepares new regions and cues.\n- `addCue(cue: VTTCue)`: Add a new cue to the renderer.\n- `removeCue(cue: VTTCue)`: Remove a cue from the renderer.\n- `update(forceUpdate: boolean)`: Schedules a re-render to happen.\n- `reset()`: Reset the renderer and clear all internal state including region and cue DOM nodes.\n- `destroy()`: Reset the renderer and destroy internal observers and event listeners.\n\n## Styling\n\nCaptions rendered with the 
[`CaptionsRenderer`](#captionsrenderer) can be\neasily customized with CSS. Here are all the parts you can select and customize:\n\n```css\n/* `#captions` assumes you set the id on the captions overlay element. */\n#captions {\n  /* simple CSS vars customization (defaults below) */\n  --overlay-padding: 1%;\n  --cue-color: white;\n  --cue-bg-color: rgba(0, 0, 0, 0.8);\n  --cue-font-size: calc(var(--overlay-height) / 100 * 5);\n  --cue-line-height: calc(var(--cue-font-size) * 1.2);\n  --cue-padding-x: calc(var(--cue-font-size) * 0.6);\n  --cue-padding-y: calc(var(--cue-font-size) * 0.4);\n}\n\n#captions [data-part='region'] {\n}\n\n#captions [data-part='region'][data-active] {\n}\n\n#captions [data-part='region'][data-scroll='up'] {\n}\n\n#captions [data-part='cue-display'] {\n}\n\n#captions [data-part='cue'] {\n}\n\n#captions [data-part='cue'][data-id='...'] {\n}\n\n#captions [data-part='voice'] {\n}\n\n#captions [data-part='voice'][title='Joe'] {\n}\n\n#captions [data-part='timed'] {\n}\n\n#captions [data-part='timed'][data-past] {\n}\n\n#captions [data-part='timed'][data-future] {\n}\n```\n\n## VTT\n\nWeb Video Text Tracks (WebVTT) is the captions format natively supported\nby browsers. 
You can learn more about it on\n[MDN](https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API) or by reading the\n[W3 specification](https://www.w3.org/TR/webvtt1).\n\nWebVTT is a plain-text file that looks something like this:\n\n```text\nWEBVTT\nKind: Language\nLanguage: en-US\n\nREGION id:foo width:100 lines:3 viewportanchor:0%,0% regionanchor:0%,0% scroll:up\n\n1\n00:00 --\u003e 00:02 region:foo\nHello, Joe!\n\n2\n00:02 --\u003e 00:04 region:foo\nHello, Jane!\n```\n\n```ts\nparseResponse(fetch('/subs/english.vtt'), { type: 'vtt' });\n```\n\n\u003e **Warning**\n\u003e The parser will throw in strict parsing mode if the WEBVTT header line is not present.\n\n### VTT Regions\n\nWebVTT supports regions for bounding/positioning cues and implementing roll up captions\nby setting `scroll:up`.\n\n\u003cimg \n  src=\"./assets/vtt-regions.png\" \n  width=\"400px\" \n  alt=\"Visual explanation of VTT regions\" \n/\u003e\n\n\u003cimg \n  src=\"./assets/vtt-region-scroll.png\" \n  width=\"400px\" \n  alt=\"Visual explanation of VTT region scroll up setting for roll up captions\" \n/\u003e\n\n### VTT Cues\n\nWebVTT cues are used for positioning and displaying text. They can snap to lines or be\nfreely positioned as a percentage of the viewport.\n\n```ts\nconst cue = new VTTCue(0, 10, '...');\n\n// Position at line 5 in the video.\n// Lines are calculated using cue line height.\ncue.line = 5;\n\n// 50% from the top and 10% from the left of the video.\ncue.snapToLines = false;\ncue.line = 50;\ncue.position = 10;\n\n// Align cue horizontally at end of line.\ncue.align = 'end';\n// Align top of the cue at the bottom of the line.\ncue.lineAlign = 'end';\n```\n\n\u003cimg \n  src=\"./assets/vtt-cues.png\" \n  width=\"400px\" \n  alt=\"Visual explanation of VTT cues\" \n/\u003e\n\n## SRT\n\nSubRip Subtitle (SRT) is a simple captions format that only contains cues. 
There are no\nregions or positioning settings as found in [VTT](#vtt).\n\nSRT is a plain-text file that looks like this:\n\n```text\n00:00 --\u003e 00:02,200\nHello, Joe!\n\n00:02,200 --\u003e 00:04,400\nHello, Jane!\n```\n\n```ts\nparseResponse(fetch('/subs/english.srt'), { type: 'srt' });\n```\n\nNote that SRT timestamps use a comma `,` to separate the milliseconds unit unlike VTT which uses\na dot `.`.\n\n## SSA/ASS\n\nSubStation Alpha (SSA) and its successor Advanced SubStation Alpha (ASS) are subtitle formats\ncommonly used for anime content. They allow for rich text formatting, including\ncolor, font size, bold, italic, and underline, as well as more advanced features like karaoke and\ntypesetting.\n\nSSA/ASS is a plain-text file that looks like this:\n\n```text\n[Styles]\nFormat: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding\nStyle: Default,Arial,36,\u0026H00FFFFFF,\u0026H000000FF,\u0026H00000000,\u0026H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1\n\n[Events]\nFormat: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\nDialogue: 0,0:00:05.10,0:00:07.20,Default,,0,0,0,,Hello, world!\n\n[Other Events]\nFormat: Start, End, Text\nDialogue: 0:00:04,\\t0:00:07.20, One!\nDialogue: 0:00:05,\\t0:00:08.20, Two!\nDialogue: 0:00:06,\\t0:00:09.20, Three!\nContinue dialogue on a new line.\n```\n\n```ts\nparseResponse(fetch('/subs/english.ssa'), { type: 'ssa' });\n```\n\nThe following features are supported:\n\n- Multiple styles blocks and all format fields (e.g., PrimaryColour, Bold, ScaleX, etc.).\n- Multiple events blocks and associating them with styles.\n\nThe following features are not supported yet:\n\n- Layers\n- Movie\n- Picture\n- Sound\n- Command\n- Font Loading\n- Text Codes (stripped out for now)\n\nIt is very likely we will implement custom font loading, 
layers, and text codes in the\nnear future. The rest is unlikely for now. You can always try and implement custom transitions\nor animations using CSS (see [Styling](#styling)).\n\nWe recommend using [SubtitlesOctopus](https://github.com/libass/JavascriptSubtitlesOctopus) for\nSSA/ASS captions as it supports most features and is a performant WASM wrapper of\n[libass](https://github.com/libass/libass). You'll need to fall back to this implementation on\niOS Safari (iPhone) as custom captions are not supported there.\n\n## Streaming\n\nYou can split large captions files into chunks and use the [`parseTextStream`](#parsetextstream)\nor [`parseResponse`](#parseresponse) functions to read and parse the stream. Files can be chunked\nhowever you like and don't need to be aligned with line breaks.\n\nHere's an example that chunks and streams a large VTT file on the server:\n\n```ts\nimport fs from 'node:fs';\n\nasync function handle() {\n  const stream = new ReadableStream({\n    start(controller) {\n      const file = fs.createReadStream('english.vtt');\n      file.on('readable', () =\u003e {\n        let chunk;\n        // `read()` returns `null` once the internal buffer is drained.\n        while ((chunk = file.read()) !== null) {\n          // Chunks are `Buffer`s (a `Uint8Array` subclass), so they can be enqueued as-is.\n          controller.enqueue(chunk);\n        }\n      });\n      file.on('end', () =\u003e {\n        controller.close();\n      });\n    },\n  });\n\n  return new Response(stream, {\n    headers: {\n      'Content-Type': 'text/vtt; charset=utf-8',\n    },\n  });\n}\n```\n\n## Types\n\nHere are the types that are available from this package for use in TypeScript:\n\n```ts\nimport type {\n  CaptionsFileFormat,\n  CaptionsParser,\n  CaptionsParserInit,\n  CaptionsRenderer,\n  CaptionsRendererTrack,\n  ParseByteStreamOptions,\n  ParseCaptionsOptions,\n  ParsedCaptionsResult,\n  ParseError,\n  ParseErrorCode,\n  ParseErrorInit,\n  TextCue,\n  VTTCue,\n  VTTCueTemplate,\n  VTTHeaderMetadata,\n  VTTRegion,\n} from 'media-captions';\n```\n\n## 📝 License\n\nMedia Captions is [MIT licensed](./LICENSE).\n\n[package]: 
https://www.npmjs.com/package/media-captions\n[package-badge]: https://img.shields.io/npm/v/media-captions/next?style=flat-square\n[discord]: https://discord.com/invite/7RGU7wvsu9\n[discord-badge]: https://img.shields.io/discord/742612686679965696?color=%235865F2\u0026label=%20\u0026logo=discord\u0026logoColor=white\u0026style=flat-square\n[stackblitz-demo]: https://stackblitz.com/edit/media-captions?embed=1\u0026file=src/main.ts\u0026hideNavigation=1\u0026showSidebar=1\n[demuxed]: https://demuxed.com\n[caption-me-talk]: https://www.youtube.com/watch?v=Z0HqYQqdErE\n[mozilla-vtt]: https://github.com/mozilla/vtt.js\n[vidstack-player]: https://github.com/vidstack/player\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvidstack%2Fcaptions","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvidstack%2Fcaptions","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvidstack%2Fcaptions/lists"}