Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/digitallinguistics/scription2dlx
A JavaScript library that converts scription text files to the Data Format for Digital Linguistics
- Host: GitHub
- URL: https://github.com/digitallinguistics/scription2dlx
- Owner: digitallinguistics
- License: mit
- Created: 2019-03-16T00:18:50.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2024-10-11T02:19:40.000Z (2 months ago)
- Last Synced: 2024-11-16T19:19:55.385Z (about 1 month ago)
- Topics: digital-humanities, digital-linguistics, dlx, documentary-linguistics, language, language-documentation, linguistics, scription
- Language: JavaScript
- Homepage: https://developer.digitallinguistics.io/scription2dlx
- Size: 733 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# scription2dlx
[![GitHub version](https://img.shields.io/github/v/release/digitallinguistics/scription2dlx?label=version)][releases]
[![downloads](https://img.shields.io/npm/dt/@digitallinguistics/scription2dlx.svg)][npm]
[![GitHub issues](https://img.shields.io/github/issues/digitallinguistics/scription2dlx.svg)][issues]
[![tests status](https://github.com/digitallinguistics/scription2dlx/workflows/tests/badge.svg)][actions]
[![license](https://img.shields.io/github/license/digitallinguistics/scription2dlx.svg)][license]
[![DOI](https://zenodo.org/badge/175907357.svg)][Zenodo]
[![GitHub stars](https://img.shields.io/github/stars/digitallinguistics/scription2dlx.svg?style=social)][GitHub]

A JavaScript library that converts linguistic texts in [scription format][scription] to the [Data Format for Digital Linguistics (DaFoDiL)][DaFoDiL]. This library is useful for language researchers who want to work with their data in text formats that are simple to type and read ([scription][scription]), but want to convert their data for use in other [Digital Linguistics][DLx] tools.
## Quick Links
* [Report a bug or request a feature][issues]
* [View project on GitHub][GitHub]
* [View project on npm][npm]
* [Download the latest release][releases]

## Contents
* [Basic Usage](#basic-usage)
* [Notes](#notes)
* [Options](#options)

## Basic Usage
1. Install the library using npm or yarn:
```cmd
npm i @digitallinguistics/scription2dlx
yarn add @digitallinguistics/scription2dlx
```

Or download the latest release from the [releases page][releases].
1. Import the library into your project:
**Node:**
```js
import scription2dlx from '@digitallinguistics/scription2dlx';
```

**HTML:**
```html
```

1. The library exports a single function which accepts a string and returns a [DaFoDiL Text Object][Text].
**data.txt**
```
---
title: How the world began
---
waxdungu qasi
one day a man
```

**script.js**
```js
const response = await fetch(`data.txt`);
const data = await response.text(); // the converter expects a string, not a Response
const text = scription2dlx(data);

console.log(text.utterances[0].transcription); // first utterance: "waxdungu qasi"
```

You may also pass an options hash as the second argument. See the [Options](#options) section below.
```js
const text = scription2dlx(data, { /* options */ });
```

## Notes
* If your project does not support ES modules and/or the latest JavaScript syntax, you may need to transpile this library using tools like [Babel][Babel], and possibly bundle the library using a [JavaScript bundler][bundlers].
* The `scription2dlx` library does **not** perform validation on the text data. You should use another validator like [AJV][AJV] to validate your data against the [DLx DaFoDiL format][DaFoDiL].
* In order to keep this library small and dependency-free, `scription2dlx` does **not** automatically parse the YAML header of a scription document. Instead, the header string is returned as a `header` property on the text object. If you would like `scription2dlx` to parse the header, pass a YAML parser to the `parser` option when calling the `scription2dlx` function:
```js
import yaml from 'yaml'; // use your preferred YAML parsing library

const text = scription2dlx(data, { parser: yaml.parse });
```

## Options
| Option | Default | Description |
| --- | --- | --- |
| `codes` | `{}` | This option allows you to use custom backslash codes in your interlinear glosses. It should be a hash containing the scription code as a key (without a leading backslash), and the custom code as the value; ex: `"txn": "t"` will allow you to write `\t` instead of `\txn` for transcription lines. |
| `emphasis` | `true` | This option specifies whether emphasis should be passed through as-is (`true`, default), or stripped from the data (`false`). |
| `errors` | `"warn"` | This option allows you to specify how to handle errors. If set to `"warn"` (the default), an utterance which throws an error is skipped and a warning is logged to the console. If set to `"object"`, an error object with information is returned in the results array. If set to `false`, utterances with errors will be skipped silently. If set to `true`, utterances with errors will throw and stop further processing. |
| `orthography` | `"default"` | An abbreviation for the default orthography to use for transcriptions when one is not specified. |
| `parser` | `undefined` | A YAML parser to use in parsing the header of a scription document. If none is present, the header will be provided as a string in the `header` property of the returned object. |
| `utteranceMetadata` | `true` | Whether to parse the utterance metadata line (the first line when it begins with `#`). If set to `true`, a `metadata` property will be added to each utterance that has it. |

[actions]: https://github.com/digitallinguistics/scription2dlx/actions/
[AJV]: https://www.npmjs.com/package/ajv
[Babel]: https://babeljs.io/
[bundlers]: https://blog.bitsrc.io/choosing-the-right-javascript-bundler-in-2020-f9b1eae0d12b
[DaFoDiL]: https://format.digitallinguistics.io
[DLx]: https://digitallinguistics.io
[GitHub]: https://github.com/digitallinguistics/scription2dlx
[license]: https://github.com/digitallinguistics/scription2dlx/blob/master/LICENSE.md
[issues]: https://github.com/digitallinguistics/scription2dlx/issues
[npm]: https://www.npmjs.com/package/@digitallinguistics/scription2dlx
[releases]: https://github.com/digitallinguistics/scription2dlx/releases
[scription]: https://scription.digitallinguistics.io
[Text]: https://format.digitallinguistics.io/schemas/Text.html
[Zenodo]: https://zenodo.org/badge/latestdoi/175907357
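
Several of the options in the table can be combined in a single hash. The following is a minimal sketch, not part of the library: `parseFlatHeader` is a hypothetical stand-in for a YAML parser that only handles flat `key: value` headers (such as the `title` header in Basic Usage), and it assumes the `parser` option is called with the header text as a string, as `yaml.parse` would be. The option names are taken from the table; the values are only illustrative.

```js
// Hypothetical stand-in for a YAML parser: handles only flat `key: value` lines.
// Assumes it is called with the header text as a string, like `yaml.parse`.
function parseFlatHeader(header) {
  const result = {};

  for (const line of header.split(/\r?\n/u)) {
    const trimmed = line.trim();

    // Skip blank lines, comments, and `---` delimiters in case they are included.
    if (!trimmed || trimmed.startsWith('#') || trimmed === '---') continue;

    const separator = trimmed.indexOf(':');
    if (separator === -1) continue; // not a `key: value` line

    const key = trimmed.slice(0, separator).trim();
    result[key] = trimmed.slice(separator + 1).trim();
  }

  return result;
}

// Option names come from the Options table; the values are examples only.
const options = {
  codes: { txn: 't' },     // write \t instead of \txn for transcription lines
  errors: 'object',        // return error objects in the results instead of logging warnings
  parser: parseFlatHeader, // parse the header instead of returning it as a string
};

// With the library imported as shown in Basic Usage:
// const text = scription2dlx(data, options);
```

For headers with nested values, lists, or quoted strings, pass a full YAML parser instead.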