# scription2dlx

[![GitHub version](https://img.shields.io/github/v/release/digitallinguistics/scription2dlx?label=version)][releases]
[![downloads](https://img.shields.io/npm/dt/@digitallinguistics/scription2dlx.svg)][npm]
[![GitHub issues](https://img.shields.io/github/issues/digitallinguistics/scription2dlx.svg)][issues]
[![tests status](https://github.com/digitallinguistics/scription2dlx/workflows/tests/badge.svg)][actions]
[![license](https://img.shields.io/github/license/digitallinguistics/scription2dlx.svg)][license]
[![DOI](https://zenodo.org/badge/175907357.svg)][Zenodo]
[![GitHub stars](https://img.shields.io/github/stars/digitallinguistics/scription2dlx.svg?style=social)][GitHub]

A JavaScript library that converts linguistic texts in [scription format][scription] to the [Data Format for Digital Linguistics (DaFoDiL)][DaFoDiL]. This library is useful for language researchers who want to work with their data in text formats that are simple to type and read ([scription][scription]), but want to convert their data for use in other [Digital Linguistics][DLx] tools.

## Quick Links

* [Report a bug or request a feature][issues]
* [View project on GitHub][GitHub]
* [View project on npm][npm]
* [Download the latest release][releases]

## Contents

* [Basic Usage](#basic-usage)
* [Notes](#notes)
* [Options](#options)

## Basic Usage

1. Install the library using npm or yarn:

```cmd
npm i @digitallinguistics/scription2dlx
yarn add @digitallinguistics/scription2dlx
```

Or download the latest release from the [releases page][releases].

1. Import the library into your project:

**Node:**

```js
import scription2dlx from '@digitallinguistics/scription2dlx';
```

**HTML:**

```html

```

1. The library exports a single function which accepts a string and returns a [DaFoDiL Text Object][Text].

**data.txt**

```
---
title: How the world began
---
waxdungu qasi
one day a man
```

**script.js**

```js
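const response = await fetch(`data.txt`);
const data = await response.text(); // the converter expects a string, not a Response
const text = scription2dlx(data);

// `utterances` is an array, so index into it to reach a single utterance
console.log(text.utterances[0].transcription); // the first utterance's transcription: "waxdungu qasi"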
```

You may also pass an options hash as the second argument. See the [Options](#options) section below.

```js
const text = scription2dlx(data, { /* options */ });
```
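
Because the converter takes a string, it can help to wrap loading and converting in one small helper. The following is only a sketch: `loadAndConvert` is a hypothetical name, `convert` stands in for the imported `scription2dlx` function, and `fetchImpl` is an assumed parameter that lets you swap in a different loader (for example, in tests).

```js
// Hypothetical helper, not part of scription2dlx.
// `convert` is assumed to be the function exported by the library.
async function loadAndConvert(url, convert, fetchImpl = fetch, options = {}) {

  const response = await fetchImpl(url);

  // Fail early instead of converting an error page
  if (!response.ok) {
    throw new Error(`Could not load ${url} (status ${response.status})`);
  }

  // Read the body as a string before handing it to the converter
  const data = await response.text();

  return convert(data, options);

}

// Usage (assumes scription2dlx has been imported as shown above):
// const text = await loadAndConvert(`data.txt`, scription2dlx);
```

Passing the converter in as an argument keeps the helper independent of how the library was imported.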

## Notes

* If your project does not support ES modules and/or the latest JavaScript syntax, you may need to transpile this library using tools like [Babel][Babel], and possibly bundle the library using a [JavaScript bundler][bundlers].

* The `scription2dlx` library does **not** perform validation on the text data. You should use another validator like [AJV][AJV] to validate your data against the [DLx DaFoDiL format][DaFoDiL].

* In order to keep this library small and dependency-free, `scription2dlx` does **not** automatically parse the YAML header of a scription document. Instead, the header string is returned as a `header` property on the text object. If you would like `scription2dlx` to parse the header, pass a YAML parser to the `parser` option when calling the `scription2dlx` function:

```js
import yaml from 'yaml'; // use your preferred YAML parsing library

const text = scription2dlx(data, { parser: yaml.parse });
```
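
If your headers only contain flat `key: value` pairs and you would rather not add a YAML dependency, a small function might be enough for the `parser` option. This is a minimal sketch, not a YAML implementation: `parseSimpleHeader` is a hypothetical helper, it ignores nesting, lists, and quoting, and it assumes the parser receives the header as a single string (it tolerates the `---` fences whether or not they are included).

```js
// Hypothetical helper, not part of scription2dlx.
// Handles only flat `key: value` lines; use a real YAML library for anything richer.
function parseSimpleHeader(header) {

  const result = {};

  for (const line of header.split(/\r?\n/u)) {

    const trimmed = line.trim();

    // Skip blank lines, the `---` fences, and comment lines
    if (!trimmed || trimmed === `---` || trimmed.startsWith(`#`)) continue;

    // Split on the first colon only, so values may contain colons
    const separator = trimmed.indexOf(`:`);
    if (separator === -1) continue;

    const key   = trimmed.slice(0, separator).trim();
    const value = trimmed.slice(separator + 1).trim();

    if (key) result[key] = value;

  }

  return result;

}

const header = parseSimpleHeader(`---\ntitle: How the world began\n---`);
console.log(header.title); // "How the world began"

// Usage (assumes scription2dlx has been imported):
// const text = scription2dlx(data, { parser: parseSimpleHeader });
```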

## Options

| Option | Default | Description |
| ------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `codes` | `{}` | This option allows you to use custom backslash codes in your interlinear glosses. It should be a hash containing the scription code as a key (without a leading backslash), and the custom code as the value; ex: `"txn": "t"` will allow you to write `\t` instead of `\txn` for transcription lines. |
| `emphasis` | `true` | This option specifies whether emphasis should be passed through as-is (`true`, default), or stripped from the data (`false`). |
| `errors` | `"warn"` | This option allows you to specify how to handle errors. If set to `"warn"` (the default), an utterance which throws an error is skipped and a warning is logged to the console. If set to `"object"`, an error object with information is returned in the results array. If set to `false`, utterances with errors will be skipped silently. If set to `true`, utterances with errors will throw and stop further processing. |
| `orthography` | `"default"` | An abbreviation for the default orthography to use for transcriptions when one is not specified. |
| `parser` | `undefined` | A YAML parser to use in parsing the header of a scription document. If none is present, the header will be provided as a string in the `header` property of the returned object. |
| `utteranceMetadata` | `true` | Whether to parse the utterance metadata line (the first line when it begins with `#`). If set to `true`, a `metadata` property will be added to each utterance that has it. |
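
As an illustration of how these options fit together, the sketch below builds an options hash from the defaults documented in the table and overrides only what differs. `documentedDefaults` and `buildOptions` are hypothetical names introduced here, not part of the library; the library applies its own defaults, so spelling them out is only for readability.

```js
// Defaults copied from the table above (`parser` is omitted because its default is undefined).
const documentedDefaults = Object.freeze({
  codes:             {},
  emphasis:          true,
  errors:            `warn`,
  orthography:       `default`,
  utteranceMetadata: true,
});

// Hypothetical helper: merge your overrides over the documented defaults.
function buildOptions(overrides = {}) {
  return { ...documentedDefaults, ...overrides };
}

const options = buildOptions({
  codes:  { txn: `t` }, // write \t instead of \txn for transcription lines
  errors: `object`,     // return error objects in the results instead of logging warnings
});

console.log(options.errors);   // "object"
console.log(options.emphasis); // true

// Usage (assumes scription2dlx has been imported):
// const text = scription2dlx(data, options);
```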

[actions]: https://github.com/digitallinguistics/scription2dlx/actions/
[AJV]: https://www.npmjs.com/package/ajv
[Babel]: https://babeljs.io/
[bundlers]: https://blog.bitsrc.io/choosing-the-right-javascript-bundler-in-2020-f9b1eae0d12b
[DaFoDiL]: https://format.digitallinguistics.io
[DLx]: https://digitallinguistics.io
[GitHub]: https://github.com/digitallinguistics/scription2dlx
[license]: https://github.com/digitallinguistics/scription2dlx/blob/master/LICENSE.md
[issues]: https://github.com/digitallinguistics/scription2dlx/issues
[npm]: https://www.npmjs.com/package/@digitallinguistics/scription2dlx
[releases]: https://github.com/digitallinguistics/scription2dlx/releases
[scription]: https://scription.digitallinguistics.io
[Text]: https://format.digitallinguistics.io/schemas/Text.html
[Zenodo]: https://zenodo.org/badge/latestdoi/175907357