Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/AleksiKnuutila/nlp-highlight

Semantic highlighting of natural language for readability
https://github.com/AleksiKnuutila/nlp-highlight

Last synced: about 2 months ago
JSON representation

Semantic highlighting of natural language for readability

Awesome Lists containing this project

README

        

# nlp-highlight
Semantic highlighting of natural language for readability

Programmers in particular are familiar with *syntax highlighting* of programming languages, colouring text to indicate the syntactical significance of words and characters. This project is a brief experiment to see whether a similar approach could improve readability of natural language.

My hypothesis is that for such highlighting to be useful, it would have to be based on the right kind of semantic distinctions, instead of syntactical ones. Adding semantic metadata to text is a hard problem, but current NLP tools have at least an indicative level of accuracy.

The demo uses the NLP SaaS TextRazor, and particularly the classifications in its entity recognition API. To make the interpretation of the highlighting sufficiently simple, the classifications are mapped to five semantic categories: people, organisations, places, numbers and quotes.

## Usage

`scripts/getjson.js` makes a request from the TextRazor API and saves the result as JSON that is later consumed to do the highlighting. The script takes the TextRazor API key as an argument, reads the analysed text from standard input and prints the JSON. An example of usage is:

```
cat article.txt | node scripts/getjson.js api-key > assets/article-analysed.js
```

The location of the JSON data is indicated in the first lines of `app.js`.

To compile the demo, run:

```
npm install
gulp
```

## Demo

https://aleksiknuutila.github.io/nlp-highlight/