https://github.com/alordash/parse-word-to-number
Extracts numbers written as words from string.
english-language natural-language-processing numbers parsing russian-language
- Host: GitHub
- URL: https://github.com/alordash/parse-word-to-number
- Owner: alordash
- License: mit
- Created: 2020-08-09T11:14:59.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-03-11T17:26:37.000Z (over 2 years ago)
- Last Synced: 2024-10-11T19:45:14.585Z (about 1 month ago)
- Topics: english-language, natural-language-processing, numbers, parsing, russian-language
- Language: JavaScript
- Homepage: https://www.npmjs.com/package/@alordash/parse-word-to-number
- Size: 82 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# @alordash/parse-word-to-number
```
$ npm i @alordash/parse-word-to-number
```

# Description
Parses a string and returns the numbers written as words inside it.
It uses my implementation of the [Damerau-Levenshtein algorithm](https://github.com/alordash/damerau-levenshtein) to parse words correctly even when they are misspelled.
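
To illustrate the kind of distance this relies on, here is a minimal sketch of the optimal string alignment distance, a common restricted variant of Damerau-Levenshtein. It is a textbook version written for this example, not the code from the linked repository, and `osaDistance` is a hypothetical name.

```javascript
// Optimal string alignment distance: the number of insertions, deletions,
// substitutions and adjacent transpositions needed to turn a into b.
// Illustrative sketch only; not the implementation used by this package.
function osaDistance(a, b) {
    const s = Array.from(a);
    const t = Array.from(b);
    // d[i][j] = distance between the first i chars of s and the first j chars of t
    const d = Array.from({ length: s.length + 1 }, () => new Array(t.length + 1).fill(0));
    for (let i = 0; i <= s.length; i++) d[i][0] = i;
    for (let j = 0; j <= t.length; j++) d[0][j] = j;
    for (let i = 1; i <= s.length; i++) {
        for (let j = 1; j <= t.length; j++) {
            const cost = s[i - 1] === t[j - 1] ? 0 : 1;
            d[i][j] = Math.min(
                d[i - 1][j] + 1,       // deletion
                d[i][j - 1] + 1,       // insertion
                d[i - 1][j - 1] + cost // substitution or match
            );
            // adjacent transposition, e.g. "uo" <-> "ou"
            if (i > 1 && j > 1 && s[i - 1] === t[j - 2] && s[i - 2] === t[j - 1]) {
                d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
            }
        }
    }
    return d[s.length][t.length];
}

console.log(osaDistance("hundrid", "hundred")); //=> 1
console.log(osaDistance("fuor", "four"));       //=> 1 (one transposition)
console.log(osaDistance("читырэ", "четыре"));   //=> 2
```

A small distance relative to the word length suggests a misspelling of a known number word rather than a different word.
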
**Supports the Russian and English languages.**

# Usage
### Function parseWord(string, errorLimit):{Array}
#### Arguments
1. string {**String**}: source string.
2. errorLimit {**Number**}: from 0.0 to 1.0; the lower the value, the fewer results. Used for recognizing misspelled words.

Parses all words in the string into numbers.
Returns all numbers found.
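
The README describes `errorLimit` as a multiplier applied to each word's own error limit: 0 rejects misspelled words, 1 accepts a word when its error is below that word's limit. A rough sketch of such a check follows. `acceptsMatch` is a hypothetical helper, the per-word limit of 2 is a made-up value, and the package may measure the error differently.

```javascript
// Hypothetical acceptance rule; not part of the package's API.
// distance:       number of mistakes found in the candidate word
// wordErrorLimit: error limit stored for the matched number word (assumed value)
// errorLimit:     multiplier passed by the caller
function acceptsMatch(distance, wordErrorLimit, errorLimit) {
    if (distance === 0) return true; // exact matches always pass
    return distance < wordErrorLimit * errorLimit;
}

console.log(acceptsMatch(1, 2, 1)); //=> true  (1 mistake, scaled limit 2)
console.log(acceptsMatch(1, 2, 0)); //=> false (multiplier 0 rejects any mistake)
console.log(acceptsMatch(0, 2, 0)); //=> true  (no mistakes)
```
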
#### Usage example:
```javascript
const { parseWord } = require('@alordash/parse-word-to-number');

//Parse a single word
let parsedWord = parseWord("twonty-one");
console.log(parsedWord[0].value);
//=> 20
console.log(parsedWord[1].value);
//=> 1

//Misspelled Russian "четыре" (four)
parsedWord = parseWord("читырэ");
console.log(parsedWord[0].value);
//=> 4

//The second argument is a multiplier for the allowed mistakes (from 0.0 and up), where
//0: do not accept words with mistakes,
//1: accept a word if its error < the error limit for that word.
//The limits for all words are listed in the /lib/expressions/*.csv files
parsedWord = parseWord("hundrid", 1);
console.log(parsedWord[0].value);
//=> 100

parsedWord = parseWord("hundrid", 0);
console.log(parsedWord[0]);
//=> undefined
```

### Function parseString(string, errorLimit):{String}
#### Arguments
1. string {**String**}: source string.
2. errorLimit {**Number**}: from 0.0 to 1.0; the lower the value, the fewer results. Used for recognizing misspelled words.

Parses all words in the string into numbers and combines them.
Returns the string with the parsed numbers in place of the words.
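
To show what "combines them" means, here is one common way to fold a run of number-word values into a single number. This is an assumption made for illustration, not the package's own algorithm, and `combineNumberValues` is a hypothetical helper.

```javascript
// Folds the values of consecutive number words into one number.
// Hypothetical sketch; the package may combine values differently.
function combineNumberValues(values) {
    let total = 0;   // finished groups (thousands, millions, ...)
    let current = 0; // group being built (0..999)
    for (const v of values) {
        if (v >= 1000) {
            // a scale word closes the current group: "295" + "thousand"
            total += (current === 0 ? 1 : current) * v;
            current = 0;
        } else if (v === 100) {
            // "hundred" multiplies what came before it: "four" + "hundred"
            current = (current === 0 ? 1 : current) * 100;
        } else {
            current += v;
        }
    }
    return total + current;
}

console.log(combineNumberValues([4, 100, 70, 6]));    //=> 476
console.log(combineNumberValues([200, 90, 5, 1000])); //=> 295000
console.log(combineNumberValues([100, 80, 3]));       //=> 183
```
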
#### Usage example:
```javascript
const { parseString } = require('@alordash/parse-word-to-number');

console.log(parseString("four-huntred-sevinty-six balloons"));
//=> 476 balloons

//Misspelled Russian: "295 thousand spoons, 183 plates"
console.log(parseString("двести дивяносто пять тысоч ложек сто восмьдесят три тарелки"));
//=> 295000 ложек 183 тарелки

//With the mistakes multiplier set to 0, misspelled words are not recognized
console.log(parseString("four-huntred-sevinty-six balloons", 0));
//=> 4 balloons

console.log(parseString("двести дивяносто пять тысоч ложек сто восмьдесят три тарелки", 0));
//=> 200 дивяносто 5 тысоч ложек 100 восмьдесят 3 тарелки
```

### Getting an array of ConvertedWords
#### Class ConvertedWord
```javascript
class ConvertedWord {
    //@type {String}
    text; //Text of the word
    //@type {Array.<Number>}
    indexes; //Indexes of the words used from the original string
}
```

### Function arrayParseString(string, errorLimit):{Array.<ConvertedWord>}
Works the same as the parseString function and accepts the same arguments, except that it returns an array of converted words.
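
The `indexes` field makes it possible to trace each converted word back to the source text. The sketch below hard-codes an array of the shape shown in this README's example output instead of calling the package. It assumes that indexes count whitespace-separated words of the source string, and `describe` is a hypothetical helper.

```javascript
const source = "four huntred sevinty-six balloons";

// Shape and values taken from the README's example output for this input.
const converted = [
    { text: "476", indexes: [0, 1, 2] },
    { text: "balloons", indexes: [3] }
];

// Assumption: indexes refer to whitespace-separated words of the source.
const sourceWords = source.split(/\s+/);

// Pairs every converted word with the source words it was built from.
function describe(convertedWords, words) {
    return convertedWords.map(({ text, indexes }) => ({
        text,
        from: indexes.map((i) => words[i]).join(" ")
    }));
}

const described = describe(converted, sourceWords);
console.log(described[0].from); //=> four huntred sevinty-six
console.log(described[1].from); //=> balloons
```
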
#### Usage example:
```javascript
const { arrayParseString } = require('@alordash/parse-word-to-number');

let result = arrayParseString("four huntred sevinty-six balloons");
console.log(JSON.stringify(result));
//=> [{"text":"476","indexes":[0,1,2]},{"text":"balloons","indexes":[3]}]

result = arrayParseString("двести дивяносто пять тысоч ложек сто восмьдесят три тарелки");
console.log(JSON.stringify(result));
//=> [{"text":"295000","indexes":[0,1,2,3]},{"text":"ложек","indexes":[4]},{"text":"183","indexes":[5,6,7]},{"text":"тарелки","indexes":[8]}]
```

# Adding custom expressions
You can add new expressions to cover more cases by creating a .csv file inside the [lib/expressions](https://github.com/alordash/parse-word-to-number/tree/master/lib/expressions) folder.
Fill the new .csv file using the following format:
```
META;;;;
separators;;;;
%separators_symbols% (for example I'm using "-" as separator for English);;;;
text;value;multiply level;errors limit;rank
String;Number;Number;Number;Number
```
For a better understanding, see the example .csv files in the [lib/expressions](https://github.com/alordash/parse-word-to-number/tree/master/lib/expressions) folder.
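
As a rough illustration of the format, the sketch below reads a semicolon-separated expressions file into objects. The sample rows are made-up values, not copied from the package's files. `parseExpressions` is a hypothetical helper, and the assumption that data rows directly follow the `text;value;...` header should be checked against the real files.

```javascript
// Hypothetical sample; the two data rows are invented for illustration.
const sample = [
    "META;;;;",
    "separators;;;;",
    "-;;;;",
    "text;value;multiply level;errors limit;rank",
    "hundred;100;2;2;2",
    "thousand;1000;3;2;3"
].join("\n");

// Splits the file into its separators line and its expression rows.
function parseExpressions(csvText) {
    const rows = csvText
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "")
        .map((line) => line.split(";"));
    const headerIndex = rows.findIndex((row) => row[0] === "text");
    if (headerIndex === -1) throw new Error("header row not found");
    // The line after "separators" is assumed to hold the separator symbols.
    const separatorsIndex = rows.findIndex((row) => row[0] === "separators");
    const separators =
        separatorsIndex !== -1 && separatorsIndex + 1 < headerIndex
            ? rows[separatorsIndex + 1][0]
            : "";
    const expressions = rows
        .slice(headerIndex + 1)
        .map(([text, value, multiplyLevel, errorsLimit, rank]) => ({
            text,
            value: Number(value),
            multiplyLevel: Number(multiplyLevel),
            errorsLimit: Number(errorsLimit),
            rank: Number(rank)
        }));
    return { separators, expressions };
}

const parsed = parseExpressions(sample);
console.log(parsed.separators);          //=> -
console.log(parsed.expressions[0].value); //=> 100
```
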