https://github.com/liamnichols/nltool

A CLI for Apple's NaturalLanguage framework

Host: GitHub
URL: https://github.com/liamnichols/nltool
Owner: liamnichols
License: mit
Created: 2019-11-09T22:41:43.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2019-11-13T20:17:30.000Z (over 6 years ago)
Last Synced: 2025-08-30T18:32:47.437Z (10 months ago)
Topics: cli, macos, natural-language-processing, nlp, nlp-machine-learning, tagger, tokenizer
Language: Swift
Size: 20.5 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

README

# NLTool

A command line interface wrapper around methods from Apple's NaturalLanguage framework.

## Installation

```
mint install liamnichols/nltool
```

## Usage

Use `nltool --help` for a list of commands and options. The two main supported commands are `tokenize` and `tagger tag`.

### Examples

Pipe an input string and tag its contents:

```
$ curl -s http://whatthecommit.com/index.txt | nltool tagger tag LexicalClass --omit-whitespace --omit-punctuation
+------------------------------------+
| Tags for Lexicalclass (Word)       |
+------------------------------------+
| Index | Range   | Tag      | Value |
+-------+---------+----------+-------+
| 0     | 0..<5   | Adverb   | Never |
| 1     | 6..<11  | Verb     | gonna |
| 2     | 12..<16 | Verb     | give  |
| 3     | 17..<20 | Pronoun  | you   |
| 4     | 21..<23 | Particle | up    |
+-------+---------+----------+-------+
```
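
For readers curious about what sits underneath, the following is a minimal sketch of a comparable tagging pass using the NaturalLanguage framework directly. It is not taken from the tool's source: the sample string, the integer-offset formatting, and the mapping of `--omit-whitespace` and `--omit-punctuation` to `NLTagger.Options` are assumptions made for illustration.

```swift
import NaturalLanguage

// Sample input (assumption); the README example pipes text in from a URL instead.
let text = "Never gonna give you up"

// A tagger configured for the LexicalClass scheme.
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

// Assumed counterparts of --omit-whitespace and --omit-punctuation.
let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]

var index = 0
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: options) { tag, range in
    // Convert String.Index bounds to character offsets for display.
    let lower = text.distance(from: text.startIndex, to: range.lowerBound)
    let upper = text.distance(from: text.startIndex, to: range.upperBound)
    print(index, "\(lower)..<\(upper)", tag?.rawValue ?? "-", text[range])
    index += 1
    return true // continue enumerating
}
```

The framework is only available on Apple platforms (macOS 10.14 or later), and the tags it assigns may differ between OS versions.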

See the built-in tag schemes available for a given token unit and language:

```
$ nltool tagger availableTagSchemes word en
+-------------------------------------+
| Available Tag Schemes for Word (en) |
+-------------------------------------+
| Language                            |
| Script                              |
| TokenType                           |
| NameType                            |
| LexicalClass                        |
| NameTypeOrLexicalClass              |
| Lemma                               |
+-------------------------------------+
```
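
The same kind of lookup can be sketched against the framework itself. This assumes the command corresponds to `NLTagger.availableTagSchemes(for:language:)`, which the README does not confirm; the list returned depends on the OS version and installed language assets.

```swift
import NaturalLanguage

// Ask the framework which tag schemes it supports for word-level
// tagging of English text.
let schemes = NLTagger.availableTagSchemes(for: .word, language: .english)

for scheme in schemes {
    print(scheme.rawValue)
}
```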

Output the results in JSON format:

```
$ nltool tokenize "First sentence. Second Sentence" --unit sentence --json --pretty-print
{
  "input" : "First sentence. Second Sentence",
  "tokens" : [
    {
      "attributes" : [
      ],
      "range" : [
        0,
        16
      ],
      "value" : "First sentence. "
    },
    {
      "attributes" : [
      ],
      "range" : [
        16,
        31
      ],
      "value" : "Second Sentence"
    }
  ],
  "unit" : "sentence"
}
```
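
JSON output is convenient when another program consumes the results. As a rough sketch, the payload above could be decoded in Swift with `Codable`. The type names `TokenizeResult` and `Token` are hypothetical, and treating `attributes` as an array of strings is an assumption, since the example only shows it empty.

```swift
import Foundation

// Hypothetical types mirroring the JSON shape shown above.
struct Token: Decodable {
    let attributes: [String] // assumption: element type is not shown in the README
    let range: [Int]         // [lowerBound, upperBound]
    let value: String
}

struct TokenizeResult: Decodable {
    let input: String
    let tokens: [Token]
    let unit: String
}

// The README's example output, inlined so the sketch is self-contained.
let json = """
{
  "input" : "First sentence. Second Sentence",
  "tokens" : [
    { "attributes" : [], "range" : [0, 16], "value" : "First sentence. " },
    { "attributes" : [], "range" : [16, 31], "value" : "Second Sentence" }
  ],
  "unit" : "sentence"
}
"""

let result = try JSONDecoder().decode(TokenizeResult.self, from: Data(json.utf8))
for token in result.tokens {
    print(token.range, token.value)
}
```

In practice the JSON would come from the tool's standard output rather than an inline string.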

View help information for any command with `--help`:

```
$ nltool tagger tag --help
Usage: nltool tagger tag  [] [options]

Tags the input string against the configured tag schemes

Options:
  --join-contractions    Contractions will be returned as one token.
  --join-names           Typically, multiple-word names will be returned as multiple tokens, following the standard tokenization practice of the tagger.
  --json                 Print output in JSON format
  --omit-other           Omit tokens of type Other (non-linguistic items, such as symbols).
  --omit-punctuation     Omit tokens of type Punctuation (all punctuation).
  --omit-whitespace      Omit tokens of type Whitespace (whitespace of all sorts).
  --omit-words           Omit tokens of type Word (items considered to be words).
  --pretty-print         Pretty Print JSON output when using --json command
  -h, --help             Show help information
  -u, --unit      Unit segmentation to tokenize by. Default is 'word'
```
