https://github.com/fergiemcdowall/term-vector
A node.js module that creates a term vector from a mixed text input. Supports stopword removal and customisable separators.
- Host: GitHub
- URL: https://github.com/fergiemcdowall/term-vector
- Owner: fergiemcdowall
- License: mit
- Created: 2015-03-25T09:19:12.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2024-11-28T11:14:34.000Z (5 months ago)
- Last Synced: 2025-03-31T06:03:42.387Z (about 1 month ago)
- Language: JavaScript
- Homepage:
- Size: 453 KB
- Stars: 19
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
[![NPM version][npm-version-image]][npm-url] [![NPM downloads][npm-downloads-image]][npm-url] [![MIT License][license-image]][license-url]
# term-vector
A node.js module that creates a term vector from tokenized text. Use `term-vector` when implementing a [vector space model](http://en.wikipedia.org/wiki/Vector_space_model).

**Works with Unicode!**

**Does ngrams!**

```javascript
const tv = require('term-vector')
// alternatively if you are all fancy and new-fangled:
// import tv from 'term-vector'
const tokens = 'this is really really really cool'.split(' ')

// just make a simple term vector
tv(tokens)
// [
// { term: [ 'cool' ], positions: [ 5 ] },
// { term: [ 'is' ], positions: [ 1 ] },
// { term: [ 'really' ], positions: [ 2, 3, 4 ] },
// { term: [ 'this' ], positions: [ 0 ] }
// ]

// make a term vector with ngrams of length 1 and 2
tv(tokens, { ngramLengths: [ 1, 2 ] })
// [
// { term: [ 'cool' ], positions: [ 5 ] },
// { term: [ 'is' ], positions: [ 1 ] },
// { term: [ 'is', 'really' ], positions: [ 1 ] },
// { term: [ 'really' ], positions: [ 2, 3, 4 ] },
// { term: [ 'really', 'cool' ], positions: [ 4 ] },
// { term: [ 'really', 'really' ], positions: [ 2, 3 ] },
// { term: [ 'this' ], positions: [ 0 ] },
// { term: [ 'this', 'is' ], positions: [ 0 ] }
// ]
```
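
Since the README points at the vector space model, here is a minimal sketch of how two term vectors might be compared. It assumes only the output shape shown above (`term` and `positions`); `toFrequencies` and `cosineSimilarity` are hypothetical helpers, not part of `term-vector`, and `docB` is a hand-written vector for `'this is cool'` rather than real library output. The frequency of a term is taken to be the number of recorded positions, and the two documents are compared by cosine similarity, which comes out at roughly 0.5 here.

```javascript
// docA is copied from the first example above: tv(tokens)
const docA = [
  { term: ['cool'], positions: [5] },
  { term: ['is'], positions: [1] },
  { term: ['really'], positions: [2, 3, 4] },
  { term: ['this'], positions: [0] }
]

// docB is hand-written in the same shape for the tokens 'this is cool'
// (an assumption for illustration, not generated by the library)
const docB = [
  { term: ['cool'], positions: [2] },
  { term: ['is'], positions: [1] },
  { term: ['this'], positions: [0] }
]

// Hypothetical helper: map each term (ngrams joined by a space)
// to its frequency, i.e. the number of positions it occurs at
function toFrequencies (vector) {
  const freqs = new Map()
  for (const { term, positions } of vector) {
    freqs.set(term.join(' '), positions.length)
  }
  return freqs
}

// Hypothetical helper: cosine similarity between two frequency maps
function cosineSimilarity (a, b) {
  let dot = 0
  for (const [term, count] of a) {
    dot += count * (b.get(term) || 0)
  }
  const norm = m =>
    Math.sqrt([...m.values()].reduce((sum, c) => sum + c * c, 0))
  const denom = norm(a) * norm(b)
  // Two empty vectors have no defined angle; treat them as dissimilar
  return denom === 0 ? 0 : dot / denom
}

console.log(cosineSimilarity(toFrequencies(docA), toFrequencies(docB)))
```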
[license-image]: http://img.shields.io/badge/license-MIT-blue.svg?style=flat
[license-url]: LICENSE
[npm-url]: https://npmjs.org/package/term-vector
[npm-version-image]: http://img.shields.io/npm/v/term-vector.svg?style=flat
[npm-downloads-image]: http://img.shields.io/npm/dm/term-vector.svg?style=flat