{"id":13494926,"url":"https://github.com/winkjs/wink-nlp-utils","last_synced_at":"2025-04-06T13:10:14.314Z","repository":{"id":44677654,"uuid":"91115810","full_name":"winkjs/wink-nlp-utils","owner":"winkjs","description":"NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.","archived":false,"fork":false,"pushed_at":"2024-03-03T07:32:27.000Z","size":3128,"stargazers_count":111,"open_issues_count":3,"forks_count":12,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-04-14T07:52:53.756Z","etag":null,"topics":["bag-of-words","natural-language-processing","ngrams","nlp","phonetize","sentence-boundary-detection","stem","stop-words","tokenize"],"latest_commit_sha":null,"homepage":"http://winkjs.org/wink-nlp-utils/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/winkjs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-12T17:44:31.000Z","updated_at":"2024-05-10T20:11:40.552Z","dependencies_parsed_at":"2024-05-10T20:11:39.999Z","dependency_job_id":"7c6235ed-69ad-45bb-b210-ad040294357d","html_url":"https://github.com/winkjs/wink-nlp-utils","commit_stats":{"total_commits":133,"total_committers":4,"mean_commits":33.25,"dds":0.5338345864661654,"last_synced_commit":"9e6f28e7815b493c5632d249e37afd22ac56a0c2"},"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/winkjs%2Fwink-nlp-utils","tags_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/winkjs%2Fwink-nlp-utils/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/winkjs%2Fwink-nlp-utils/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/winkjs%2Fwink-nlp-utils/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/winkjs","download_url":"https://codeload.github.com/winkjs/wink-nlp-utils/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247485287,"owners_count":20946398,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bag-of-words","natural-language-processing","ngrams","nlp","phonetize","sentence-boundary-detection","stem","stop-words","tokenize"],"created_at":"2024-07-31T19:01:29.532Z","updated_at":"2025-04-06T13:10:14.279Z","avatar_url":"https://github.com/winkjs.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"\n# wink-nlp-utils\n\nNLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.\n\n### [![Build Status](https://app.travis-ci.com/winkjs/wink-nlp-utils.svg?branch=master)](https://app.travis-ci.com/github/winkjs/wink-nlp-utils) [![Coverage Status](https://coveralls.io/repos/github/winkjs/wink-nlp-utils/badge.svg?branch=master)](https://coveralls.io/github/winkjs/wink-nlp-utils?branch=master) [![Gitter](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/winkjs/Lobby)\n\n[\u003cimg align=\"right\" 
src=\"https://decisively.github.io/wink-logos/logo-title.png\" width=\"100px\" \u003e](http://wink.org.in/)\n\nPrepare raw text for Natural Language Processing (NLP) using **`wink-nlp-utils`**. It offers a set of [APIs](http://wink.org.in/wink-nlp-utils/) to work on [strings](http://wink.org.in/wink-nlp-utils/#string) such as names, sentences, paragraphs and [tokens](http://wink.org.in/wink-nlp-utils/#tokens) represented as an array of strings/words. They perform the required pre-processing for many ML tasks such as [semantic search](https://www.npmjs.com/package/wink-bm25-text-search) and [classification](https://www.npmjs.com/package/wink-naive-bayes-text-classifier).\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\u003ch3\u003e👉🏽\u003c/h3\u003e\u003c/td\u003e\u003ctd\u003e\n    We \u003cb\u003erecommend using \u003ca href=\"https://github.com/winkjs/wink-nlp?tab=readme-ov-file#readme\"\u003ewinkNLP\u003c/a\u003e\u003c/b\u003e for core natural language processing tasks. \u003cbr/\u003e\u003cbr/\u003eIt \u003ca href=\"https://winkjs.org/wink-nlp/document.html\"\u003eperforms\u003c/a\u003e Tokenization, Sentence Boundary Detection, and Named Entity Recognition at \u003ca href=\"https://observablehq.com/@winkjs/how-to-measure-winknlps-speed-on-browsers\"\u003eblazing fast speeds\u003c/a\u003e. It supports all your \u003ca href=\"https://winkjs.org/wink-nlp/leveraging-out.html\"\u003etext processing needs\u003c/a\u003e starting from Sentiment Analysis, POS Tagging, Lemmatization, Stemming, Stop Word Removal, Negation Handling, Bigrams to Frequency Table Creation and more. 
\u003cbr/\u003e\u003cbr/\u003e\u003cb\u003e\u003ca href=\"https://github.com/winkjs/wink-nlp?tab=readme-ov-file#readme\"\u003eWinkNLP\u003c/a\u003e\u003c/b\u003e features user-friendly declarative APIs for \u003ca href=\"https://winkjs.org/wink-nlp/each.html\"\u003eIteration\u003c/a\u003e, \u003ca href=\"https://winkjs.org/wink-nlp/filter.html\"\u003eFiltering\u003c/a\u003e, and \u003ca href=\"https://winkjs.org/wink-nlp/visualizing-markup.html\"\u003eText Visualization\u003c/a\u003e, and \u003ca href=\"https://winkjs.org/wink-nlp/wink-nlp-in-browsers.html\"\u003eruns\u003c/a\u003e on web browsers.\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\n### Installation\nUse [npm](https://www.npmjs.com/package/wink-nlp-utils) to install:\n```\nnpm install wink-nlp-utils --save\n```\n\n\n### Getting Started\n`wink-nlp-utils` provides over **36 utility functions** for Natural Language Processing tasks. Representative examples include extracting a person's name from a string, composing a training corpus for a chat bot, sentence boundary detection, tokenization and stop word removal:\n```javascript\n\n// Load wink-nlp-utils\nvar nlp = require( 'wink-nlp-utils' );\n\n// Extract person's name from a string:\nvar name = nlp.string.extractPersonsName( 'Dr. Sarah Connor M. Tech., PhD. - AI' );\nconsole.log( name );\n// -\u003e 'Sarah Connor'\n\n// Compose all possible sentences from a string:\nvar str = '[I] [am having|have] [a] [problem|question]';\nconsole.log( nlp.string.composeCorpus( str ) );\n// -\u003e [ 'I am having a problem',\n// -\u003e   'I am having a question',\n// -\u003e   'I have a problem',\n// -\u003e   'I have a question' ]\n\n// Sentence Boundary Detection.\nvar para = 'AI Inc. is focussing on AI. I work for AI Inc. My mail is r2d2@yahoo.com';\nconsole.log( nlp.string.sentences( para ) );\n// -\u003e [ 'AI Inc. 
is focussing on AI.',\n//      'I work for AI Inc.',\n//      'My mail is r2d2@yahoo.com' ]\n\n// Tokenize a sentence.\nvar s = 'For details on wink, check out http://winkjs.org/ URL!';\nconsole.log( nlp.string.tokenize( s, true ) );\n// -\u003e [ { value: 'For', tag: 'word' },\n//      { value: 'details', tag: 'word' },\n//      { value: 'on', tag: 'word' },\n//      { value: 'wink', tag: 'word' },\n//      { value: ',', tag: 'punctuation' },\n//      { value: 'check', tag: 'word' },\n//      { value: 'out', tag: 'word' },\n//      { value: 'http://winkjs.org/', tag: 'url' },\n//      { value: 'URL', tag: 'word' },\n//      { value: '!', tag: 'punctuation' } ]\n\n// Remove stop words:\nvar t = nlp.tokens.removeWords( [ 'mary', 'had', 'a', 'little', 'lamb' ] );\nconsole.log( t );\n// -\u003e [ 'mary', 'little', 'lamb' ]\n\n```\n\nTry [experimenting with these examples on Runkit](https://npm.runkit.com/wink-nlp-utils) in the browser.\n\n### Documentation\nCheck out the [wink NLP utilities API](http://winkjs.org/wink-nlp-utils/) documentation to learn more.\n\n### Need Help?\nIf you spot a bug that has not yet been reported, raise a new [issue](https://github.com/winkjs/wink-nlp-utils/issues) or consider fixing it and sending a pull request.\n\n### About wink\n[Wink](http://winkjs.org/) is a family of open source packages for **Statistical Analysis**, **Natural Language Processing** and **Machine Learning** in NodeJS. 
The code is **thoroughly documented** for easy human comprehension and has a **test coverage of ~100%**, providing the reliability needed to build production-grade solutions.\n\n\n### Copyright \u0026 License\n**wink-nlp-utils** is copyright 2017-22 [GRAYPE Systems Private Limited](http://graype.in/).\n\nIt is licensed under the terms of the MIT License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinkjs%2Fwink-nlp-utils","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwinkjs%2Fwink-nlp-utils","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinkjs%2Fwink-nlp-utils/lists"}