{"id":13661667,"url":"https://github.com/FinNLP/en-pos","last_synced_at":"2025-04-25T03:30:46.374Z","repository":{"id":57225044,"uuid":"78905741","full_name":"FinNLP/en-pos","owner":"FinNLP","description":"⚙️ [Processor] A better English POS tagger written in JavaScript","archived":false,"fork":false,"pushed_at":"2017-04-09T19:03:58.000Z","size":4556,"stargazers_count":54,"open_issues_count":5,"forks_count":8,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-02T11:38:39.831Z","etag":null,"topics":["english-pos-tagger","part-of-speech","part-of-speech-tagger","pos","pos-tagger"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FinNLP.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-01-14T02:39:32.000Z","updated_at":"2025-03-13T11:18:25.000Z","dependencies_parsed_at":"2022-08-24T11:01:07.750Z","dependency_job_id":null,"html_url":"https://github.com/FinNLP/en-pos","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FinNLP%2Fen-pos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FinNLP%2Fen-pos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FinNLP%2Fen-pos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FinNLP%2Fen-pos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FinNLP","download_url":"https://codeload.github.com/FinNLP/en-pos/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github
.com","kind":"github","repositories_count":250747673,"owners_count":21480691,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["english-pos-tagger","part-of-speech","part-of-speech-tagger","pos","pos-tagger"],"created_at":"2024-08-02T05:01:39.142Z","updated_at":"2025-04-25T03:30:45.581Z","avatar_url":"https://github.com/FinNLP.png","language":"TypeScript","funding_links":[],"categories":["TypeScript"],"sub_categories":[],"readme":"# en-pos\r\nA better English POS tagger written in JavaScript\r\n\r\n### Installation and usage\r\n\r\nInstall via NPM:\r\n\r\n```\r\nnpm i --save en-pos\r\n```\r\n\r\n**How to use**\r\n\r\n```javascript\r\nconst Tag = require(\"en-pos\").Tag;\r\nvar tags = new Tag([\"this\",\"is\",\"my\",\"sentence\"])\r\n.initial() // initial dictionary and pattern based tagging\r\n.smooth() // further context based smoothing\r\n.tags;\r\nconsole.log(tags);\r\n// [\"DT\",\"VBZ\",\"PRP$\",\"NN\"]\r\n```\r\n\r\n## Annotation Specification\r\n\r\nAnnotation | Name | Example\r\n--- | --- | ---\r\n**`NN`** | Noun | `dog` `man`\r\n**`NNS`** | Plural noun | `dogs` `men`\r\n**`NNP`** | Proper noun | `London` `Alex`\r\n**`NNPS`** | Plural proper noun | `Smiths`\r\n**`VB`** | Base form verb | `be`\r\n**`VBP`** | Present form verb | `throw`\r\n**`VBZ`** | Present form (3rd person) | `throws`\r\n**`VBG`** | Gerund form verb | `throwing`\r\n**`VBD`** | Past tense verb | `threw`\r\n**`VBN`** | Past participle verb | `thrown`\r\n**`MD`** | Modal verb | `can` `shall` `will` `may` `must` `ought`\r\n**`JJ`** | Adjective | `big` `fast`\r\n**`JJR`** | Comparative adjective | 
`bigger`\r\n**`JJS`** | Superlative adjective | `biggest`\r\n**`RB`** | Adverb | `not` `quickly` `closely`\r\n**`RBR`** | Comparative adverb | `less-closely` `faster`\r\n**`RBS`** | Superlative adverb | `fastest`\r\n**`DT`** | Determiner | `the` `a` `some` `both`\r\n**`PDT`** | Predeterminer | `all` `quite`\r\n**`PRP`** | Personal Pronoun | `I` `you` `he` `she`\r\n**`PRP$`** | Possessive Pronoun | `my` `your` `his` `her`\r\n**`POS`** | Possessive ending | `'s`\r\n**`IN`** | Preposition | `of` `by` `in`\r\n**`PR`** | Particle | `up` `off`\r\n**`TO`** | *to* | `to`\r\n**`WDT`** | Wh-determiner | `which` `that` `whatever` `whichever`\r\n**`WP`** | Wh-pronoun | `who` `whoever` `whom` `what`\r\n**`WP$`** | Wh-possessive | `whose`\r\n**`WRB`** | Wh-adverb | `how` `where` \r\n**`EX`** | Expletive there | `there`\r\n**`CC`** | Coordinating conjunction | `\u0026` `and` `nor` `or`\r\n**`CD`** | Cardinal Numbers | `1` `7` `77` `one`\r\n**`LS`** | List item marker | `1` `B` `C` `One`\r\n**`UH`** | Interjection | `ah` `oh` `oops`\r\n**`FW`** | Foreign Words | `viva` `mon` `toujours`\r\n**`,`** | Comma | `,`\r\n**`:`** | Mid-sent punct. | `:` `;` `...`\r\n**`.`** | Sent-final punct. 
| `.` `!` `?`\r\n**`(`** | Left parenthesis | `(` `{` `[`\r\n**`)`** | Right parenthesis | `)` `}` `]`\r\n**`#`** | Pound sign | `#`\r\n**`$`** | Currency symbols | `$` `€` `£` `¥`\r\n**`SYM`** | Other symbols | `+` `*` `/` `\u003c` `\u003e`\r\n**`EM`** | Emojis \u0026 emoticons | `:)` `❤`\r\n\r\n## Accuracy and performance\r\n\r\n#### TL;DR\r\n\r\n- When smoothing is enabled: **96.43%** accuracy (processing 132K tokens in 38 seconds)\r\n- When smoothing is disabled: **94.4%** accuracy (processing 132K tokens in 3 seconds)\r\n\r\n----\r\n\r\nAs of 25 Jan 2017, this library scored **96.43%** at the [Penn Treebank](http://www.cis.upenn.edu/~treebank/) test (0.3% away from being a [state of the art tagger](https://www.aclweb.org/aclwiki/index.php?title=POS_Tagging_(State_of_the_art))).\r\n\r\nI think it's safe to say that this is the most accurate POS tagger written in JavaScript, since the only other JS library I know of is [pos-js](https://github.com/neopunisher/pos-js), which scored **87.8%** when I tested it on the same treebank, though it was faster than my implementation when smoothing is enabled.\r\n\r\nHowever, if performance is what you're after rather than accuracy, you have the option to disable smoothing in this library. This significantly increases performance (38 seconds down to 3 seconds on the same 132K tokens), making this library even faster than pos-js but with far better accuracy (**94.4%**).\r\n\r\n## Building from source and testing\r\n\r\n- Build: `tsc` (requires typescript)\r\n- Test: `node test/test.ts`\r\n\r\n## Credits\r\n* This project is an optimization and (almost complete) rewrite of [Compendium](https://github.com/Ulflander/compendium-js)'s POS tagger.\r\n* **Compendium**'s tagger itself was based on **[fasttag](https://github.com/mark-watson/fasttag_v2)**.\r\n* **Fasttag** is based on [Eric Brill's POS 
tagger](https://en.wikipedia.org/wiki/Brill_tagger).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFinNLP%2Fen-pos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFinNLP%2Fen-pos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFinNLP%2Fen-pos/lists"}