{"id":19557374,"url":"https://github.com/andrefs/node-tnt-tagger","last_synced_at":"2025-08-06T16:33:25.716Z","repository":{"id":139106513,"uuid":"183111378","full_name":"andrefs/node-tnt-tagger","owner":"andrefs","description":"A statistical part-of-speech tagger","archived":false,"fork":false,"pushed_at":"2020-05-31T12:16:24.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-26T08:15:26.423Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andrefs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-23T23:38:35.000Z","updated_at":"2020-05-31T12:16:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"1ac5e4bc-e989-4b4a-86a0-d2a7f9dcbe2f","html_url":"https://github.com/andrefs/node-tnt-tagger","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/andrefs/node-tnt-tagger","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fnode-tnt-tagger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fnode-tnt-tagger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fnode-tnt-tagger/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fnode-tnt-tagger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andrefs","download_url":
"https://codeload.github.com/andrefs/node-tnt-tagger/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrefs%2Fnode-tnt-tagger/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269112468,"owners_count":24361978,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-06T02:00:09.910Z","response_time":99,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T04:42:06.268Z","updated_at":"2025-08-06T16:33:25.433Z","avatar_url":"https://github.com/andrefs.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# tnt-tagger\n\nA statistical part-of-speech tagger.\n\nThis is an implementation of Thorsten Brants' TnT tagger. TnT, which\nstands for Trigrams'n'Tags, is _\"an efficient statistical\npart-of-speech tagger\"_, and its implementation is described in the\narticle [TnT -- A Statistical Part-of-Speech\nTagger](http://tagh.de/tom/wp-content/uploads/brants-2000.pdf).\n\nIn fact, **tnt-tagger** is a port of Python's [NLTK\nimplementation](https://www.nltk.org/_modules/nltk/tag/tnt.html) of\nthat tagger.\n\nThis is currently a work in progress. 
Future work includes refactoring\nthe code to make it more idiomatic JavaScript (for now, it feels a bit\nartificial due to the direct translation from Python).\n\n## Installation\n\n```bash\n$ npm install tnt-tagger\n```\n\n## Usage\n\n```js\n\n// Path used inside this repository; an installed copy would presumably\n// be loaded with require('tnt-tagger') instead.\nconst TnT = require('./index');\nconst {Sentence,Token} = require('cetem-publico');\n\n// Training data: one sentence whose tokens carry part-of-speech tags.\nconst ts = [new Sentence(1, [\n    new Token('Jersei', {pos: 'N'      }) ,\n    new Token('atinge', {pos: 'V'      }) ,\n    new Token('média',  {pos: 'N'      }) ,\n    new Token('de',     {pos: 'PREP'   }) ,\n    new Token('Cr$',    {pos: 'CUR'    }) ,\n    new Token('1,4',    {pos: 'NUM'    }) ,\n    new Token('milhão', {pos: 'N'      }) ,\n    new Token('em',     {pos: 'PREP|+' }) ,\n    new Token('a',      {pos: 'ART'    }) ,\n    new Token('venda',  {pos: 'N'      }) ,\n    new Token('de',     {pos: 'PREP|+' }) ,\n    new Token('a',      {pos: 'ART'    }) ,\n    new Token('Pinhal', {pos: 'NPROP'  }) ,\n    new Token('em',     {pos: 'PREP'   }) ,\n    new Token('São',    {pos: 'NPROP'  }) ,\n    new Token('Paulo',  {pos: 'NPROP'  })\n  ])];\n\n// The corpus exposes a generator that yields the training sentences.\nlet corpus = {\n  sentences:  function*(){\n    const n = ts.length;\n\n    for(let i=0; i\u003cn; i++){\n      yield ts[i];\n    }\n  }\n};\n\n// The same sentence without tags, to be tagged by the trained model.\nlet s = new Sentence(1, [\n    new Token('Jersei'),\n    new Token('atinge'),\n    new Token('média' ),\n    new Token('de'    ),\n    new Token('Cr$'   ),\n    new Token('1,4'   ),\n    new Token('milhão'),\n    new Token('em'    ),\n    new Token('a'     ),\n    new Token('venda' ),\n    new Token('de'    ),\n    new Token('a'     ),\n    new Token('Pinhal'),\n    new Token('em'    ),\n    new Token('São'   ),\n    new Token('Paulo' ),\n  ]);\n\nlet t = new TnT();\n\n// Train on the corpus, then tag the untagged sentence and print the result.\nt.train(corpus)\n  .then(() =\u003e t.tag(s))\n  .then(console.log);\n\n```\n\n## Methods\n\n## TODO\n\n## Acknowledgements\n\nThanks to Thorsten Brants for the original version of this algorithm,\nand to NLTK's team for the implementation on which this module is\nbased.\n\n## Bugs and stuff\n\nOpen a 
GitHub issue or, preferably, send me a pull request.\n\n## License\n\nMIT\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrefs%2Fnode-tnt-tagger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrefs%2Fnode-tnt-tagger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrefs%2Fnode-tnt-tagger/lists"}