https://github.com/gatenlp/gateplugin-tagger_np_chunking
A GATE-based implementation of the Ramshaw and Marcus BaseNP chunker
- Host: GitHub
- URL: https://github.com/gatenlp/gateplugin-tagger_np_chunking
- Owner: GateNLP
- License: lgpl-3.0
- Created: 2017-06-22T11:49:07.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-02-10T11:56:38.000Z (almost 2 years ago)
- Last Synced: 2024-04-16T07:59:26.524Z (9 months ago)
- Language: Java
- Homepage:
- Size: 49.8 KB
- Stars: 0
- Watchers: 14
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NP Chunker
Copyright (C) 2003-2017 The University of Sheffield

## DESCRIPTION
The application is a Java implementation of the Ramshaw and Marcus
BaseNP chunker (in fact the files in the resources directory are
taken straight from their original distribution). It attempts to
insert brackets marking noun phrases in text that has been marked
with POS tags in the same format as the output of Eric Brill's
transformational tagger. The output from this version should be
identical to the output of the original C++/Perl version released
by Ramshaw and Marcus.

A wrapper is also included which allows the easy use of this chunker
within the GATE framework (http://gate.ac.uk).

For more information about baseNP structures and the use of
transformation-based learning to derive them, see "Text Chunking
Using Transformation-Based Learning", Lance Ramshaw & Mitchell Marcus,
Proceedings of the Third ACL Workshop on Very Large Corpora, MIT,
June, 1995 (http://xxx.lanl.gov/e-print/cmp-lg/9505040)

## DIFFERENCES FROM THE ORIGINAL
The major difference is that a POS tag which is not in the mapping
file is assigned the chunk tag 'I'. The original version simply
failed if an unknown POS tag was encountered.
When using the GATE wrapper, the unknown chunk tag can be changed from
'I' to any other legal tag (B or O).
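
The fallback described above can be pictured with a minimal sketch. It is not the plugin's code. The class `BaseNpSketch`, its method names, the contents of the mapping table and the exact bracket layout are all assumptions made for illustration. The sketch also skips the learned transformation rules that the real chunker applies after the initial lookup. It only shows two things: an unknown POS tag receives a configurable chunk tag instead of causing a failure, and a sequence of I/O/B chunk tags can be rendered as brackets around `word/TAG` tokens.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; these are not the plugin's actual classes.
class BaseNpSketch {

    // Hypothetical stand-in for the mapping file: POS tag -> initial chunk tag.
    static final Map<String, Character> POS_TO_CHUNK = new HashMap<>();
    static {
        POS_TO_CHUNK.put("DT", 'I');
        POS_TO_CHUNK.put("NN", 'I');
        POS_TO_CHUNK.put("VBD", 'O');
        POS_TO_CHUNK.put("IN", 'O');
    }

    // Returns the initial chunk tag for a POS tag. A POS tag missing from the
    // mapping gets unknownTag rather than triggering an error.
    static char initialChunkTag(String posTag, char unknownTag) {
        if (unknownTag != 'I' && unknownTag != 'B' && unknownTag != 'O') {
            throw new IllegalArgumentException("Chunk tag must be I, B or O");
        }
        return POS_TO_CHUNK.getOrDefault(posTag, unknownTag);
    }

    // Renders chunk tags as brackets: I = inside a baseNP, O = outside,
    // B = first word of a baseNP that directly follows another baseNP.
    static String bracket(String[] tokens, char[] chunkTags) {
        StringBuilder out = new StringBuilder();
        boolean open = false;
        for (int i = 0; i < tokens.length; i++) {
            char tag = chunkTags[i];
            if (open && tag != 'I') {      // O or B ends the current phrase
                out.append(" ]");
                open = false;
            }
            if (i > 0) {
                out.append(' ');
            }
            if (!open && tag != 'O') {     // I or B starts a new phrase
                out.append("[ ");
                open = true;
            }
            out.append(tokens[i]);
        }
        if (open) {
            out.append(" ]");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"The/DT", "cat/NN", "sat/VBD", "on/IN", "the/DT", "mat/NN"};
        char[] tags = new char[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // The POS tag is the part after the last '/' in Brill-style output.
            String pos = tokens[i].substring(tokens[i].lastIndexOf('/') + 1);
            tags[i] = initialChunkTag(pos, 'I');
        }
        System.out.println(bracket(tokens, tags));
    }
}
```

Run as written, `main` prints `[ The/DT cat/NN ] sat/VBD on/IN [ the/DT mat/NN ]`. Passing `'O'` instead of `'I'` to `initialChunkTag` mimics changing the unknown chunk tag through the GATE wrapper: tokens with unmapped POS tags would then start outside any noun phrase.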