https://github.com/apache/opennlp-sandbox
Apache OpenNLP Sandbox
apache compling languagetechnology nlp opennlp textprocessing
- Host: GitHub
- URL: https://github.com/apache/opennlp-sandbox
- Owner: apache
- License: apache-2.0
- Created: 2016-10-05T07:00:08.000Z (over 8 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T06:05:43.000Z (about 2 months ago)
- Last Synced: 2025-03-31T07:22:31.153Z (about 2 months ago)
- Topics: apache, compling, languagetechnology, nlp, opennlp, textprocessing
- Language: Java
- Homepage: https://opennlp.apache.org/
- Size: 33 MB
- Stars: 42
- Watchers: 17
- Forks: 32
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
README
Welcome to Apache OpenNLP!
===========

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.
This sandbox of the toolkit is written mostly in Java and provides support for special NLP tasks, such as
word sense disambiguation, coreference resolution, text summarization, and more!
These tasks are usually required to build text processing services.

The goal of the OpenNLP sandbox is to provide extra components, potentially in an experimental stage.
OpenNLP sandbox code can be used programmatically through its Java API; some components can also be used from a terminal through their CLI.
## Useful Links
For additional information, visit the [OpenNLP Home Page](http://opennlp.apache.org/).

You can use OpenNLP with any language; demo models are provided [here](https://downloads.apache.org/opennlp/models/).
The models are fully compatible with the latest release and can be used for testing or getting started.

> [!NOTE]
> Please train your own models for all other use cases.

Documentation, including JavaDocs, code usage and command-line interface examples, is available [here](http://opennlp.apache.org/docs/).
You can also follow our [mailing lists](http://opennlp.apache.org/mailing-lists.html) for news and updates.
## Overview
Currently, the sandbox consists of the following components:
* `caseeditor-corpus-server-plugin`: A set of Java classes for [Apache UIMA](https://uima.apache.org), packaged as an Eclipse plugin, to integrate corpora.
* `caseeditor-opennlp-plugin`: An OpenNLP plugin for [Apache UIMA](https://uima.apache.org).
* `corpus-server`: A multi-module component to create, search, remove, and serve multiple corpora.
* `mahout-addon`: An addon for [Apache Mahout](https://mahout.apache.org).
* `mallet-addon`: An addon for [Mallet](https://mimno.github.io/Mallet/topics.html) targeting topic modelling techniques.
* `modelbuilder-addon`: A set of classes to build models.
* `nlp-utils`: A set of OpenNLP util classes.
* `opennlp-coref`: A component to conduct co-reference resolution.
* `opennlp-dl`: An adapter component for [deeplearning4j](https://deeplearning4j.konduit.ai).
* `opennlp-grpc`: An implementation of a gRPC backend for OpenNLP.
* `opennlp-similarity`: A set of components that solve a number of text processing and search tasks, see further details in this [README.md](opennlp-similarity/README.md).
* `opennlp-wsd`: A set of components that allow for word sense disambiguation.
* `summarizer`: A set of classes providing text summarization.
* `tagging-server`: A RESTful webservice to allow for NER, POS tagging, sentence detection and tokenization.
* `tf-ner-poc`: An adapter component for [TensorFlow](https://www.tensorflow.org), in an early proof-of-concept (poc) stage.
* `wikinews-importer`: A set of classes to process and annotate text formatted in [MediaWiki markup](https://www.mediawiki.org/wiki/Help:Formatting).

## Getting Started
You can import the core toolkit directly from Maven, SBT or Gradle after you have built it locally:
#### Maven
```
<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-sandbox</artifactId>
    <version>${opennlp.version}</version>
</dependency>
```

#### SBT
```
libraryDependencies += "org.apache.opennlp" % "opennlp-sandbox" % "${opennlp.version}"
```

#### Gradle
```
implementation group: "org.apache.opennlp", name: "opennlp-sandbox", version: "${opennlp.version}"
```

For more details, please check our [documentation](http://opennlp.apache.org/docs/).
## Building OpenNLP
At least JDK 21 and Maven 3.3.9 are required to build the sandbox components.
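
If you want to verify these minimums before starting a build, the following is a minimal sketch rather than part of the sandbox itself. The class `BuildPrerequisites` and its methods are hypothetical names introduced for illustration, and the comparison assumes plain dotted numeric versions such as `3.9.6` (no qualifiers like `-SNAPSHOT`).

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenNLP: checks the minimums stated above.
public class BuildPrerequisites {

    static final int MIN_JDK_FEATURE = 21;     // "At least JDK 21"
    static final String MIN_MAVEN = "3.3.9";   // "Maven 3.3.9"

    // Compares two dotted numeric versions; missing components count as 0.
    static int compareVersions(String left, String right) {
        int[] a = parse(left);
        int[] b = parse(right);
        int length = Math.max(a.length, b.length);
        for (int i = 0; i < length; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Assumption: every component is a plain integer.
    private static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    static boolean jdkIsSufficient(int featureVersion) {
        return featureVersion >= MIN_JDK_FEATURE;
    }

    static boolean mavenIsSufficient(String mavenVersion) {
        return compareVersions(mavenVersion, MIN_MAVEN) >= 0;
    }

    public static void main(String[] args) {
        // Feature version of the JDK running this class, e.g. 21.
        int feature = Runtime.version().feature();
        System.out.println("JDK " + feature
                + (jdkIsSufficient(feature) ? " is sufficient" : " is too old"));

        // Optionally pass the Maven version (as reported by `mvn -version`) as the first argument.
        if (args.length > 0) {
            System.out.println("Maven " + args[0]
                    + (mavenIsSufficient(args[0]) ? " is sufficient" : " is too old"));
        }
    }
}
```

Comparing component by component avoids the pitfall of plain string comparison, which would rank `3.10.0` below `3.9.9`.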
After cloning the repository, go into the destination directory and run:
```
mvn install
```

## Contributing
The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project.
Every contribution is welcome and needed to make it better.
A contribution can be anything from a small documentation typo fix to a new component.

If you would like to get involved, please follow the instructions [here](https://github.com/apache/opennlp/blob/main/.github/CONTRIBUTING.md).