{"id":13785706,"url":"https://github.com/neomatrix369/nlp-java-jvm-example","last_synced_at":"2026-04-08T23:31:55.770Z","repository":{"id":146434670,"uuid":"221056352","full_name":"neomatrix369/nlp-java-jvm-example","owner":"neomatrix369","description":"A repo with NLP examples of libraries/packages/framework written in Java/JVM","archived":false,"fork":false,"pushed_at":"2019-12-02T17:25:15.000Z","size":220,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-02T16:29:54.509Z","etag":null,"topics":["bash","clojure","docker","graal","graalvm","java","jvm","kotlin","natural-language-processing","natural-language-understanding","nlp","scala","shell"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neomatrix369.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-11-11T19:40:17.000Z","updated_at":"2020-05-25T01:11:29.000Z","dependencies_parsed_at":"2024-01-18T19:11:30.367Z","dependency_job_id":null,"html_url":"https://github.com/neomatrix369/nlp-java-jvm-example","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/neomatrix369/nlp-java-jvm-example","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neomatrix369%2Fnlp-java-jvm-example","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neomatrix369%2Fnlp-java-jvm-example/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neo
matrix369%2Fnlp-java-jvm-example/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neomatrix369%2Fnlp-java-jvm-example/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neomatrix369","download_url":"https://codeload.github.com/neomatrix369/nlp-java-jvm-example/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neomatrix369%2Fnlp-java-jvm-example/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31578967,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bash","clojure","docker","graal","graalvm","java","jvm","kotlin","natural-language-processing","natural-language-understanding","nlp","scala","shell"],"created_at":"2024-08-03T19:01:03.631Z","updated_at":"2026-04-08T23:31:55.737Z","avatar_url":"https://github.com/neomatrix369.png","language":"Jupyter Notebook","funding_links":[],"categories":["Golang"],"sub_categories":["Examples"],"readme":"# NLP Java/JVM [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\nNLP Java: [![NLP 
Java](https://img.shields.io/docker/pulls/neomatrix369/nlp-java.svg)](https://hub.docker.com/r/neomatrix369/nlp-java) | NLP Clojure: [![NLP Clojure](https://img.shields.io/docker/pulls/neomatrix369/nlp-clojure.svg)](https://hub.docker.com/r/neomatrix369/nlp-clojure) | NLP Kotlin: [![NLP Kotlin](https://img.shields.io/docker/pulls/neomatrix369/nlp-kotlin.svg)](https://hub.docker.com/r/neomatrix369/nlp-kotlin) | NLP Scala: [![NLP Scala](https://img.shields.io/docker/pulls/neomatrix369/nlp-scala.svg)](https://hub.docker.com/r/neomatrix369/nlp-scala)\n\n---\n\nRun a docker container with NLP libraries/frameworks written in Java/JVM languages, running under the traditional Java 11 (from OpenJDK or another source) or GraalVM.\n\nFind out more about [Natural Language Processing](https://en.wikipedia.org/wiki/Natural_language_processing) in the [NLP section](https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/natural-language-processing/README.md#natural-language-processing-nlp).\n\n## Goals\n\n- Run a docker container containing NLP libraries/frameworks written in Java/JVM languages\n- Ability to create custom docker images (scripts \u0026 docs provided)\n- Ability to debug the docker container\n- Run using the traditional JDK 11 (OpenJDK or vendor-specific versions)\n- Run using the polyglot JVM i.e. 
GraalVM JDK (Community version from Oracle Labs), when performing operations from the CLI\n- Play with and learn from some examples for each of the libraries provided\n\n## Libraries / frameworks provided\n\n### Java\n- [Stanford CoreNLP](https://stanfordnlp.github.io/CoreNLP/)\n- [Apache OpenNLP](https://opennlp.apache.org/) | See **[README](./images/java/opennlp/README.md#apache-opennlp-) for usage and examples**\n- [NLP4J: NLP Toolkit for JVM Languages](https://emorynlp.github.io/nlp4j/)\n- [Word2vec in Java](https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-word2vec)\n- [ReVerb: Web-Scale Open Information Extraction](https://github.com/knowitall/reverb/)\n- [OpenRegex: An efficient and flexible token-based regular expression language and engine](https://github.com/knowitall/openregex)\n- [CogcompNLP: Core libraries developed in the U of Illinois' Cognitive Computation Group](https://github.com/CogComp/cogcomp-nlp)\n- [MALLET - MAchine Learning for LanguagE Toolkit](http://mallet.cs.umass.edu/)\n- [RDRPOSTagger - A robust POS tagging toolkit available (in both Java \u0026 Python) together with pre-trained models for 40+ languages.](https://github.com/datquocnguyen/RDRPOSTagger)\n\n### Clojure\n- [Clojure-openNLP](https://github.com/dakrone/clojure-opennlp) - Natural Language Processing in Clojure (opennlp)\n- [Inflections-clj](https://github.com/r0man/inflections-clj) - Rails-like inflection library for Clojure and ClojureScript\n- [postagga](https://github.com/fekr/postagga) - A library to parse natural language in Clojure and ClojureScript\n\n### Kotlin\n- [Lingua](https://github.com/pemistahl/lingua/) - A language detection library for Kotlin and Java, suitable for long and short text alike\n- [Kotidgy](https://github.com/meiblorn/kotidgy) - An index-based text data generator written in Kotlin\n\n### Scala\n- [Saul](https://github.com/CogComp/saul) - Library for developing NLP systems, including built in modules like SRL, POS, 
etc.\n- [ATR4S](https://github.com/ispras/atr4s) - Toolkit with state-of-the-art automatic term recognition methods.\n- [tm](https://github.com/ispras/tm) - Implementation of topic modeling based on regularized multilingual PLSA.\n- [word2vec-scala](https://github.com/Refefer/word2vec-scala) - Scala interface to word2vec model; includes operations on vectors like word-distance and word-analogy.\n- [Epic](https://github.com/dlwh/epic) - Epic is a high performance statistical parser written in Scala, along with a framework for building complex structured prediction models.\n\n## Scripts provided\n\n**Scroll up to find the scripts listed below**\n\n- [docker-runner.sh](./docker-runner.sh): can perform a number of the actions below, depending on the flags passed to it:\n    - runs the container and brings you to the command prompt inside the container\n    - builds the docker base image and the language-specific (i.e. java, clojure, kotlin, scala) image; this takes under 5 minutes on a decent connection\n    - pushes pre-built docker images to Docker Hub (please pass in your own Docker username and later on enter Docker login details, see usage below)\n    - removes dangling images and terminated containers as housekeeping (helps save some disk space)\n- [Base Dockerfile](./images/base/Dockerfile) | [Java Dockerfile](./images/java/Dockerfile): Dockerfile scripts to help build the base and language (i.e. java, clojure, kotlin, scala) specific docker image of NLP Java/JVM in an isolated environment with the necessary dependencies.\n- [images folder](./images) - provided with scripts to build and the scripts included into the container for the base image and language (i.e. 
java, clojure, kotlin, scala) specific docker image\n\n## Usage\n\n**Help:**\n\n```bash\n$ ./docker-runner.sh --help\n\n       Usage: ./docker-runner.sh --dockerUserName [docker user name]\n                                 --language [language id]\n                                 --detach\n                                 --buildImage\n                                 --runContainer\n                                 --pushImageToHub\n                                 --cleanup\n                                 --help\n\n       --dockerUserName      docker user name as on Docker Hub\n                             (mandatory with build and push commands)\n       --language            language id as in java, clojure, scala, etc...\n       --detach              run container and detach from it,\n                             return control to console\n       --jdk                 name of the JDK to use (currently supports \n                             GRAALVM only, default is blank which \n                             enables the traditional JDK)\n       --javaopts            sets the JAVA_OPTS environment variable\n                             inside the container as it starts\n       --cleanup             (command action) remove exited containers and\n                             dangling images from the local repository\n       --buildImage          (command action) build the docker image\n       --runContainer        (command action) run the docker image as a docker container\n       --pushImageToHub      (command action) push the docker image built to Docker Hub\n       --help                shows the script usage help text\n```\n\n**Run the NLP Java/JVM docker container:**\n\n```bash\n$ ./docker-runner.sh --runContainer\n\nor\n\n$ ./docker-runner.sh --runContainer --dockerUserName [your docker user name]\n\nor run in GraalVM mode\n\n$ ./docker-runner.sh --runContainer --jdk \"GRAALVM\"\n\nor run by switching off JVMCI flag (default: on) when running in GRAALVM 
mode\n\n$ ./docker-runner.sh --javaopts \"-XX:-UseJVMCINativeLibrary\"\n```\n\n**Build the docker image:**\n\nEnsure your environment has the below variable set, or set it in your `.bashrc` or `.bash_profile` or the relevant startup script:\n\n```bash\nexport DOCKER_USER_NAME=\"your_docker_username\"\n```\n\nYou must have an account on Docker Hub under the above user name.\n\n\n```bash\n$ ./docker-runner.sh --buildImage\n\nor\n\n$ ./docker-runner.sh --buildImage --dockerUserName \"your_docker_username\"\n\nor\n\n$ ./docker-runner.sh --buildImage --language [language_id]\n```\n\n`[language_id]` - defaults to `java` when not provided. Accepts: `java`, `clojure`, `kotlin`, `scala`\n\n**Push built NLP Java/JVM docker image to Docker Hub:**\n\n```bash\n$ ./docker-runner.sh --pushImageToHub\n\nor\n\n$ ./docker-runner.sh --pushImageToHub --dockerUserName \"your_docker_username\"\n```\n\nThe above will prompt for your Docker login name and password before it can push your image to Docker Hub (you must have an account on Docker Hub).\n\n**Docker image on Docker Hub**\n\nFind the [NLP Java/JVM Docker Image on Docker Hub](https://hub.docker.com/r/neomatrix369/nlp-java). The `docker-runner.sh --pushImageToHub` script pushes the image to Docker Hub and the `docker-runner.sh --runContainer` script runs it from the local repository. 
If the image is absent from the local repository, it is downloaded from Docker Hub.\n\n# Contributing\n\nContributions are very welcome; please share back with the wider community (and get credited for it)!\n\nPlease have a look at the [CONTRIBUTING](CONTRIBUTING.md) guidelines, and have a read about our [licensing](LICENSE.txt) policy.\n\n---\n\nGo to [NLP page](https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/natural-language-processing/README.md#natural-language-processing-nlp)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneomatrix369%2Fnlp-java-jvm-example","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneomatrix369%2Fnlp-java-jvm-example","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneomatrix369%2Fnlp-java-jvm-example/lists"}