https://github.com/cltk/cltk_docker
Docker script for cltk
- Host: GitHub
- URL: https://github.com/cltk/cltk_docker
- Owner: cltk
- License: mit
- Created: 2016-03-08T19:43:17.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2019-09-19T01:47:52.000Z (almost 7 years ago)
- Last Synced: 2025-04-10T11:48:30.802Z (about 1 year ago)
- Language: Python
- Size: 15.6 KB
- Stars: 6
- Watchers: 36
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Docker for CLTK core software
This repository contains a `Dockerfile` for building a Docker image of the [CLTK](http://cltk.org) core software.
# Build
First, clone this repository:
```bash
$ git clone https://github.com/cltk/cltk_docker.git
$ cd cltk_docker
```
Build the image:
```bash
$ docker build -t cltk .
```
# Running
To run the image:
```bash
$ docker run -it cltk
```
This should drop you into an interactive Python session in which CLTK can be imported, for example:
```python
>>> from cltk.corpus.utils.importer import CorpusImporter
>>> c = CorpusImporter('latin')
>>> c.list_corpora
['latin_text_perseus', 'latin_treebank_perseus', 'latin_text_lacus_curtius', 'latin_text_latin_library', 'phi5', 'phi7', 'latin_proper_names_cltk', 'latin_models_cltk', 'latin_pos_lemmata_cltk', 'latin_treebank_index_thomisticus', 'latin_lexica_perseus', 'latin_training_set_sentence_cltk', 'latin_word2vec_cltk', 'latin_text_antique_digiliblt', 'latin_text_corpus_grammaticorum_latinorum']
```
# Data Volumes
This `Dockerfile` uses three data volumes, which you can use to persist data across runs or map a directory from the Docker host:
* `/cltk_data`
* `/nltk_data`
* `/data`
For example, after creating a named volume with `docker volume create cltk_data`, you can run `docker run -ti -v cltk_data:/cltk_data cltk`, and any corpora you install will persist across runs that use the same volume. If corpora are already installed on your Docker host, you can instead map that directory, e.g. `docker run -ti -v $HOME/cltk_data:/cltk_data cltk`.
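The two `-v` forms above differ only in the source side of the mapping: a bare name refers to a named volume, while an absolute path refers to a directory on the Docker host. As a minimal sketch (the helper names `volume_flag` and `docker_run_command` are hypothetical and not part of this repository), the command line can be assembled like this:

```python
import os
import shlex


def volume_flag(source, mount_point="/cltk_data"):
    """Return the `-v` arguments mapping `source` onto `mount_point`.

    A source that looks like a path is expanded to an absolute host
    directory; anything else is passed through as a named volume.
    """
    if source.startswith(("/", "~", ".")):
        source = os.path.abspath(os.path.expanduser(source))
    return ["-v", f"{source}:{mount_point}"]


def docker_run_command(source, image="cltk", mount_point="/cltk_data"):
    """Build the `docker run` argument list for one mapped data volume."""
    return ["docker", "run", "-ti", *volume_flag(source, mount_point), image]


# Named volume, as created by `docker volume create cltk_data`.
print(shlex.join(docker_run_command("cltk_data")))
# Host directory, e.g. corpora already installed in the home directory.
print(shlex.join(docker_run_command("~/cltk_data")))
```

The resulting list can be passed to `subprocess.run` unchanged; the same pattern applies to `/nltk_data` and `/data` by changing `mount_point`.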
# Installing Corpora
This container also comes with a helper script, `install_corpora.py`, which can be used to install all corpora:

    docker run -ti -v cltk_data:/cltk_data cltk install_corpora.py

Or corpora for specific languages:

    docker run -ti -v cltk_data:/cltk_data cltk install_corpora.py greek latin
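
The source of `install_corpora.py` is not reproduced here. As a rough sketch of what a per-language installer could look like, assuming the `CorpusImporter` class shown under Running and its `import_corpus` method, something like the following would work. The function name `install_corpora` and the `importer_factory` parameter are hypothetical, and the actual script may behave differently.

```python
import sys


def install_corpora(languages, importer_factory=None):
    """Try to import every corpus listed for each language.

    Returns a dict mapping each language to the corpus names that
    were imported without error.
    """
    if importer_factory is None:
        # Imported lazily so the function can be exercised without CLTK.
        from cltk.corpus.utils.importer import CorpusImporter
        importer_factory = CorpusImporter

    installed = {}
    for language in languages:
        importer = importer_factory(language)
        installed[language] = []
        for name in importer.list_corpora:
            try:
                importer.import_corpus(name)
            except Exception as exc:
                # Some corpora cannot be fetched automatically
                # (for instance, ones that need a local copy).
                print(f"skipped {name}: {exc}", file=sys.stderr)
            else:
                installed[language].append(name)
    return installed
```

Inside the container this could be called as `install_corpora(['greek', 'latin'])`. Corpora that fail to import are reported on standard error and skipped rather than aborting the whole run.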
# Jupyter Notebook
The `Dockerfile.jupyter` file defines a Jupyter Notebook container with CLTK installed. Build it with `docker build -t cltk-jupyter -f Dockerfile.jupyter .` and run it with, for example, `docker run -p 8888:8888 -v cltk_data:/cltk_data cltk-jupyter`, which also maps the data volume as in the examples above. See the [Jupyter Docker Stacks Quick Start documentation](https://github.com/jupyter/docker-stacks#quick-start) for more examples.
# License
MIT. See the `LICENSE` file.