https://github.com/davetang/learning_docker
Learning about Docker.
- Host: GitHub
- URL: https://github.com/davetang/learning_docker
- Owner: davetang
- License: mit
- Created: 2016-04-22T08:01:41.000Z (about 10 years ago)
- Default Branch: main
- Last Pushed: 2025-10-10T02:44:41.000Z (9 months ago)
- Last Synced: 2025-10-12T06:09:19.891Z (9 months ago)
- Topics: docker
- Language: Dockerfile
- Homepage: http://davetang.github.io/learning_docker/
- Size: 10.6 MB
- Stars: 61
- Watchers: 3
- Forks: 18
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Table of Contents
=================
* [Learning Docker](#learning-docker)
* [Introduction](#introduction)
* [Installing the Docker Engine](#installing-the-docker-engine)
* [Checking your installation](#checking-your-installation)
* [Docker information](#docker-information)
* [Basics](#basics)
* [Start containers automatically](#start-containers-automatically)
* [Dockerfile](#dockerfile)
* [ARG](#arg)
* [CMD](#cmd)
* [COPY](#copy)
* [ENTRYPOINT](#entrypoint)
* [Building an image](#building-an-image)
* [Renaming an image](#renaming-an-image)
* [Running an image](#running-an-image)
* [Setting environment variables](#setting-environment-variables)
* [Resource usage](#resource-usage)
* [Copying files between host and container](#copying-files-between-host-and-container)
* [Sharing between host and container](#sharing-between-host-and-container)
* [File permissions](#file-permissions)
* [File Permissions 2](#file-permissions-2)
* [Read only](#read-only)
* [Removing the image](#removing-the-image)
* [Committing changes](#committing-changes)
* [Access running container](#access-running-container)
* [Cleaning up exited containers](#cleaning-up-exited-containers)
* [Installing Perl modules](#installing-perl-modules)
* [Creating a data container](#creating-a-data-container)
* [R](#r)
* [Saving and transferring a Docker image](#saving-and-transferring-a-docker-image)
* [Sharing your image](#sharing-your-image)
* [Docker Hub](#docker-hub)
* [Quay.io](#quayio)
* [GitHub Actions](#github-actions)
* [Tips](#tips)
* [Useful links](#useful-links)
Sat Sep 20 02:57:44 UTC 2025
Learning Docker
================
## Introduction

Docker is an open source project that allows one to pack, ship, and run
any application as a lightweight container. A useful analogy for Docker
containers is shipping containers, which provide a standard and
consistent way of shipping just about anything. The container includes
everything that is needed for an application to run including the code,
system tools, and the necessary dependencies. If you wanted to test an
application, all you need to do is to download the Docker image and run
it in a new container. No more compiling and installing missing
dependencies!
The [overview](https://docs.docker.com/get-started/overview/) provides
more information. For a more hands-on approach, check out [Enough Docker
to be Dangerous](https://docs.docker.com/) and [this short
workshop](https://davetang.github.io/reproducible_bioinformatics/docker.html)
that I prepared for BioC Asia 2019.
This README was generated by GitHub Actions using the R Markdown file
`readme.Rmd`, which was executed via the `create_readme.sh` script.
## Installing the Docker Engine
To get started, you will need to install the Docker Engine; check out
[this guide](https://docs.docker.com/engine/install/).
## Checking your installation
To see if everything is working, try to obtain the Docker version.
``` bash
docker --version
```
## Docker version 28.0.4, build b8034c0
And run the `hello-world` image. (The `--rm` parameter is used to
automatically remove the container when it exits.)
``` bash
docker run --rm hello-world
```
## Unable to find image 'hello-world:latest' locally
## latest: Pulling from library/hello-world
## 17eec7bbc9d7: Pulling fs layer
## 17eec7bbc9d7: Download complete
## 17eec7bbc9d7: Pull complete
## Digest: sha256:54e66cc1dd1fcb1c3c58bd8017914dbed8701e2d8c74d9262e26bd9cc1642d31
## Status: Downloaded newer image for hello-world:latest
##
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
##
## To generate this message, Docker took the following steps:
## 1. The Docker client contacted the Docker daemon.
## 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
## (amd64)
## 3. The Docker daemon created a new container from that image which runs the
## executable that produces the output you are currently reading.
## 4. The Docker daemon streamed that output to the Docker client, which sent it
## to your terminal.
##
## To try something more ambitious, you can run an Ubuntu container with:
## $ docker run -it ubuntu bash
##
## Share images, automate workflows, and more with a free Docker ID:
## https://hub.docker.com/
##
## For more examples and ideas, visit:
## https://docs.docker.com/get-started/
## Docker information
Get more version information.
``` bash
docker version
```
## Client: Docker Engine - Community
## Version: 28.0.4
## API version: 1.48
## Go version: go1.23.7
## Git commit: b8034c0
## Built: Tue Mar 25 15:07:16 2025
## OS/Arch: linux/amd64
## Context: default
##
## Server: Docker Engine - Community
## Engine:
## Version: 28.0.4
## API version: 1.48 (minimum version 1.24)
## Go version: go1.23.7
## Git commit: 6430e49
## Built: Tue Mar 25 15:07:16 2025
## OS/Arch: linux/amd64
## Experimental: false
## containerd:
## Version: 1.7.27
## GitCommit: 05044ec0a9a75232cad458027ca83437aae3f4da
## runc:
## Version: 1.2.5
## GitCommit: v1.2.5-0-g59923ef
## docker-init:
## Version: 0.19.0
## GitCommit: de40ad0
Even more information.
``` bash
docker info
```
## Client: Docker Engine - Community
## Version: 28.0.4
## Context: default
## Debug Mode: false
## Plugins:
## buildx: Docker Buildx (Docker Inc.)
## Version: v0.28.0
## Path: /usr/libexec/docker/cli-plugins/docker-buildx
## compose: Docker Compose (Docker Inc.)
## Version: v2.38.2
## Path: /usr/libexec/docker/cli-plugins/docker-compose
##
## Server:
## Containers: 0
## Running: 0
## Paused: 0
## Stopped: 0
## Images: 1
## Server Version: 28.0.4
## Storage Driver: overlay2
## Backing Filesystem: extfs
## Supports d_type: true
## Using metacopy: false
## Native Overlay Diff: false
## userxattr: false
## Logging Driver: json-file
## Cgroup Driver: systemd
## Cgroup Version: 2
## Plugins:
## Volume: local
## Network: bridge host ipvlan macvlan null overlay
## Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
## Swarm: inactive
## Runtimes: io.containerd.runc.v2 runc
## Default Runtime: runc
## Init Binary: docker-init
## containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
## runc version: v1.2.5-0-g59923ef
## init version: de40ad0
## Security Options:
## apparmor
## seccomp
## Profile: builtin
## cgroupns
## Kernel Version: 6.11.0-1018-azure
## Operating System: Ubuntu 24.04.3 LTS
## OSType: linux
## Architecture: x86_64
## CPUs: 4
## Total Memory: 15.62GiB
## Name: runnervmf4ws1
## ID: 80b0c887-f378-498c-94b8-5e260688697c
## Docker Root Dir: /var/lib/docker
## Debug Mode: false
## Username: githubactions
## Experimental: false
## Insecure Registries:
## ::1/128
## 127.0.0.0/8
## Live Restore Enabled: false
## Basics
The two guides linked in the introduction section provide some
information on the basic commands but I’ll include some here as well.
One of the main reasons I use Docker is for building tools. For this
purpose, I use Docker like a virtual machine, where I can install
whatever I want. This is important because I can do my testing in an
isolated environment and not worry about affecting the main server. I
like to use Ubuntu because it’s a popular Linux distribution and
therefore whenever I run into a problem, chances are higher that someone
else has had the same problem, asked a question on a forum, and received
a solution.
Before we can run Ubuntu using Docker, we need an image. We can obtain
an Ubuntu image from the [official Ubuntu image
repository](https://hub.docker.com/_/ubuntu/) from Docker Hub by running
`docker pull`.
``` bash
docker pull ubuntu:18.04
```
## 18.04: Pulling from library/ubuntu
## 7c457f213c76: Pulling fs layer
## 7c457f213c76: Verifying Checksum
## 7c457f213c76: Download complete
## 7c457f213c76: Pull complete
## Digest: sha256:152dc042452c496007f07ca9127571cb9c29697f42acbfad72324b2bb2e43c98
## Status: Downloaded newer image for ubuntu:18.04
## docker.io/library/ubuntu:18.04
To run Ubuntu using Docker, we use `docker run`.
``` bash
docker run --rm ubuntu:18.04 cat /etc/os-release
```
## NAME="Ubuntu"
## VERSION="18.04.6 LTS (Bionic Beaver)"
## ID=ubuntu
## ID_LIKE=debian
## PRETTY_NAME="Ubuntu 18.04.6 LTS"
## VERSION_ID="18.04"
## HOME_URL="https://www.ubuntu.com/"
## SUPPORT_URL="https://help.ubuntu.com/"
## BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
## PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
## VERSION_CODENAME=bionic
## UBUNTU_CODENAME=bionic
You can work interactively with the Ubuntu image by specifying the `-it`
option.
``` console
docker run --rm -it ubuntu:18.04 /bin/bash
```
You may have noticed that I keep using the `--rm` option, which removes
the container once you quit. If you don’t use this option, the container
is saved up until the point that you exit; all changes you made, files
you created, etc. are saved. Why am I deleting all my changes? Because
there is a better (and more reproducible) way to make changes to the
system and that is by using a Dockerfile.
## Start containers automatically
When hosting a service using Docker (such as running [RStudio
Server](https://davetang.org/muse/2021/04/24/running-rstudio-server-with-docker/)),
it would be nice if the container automatically starts up again when the
server (and Docker) restarts. If you use the `--restart` flag with
`docker run`, Docker will [restart your
container](https://docs.docker.com/config/containers/start-containers-automatically/)
when your container has exited or when Docker restarts. The value of the
`--restart` flag can be one of the following:
- `no` - do not automatically restart (default)
- `on-failure[:max-retries]` - restarts the container if it exits due to
an error (non-zero exit code); the optional `max-retries` value limits
the number of restart attempts
- `always` - always restarts the container; if it is manually stopped,
it is restarted only when the Docker daemon restarts (or when the
container is manually restarted)
- `unless-stopped` - similar to `always` but when the container is
stopped, it is not restarted even after the Docker daemon restarts.
``` console
docker run -d \
--restart always \
-p 8888:8787 \
-e PASSWORD=password \
-e USERID=$(id -u) \
-e GROUPID=$(id -g) \
rocker/rstudio:4.1.2
```
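The restart policy of a container that already exists can also be changed without recreating it. The sketch below wraps `docker update --restart` in a small helper and then prints the recorded policy with `docker inspect`. The helper name `set_restart_policy` and the container name used in the note below are hypothetical and not part of this repository.

```bash
# Sketch: change the restart policy of an existing container.
# set_restart_policy is a hypothetical helper name.
set_restart_policy() {
  local container="$1"
  local policy="${2:-unless-stopped}"
  # docker update can change the restart policy of an existing container
  docker update --restart "${policy}" "${container}"
  # Show the policy that Docker now has on record for the container
  docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "${container}"
}
```

For example, `set_restart_policy my_rstudio always` would switch a container named `my_rstudio` to the `always` policy; leaving out the second argument falls back to `unless-stopped`.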
## Dockerfile
A Dockerfile is a text file that contains instructions for building
Docker images. A Dockerfile adheres to a specific format and set of
instructions, which you can find at [Dockerfile
reference](https://docs.docker.com/engine/reference/builder/). There is
also a [Best practices
guide](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
for writing Dockerfiles.
A Docker image is made up of layers that act like snapshots. A layer, or
intermediate image, is created each time an instruction in the
Dockerfile is executed. Each layer is assigned a unique hash and is
cached by default. This means that you do not need to rebuild a layer
from scratch if it has not changed. Keep this in mind when creating a
Dockerfile.
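One practical consequence of layer caching is instruction order: steps that rarely change are best placed before steps that change often. The sketch below writes a hypothetical Dockerfile into a scratch directory to illustrate the ordering. The script `run.sh` and the scratch directory are assumptions made for illustration and are not files in this repository.

```bash
# Write an illustrative Dockerfile to a temporary directory.
demo_dir="$(mktemp -d)"
cat > "${demo_dir}/Dockerfile" <<'EOF'
FROM ubuntu:18.04

# Changes rarely: this layer is reused from the cache as long as the
# instruction and the layers before it are unchanged
RUN apt-get update && \
    apt-get install -y build-essential && \
    rm -rf /var/lib/apt/lists/*

# Changes often: editing run.sh invalidates the cache from here onwards,
# so the slow RUN step above is not repeated
COPY run.sh /usr/local/bin/run.sh

CMD ["bash", "/usr/local/bin/run.sh"]
EOF
echo "Wrote ${demo_dir}/Dockerfile"
```

If the `COPY` line came before the `RUN` line, every edit to `run.sh` would force the package installation to run again.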
Some commonly used instructions include:
- `FROM` - Specifies the parent or base image to use for building an
image and must be the first instruction in the file (only `ARG` may
precede it).
- `COPY` - Copies files from the current directory (of where the
Dockerfile is) to the image filesystem.
- `RUN` - Executes a command inside the image.
- `ADD` - Adds new files or directories from a source or URL to the
image filesystem.
- `ENTRYPOINT` - Makes the container run like an executable.
- `CMD` - The default command or parameter/s for the container and can
be used with `ENTRYPOINT`.
- `WORKDIR` - Sets the working directory for the image. Any `CMD`,
`RUN`, `COPY`, or `ENTRYPOINT` instruction after the `WORKDIR`
declaration will be executed in the context of the working directory.
- `USER` - Sets the user for the instructions that follow and for the
running container.
I have an example Dockerfile that uses the Ubuntu 18.04 image to build
[BWA](https://github.com/lh3/bwa), a popular short read alignment tool
used in bioinformatics.
``` bash
cat Dockerfile
```
## FROM ubuntu:18.04
##
## MAINTAINER Dave Tang
##
## LABEL source="https://github.com/davetang/learning_docker/blob/main/Dockerfile"
##
## RUN apt-get clean all && \
## apt-get update && \
## apt-get upgrade -y && \
## apt-get install -y \
## build-essential \
## wget \
## zlib1g-dev && \
## apt-get clean all && \
## apt-get purge && \
## rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
##
## RUN mkdir /src && \
## cd /src && \
## wget https://github.com/lh3/bwa/releases/download/v0.7.17/bwa-0.7.17.tar.bz2 && \
## tar xjf bwa-0.7.17.tar.bz2 && \
## cd bwa-0.7.17 && \
## make && \
## mv bwa /usr/local/bin && \
## cd && rm -rf /src
##
## WORKDIR /work
##
## CMD ["bwa"]
### ARG
To define variables in your Dockerfile use `ARG name=value`. For
example, you can use `ARG` to create a new variable that stores a
version number of a program. When a new version of the program is
released, you can simply change the `ARG` value and rebuild the image.

    ARG star_ver=2.7.10a
    RUN cd /usr/src && \
        wget https://github.com/alexdobin/STAR/archive/refs/tags/${star_ver}.tar.gz && \
        tar xzf ${star_ver}.tar.gz && \
        rm ${star_ver}.tar.gz && \
        cd STAR-${star_ver}/source && \
        make STAR && \
        cd /usr/local/bin && \
        ln -s /usr/src/STAR-${star_ver}/source/STAR .
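An `ARG` default can also be overridden at build time with `docker build --build-arg`, so the Dockerfile itself does not need editing. A minimal sketch, assuming the Dockerfile containing the `ARG star_ver` line is in the current directory; the helper name `build_star` and the image name `star` are hypothetical.

```bash
# Sketch: build the image for a given STAR version without editing the Dockerfile.
# build_star and the image name "star" are hypothetical.
build_star() {
  local ver="$1"
  # --build-arg replaces the default value given on the ARG line
  docker build --build-arg "star_ver=${ver}" -t "star:${ver}" .
}
```

Calling `build_star` with a version string passes it to the build and also uses it as the image tag, which keeps images for different versions side by side.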
### CMD
The [CMD](https://docs.docker.com/engine/reference/builder/#cmd)
instruction in a Dockerfile does not execute anything at build time but
specifies the intended command for the image; there can only be one CMD
instruction in a Dockerfile and if you list more than one CMD then only
the last CMD will take effect. The main purpose of a CMD is to provide
defaults for an executing container.
### COPY
The [COPY](https://docs.docker.com/engine/reference/builder/#copy)
instruction copies new files or directories from `<src>` and adds them
to the filesystem of the container at the path `<dest>`. It has two
forms:

    COPY [--chown=<user>:<group>] [--chmod=<perms>] <src>... <dest>
    COPY [--chown=<user>:<group>] [--chmod=<perms>] ["<src>",... "<dest>"]
Note the `--chown` parameter, which can be used to set the ownership of
the copied files/directories. If this is not specified, the default
ownership is `root`, which can be a problem.
For example in the RStudio Server
[Dockerfile](https://github.com/davetang/learning_docker/blob/main/rstudio/Dockerfile),
there are two `COPY` instructions that set the ownership to the
`rstudio` user.

    COPY --chown=rstudio:rstudio rstudio/rstudio-prefs.json /home/rstudio/.config/rstudio
    COPY --chown=rstudio:rstudio rstudio/.Rprofile /home/rstudio/
The two files that are copied are config files and therefore need to be
writable by `rstudio` if settings are changed in RStudio Server.
Usually the root path of `<src>` is set to the directory where the
Dockerfile exists. The example above is different because the RStudio
Server image is built by GitHub Actions, and the root path of `<src>` is
the GitHub repository.
### ENTRYPOINT
An
[ENTRYPOINT](https://docs.docker.com/engine/reference/builder/#entrypoint)
allows you to configure a container that will run as an executable.
ENTRYPOINT has two forms:
- `ENTRYPOINT ["executable", "param1", "param2"]` (exec form, preferred)
- `ENTRYPOINT command param1 param2` (shell form)
``` console
FROM ubuntu
ENTRYPOINT ["top", "-b"]
CMD ["-c"]
```
Use `--entrypoint` to override the `ENTRYPOINT` instruction.
``` console
docker run --entrypoint
```
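To make the interaction between `ENTRYPOINT` and `CMD` concrete for the `top` example above: in exec form, Docker runs the `ENTRYPOINT` followed by either the arguments given to `docker run` or, if there are none, the `CMD` defaults. The shell function below is only a toy model of that rule (it does not call Docker), and the name `effective_command` is made up for illustration.

```bash
# Toy model of how the container command is assembled (exec form):
# ENTRYPOINT + (arguments passed to docker run, or CMD if none were passed)
effective_command() {
  local entrypoint="$1"
  local cmd="$2"
  shift 2
  if [ "$#" -gt 0 ]; then
    # Arguments after the image name replace CMD
    echo "${entrypoint} $*"
  else
    # No arguments: CMD supplies the defaults
    echo "${entrypoint} ${cmd}"
  fi
}

effective_command "top -b" "-c"       # prints: top -b -c
effective_command "top -b" "-c" -H    # prints: top -b -H
```

The same holds for a real container: if the Dockerfile above were built as an image called `mytop` (a hypothetical name), `docker run --rm mytop` would run `top -b -c` and `docker run --rm mytop -H` would run `top -b -H`. To replace the entrypoint itself, the flag takes the new executable before the image name, for example `docker run --rm -it --entrypoint /bin/bash mytop`.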
## Building an image
Use the `build` subcommand to build Docker images. Docker looks for a
file named `Dockerfile` by default; use the `-f` parameter if your
Dockerfile has a different name. The period at the end tells Docker to
use the current directory as the build context.
``` bash
cat build.sh
```
## #!/usr/bin/env bash
##
## set -euo pipefail
##
## ver=0.7.17
##
## docker build -t davetang/bwa:${ver} .
You can push the built image to [Docker Hub](https://hub.docker.com/) if
you have an account. I have used my Docker Hub account name to name my
Docker image.
``` console
# use -f to specify the Dockerfile to use
# the period sets the build context to the current directory
docker build -f Dockerfile.base -t davetang/base .
# log into Docker Hub
docker login
# push to Docker Hub
docker push davetang/base
```
## Renaming an image
The `docker image tag` command will create a new tag, i.e. new image
name, that refers to an old image. It is not quite renaming but can be
considered renaming since you will have a new name for your image.
The usage is:

    Usage: docker image tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
For example I have created a new tag for my RStudio Server image, so
that I can easily push it to Quay.io.
``` console
docker image tag davetang/rstudio:4.2.2 quay.io/davetang31/rstudio:4.2.2
```
The original image `davetang/rstudio:4.2.2` still exists, which is why
tagging is not quite renaming.
## Running an image
[Docker run
documentation](https://docs.docker.com/engine/reference/run/).
``` bash
docker run --rm davetang/bwa:0.7.17
```
## Unable to find image 'davetang/bwa:0.7.17' locally
## 0.7.17: Pulling from davetang/bwa
## feac53061382: Pulling fs layer
## 549f86662946: Pulling fs layer
## 5f22362f8660: Pulling fs layer
## 3836f06c7ac7: Pulling fs layer
## 3836f06c7ac7: Waiting
## feac53061382: Verifying Checksum
## feac53061382: Download complete
## 5f22362f8660: Verifying Checksum
## 5f22362f8660: Download complete
## 3836f06c7ac7: Download complete
## feac53061382: Pull complete
## 549f86662946: Verifying Checksum
## 549f86662946: Download complete
## 549f86662946: Pull complete
## 5f22362f8660: Pull complete
## 3836f06c7ac7: Pull complete
## Digest: sha256:f0da4e206f549ed8c08f5558b111cb45677c4de6a3dc0f2f0569c648e8b27fc5
## Status: Downloaded newer image for davetang/bwa:0.7.17
##
## Program: bwa (alignment via Burrows-Wheeler transformation)
## Version: 0.7.17-r1188
## Contact: Heng Li
##
## Usage: bwa [options]
##
## Command: index index sequences in the FASTA format
## mem BWA-MEM algorithm
## fastmap identify super-maximal exact matches
## pemerge merge overlapping paired ends (EXPERIMENTAL)
## aln gapped/ungapped alignment
## samse generate alignment (single ended)
## sampe generate alignment (paired ended)
## bwasw BWA-SW for long queries
##
## shm manage indices in shared memory
## fa2pac convert FASTA to PAC format
## pac2bwt generate BWT from PAC
## pac2bwtgen alternative algorithm for generating BWT
## bwtupdate update .bwt to the new format
## bwt2sa generate SA from BWT and Occ
##
## Note: To use BWA, you need to first index the genome with `bwa index'.
## There are three alignment algorithms in BWA: `mem', `bwasw', and
## `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
## first. Please `man ./bwa.1' for the manual.
## Setting environment variables
Create a new environment variable (ENV) using `--env`.
``` bash
docker run --rm --env YEAR=1984 busybox env
```
## Unable to find image 'busybox:latest' locally
## latest: Pulling from library/busybox
## 80bfbb8a41a2: Pulling fs layer
## 80bfbb8a41a2: Download complete
## 80bfbb8a41a2: Pull complete
## Digest: sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e
## Status: Downloaded newer image for busybox:latest
## PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
## HOSTNAME=b103f016ad65
## YEAR=1984
## HOME=/root
Two ENVs.
``` bash
docker run --rm --env YEAR=1984 --env SEED=2049 busybox env
```
## PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
## HOSTNAME=b86b07e2422d
## YEAR=1984
## SEED=2049
## HOME=/root
Or `-e` for less typing.
``` bash
docker run --rm -e YEAR=1984 -e SEED=2049 busybox env
```
## PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
## HOSTNAME=5aa9f04852ef
## YEAR=1984
## SEED=2049
## HOME=/root
## Resource usage
To
[restrict](https://docs.docker.com/config/containers/resource_constraints/)
CPU usage use `--cpus=n` and use `--memory=` to restrict the maximum
amount of memory the container can use.
We can confirm the limited CPU usage by running an endless while loop
and using `docker stats` to confirm the CPU usage. *Remember to use
`docker stop` to stop the container after confirming the usage!*
Restrict to 1 CPU.
``` console
# run in detached mode
docker run --rm -d --cpus=1 davetang/bwa:0.7.17 perl -le 'while(1){ }'
# check stats and use control+c to exit
docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8cc20bcfa4f4 vigorous_khorana 100.59% 572KiB / 1.941GiB 0.03% 736B / 0B 0B / 0B 1
docker stop 8cc20bcfa4f4
```
Restrict to 1/2 CPU.
``` console
# run in detached mode
docker run --rm -d --cpus=0.5 davetang/bwa:0.7.17 perl -le 'while(1){ }'
# check stats and use control+c to exit
docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
af6e812a94da unruffled_liskov 50.49% 584KiB / 1.941GiB 0.03% 736B / 0B 0B / 0B 1
docker stop af6e812a94da
```
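The CPU and memory limits can be combined in a single `docker run` call. A minimal sketch, where `run_limited` is a hypothetical wrapper name and `512m` in the note below is an arbitrary example value; `--memory` accepts a number with a unit suffix such as `m` or `g`.

```bash
# Sketch: run a container with both a CPU and a memory limit.
# run_limited is a hypothetical helper name.
run_limited() {
  local cpus="$1"
  local memory="$2"
  shift 2
  # Remaining arguments are the image and the command to run in it
  docker run --rm --cpus="${cpus}" --memory="${memory}" "$@"
}
```

For example, `run_limited 0.5 512m davetang/bwa:0.7.17 bwa` runs BWA with half a CPU and 512 MiB of memory; for a longer-running container, `docker stats` should report the memory limit in the `MEM USAGE / LIMIT` column.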
## Copying files between host and container
Use `docker cp` but I recommend mounting a volume to a Docker container
(see next section).
``` console
docker cp --help
Usage: docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH
Copy files/folders between a container and the local filesystem
Options:
-L, --follow-link Always follow symbol link in SRC_PATH
--help Print usage
# find container name
docker ps -a
# create file to transfer
echo hi > hi.txt
docker cp hi.txt fee424ef6bf0:/root/
# start container
docker start -ai fee424ef6bf0
# inside container
cat /root/hi.txt
hi
# create file inside container
echo bye > /root/bye.txt
exit
# transfer file from container to host
docker cp fee424ef6bf0:/root/bye.txt .
cat bye.txt
bye
```
## Sharing between host and container
Use the `-v` flag to mount directories to a container so that you can
share files between the host and container.
In the example below, I am mounting `data` from the current directory
(using the Unix command `pwd`) to `/work` in the container. I am working
from the root directory of this GitHub repository, which contains the
`data` directory.
``` bash
ls data
```
## README.md
## chrI.fa.gz
Any output written to `/work` inside the container will be accessible
inside `data` on the host. The command below will create BWA index files
for `data/chrI.fa.gz`.
``` bash
docker run --rm -v $(pwd)/data:/work davetang/bwa:0.7.17 bwa index chrI.fa.gz
```
## [bwa_index] Pack FASTA... 0.14 sec
## [bwa_index] Construct BWT for the packed sequence...
## [bwa_index] 3.25 seconds elapse.
## [bwa_index] Update BWT... 0.06 sec
## [bwa_index] Pack forward-only FASTA... 0.11 sec
## [bwa_index] Construct SA from BWT and Occ... 0.95 sec
## [main] Version: 0.7.17-r1188
## [main] CMD: bwa index chrI.fa.gz
## [main] Real time: 4.530 sec; CPU: 4.538 sec
We can see the newly created index files.
``` bash
ls -lrt data
```
## total 30436
## -rw-r--r-- 1 runner runner 194 Sep 20 02:51 README.md
## -rw-r--r-- 1 runner runner 4772981 Sep 20 02:51 chrI.fa.gz
## -rw-r--r-- 1 root root 15072516 Sep 20 02:57 chrI.fa.gz.bwt
## -rw-r--r-- 1 root root 3768110 Sep 20 02:57 chrI.fa.gz.pac
## -rw-r--r-- 1 root root 41 Sep 20 02:57 chrI.fa.gz.ann
## -rw-r--r-- 1 root root 13 Sep 20 02:57 chrI.fa.gz.amb
## -rw-r--r-- 1 root root 7536272 Sep 20 02:57 chrI.fa.gz.sa
However note that the generated files are owned by `root`, which is
slightly annoying because unless we have root access, we need to start a
Docker container with the volume re-mounted to alter/delete the files.
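One way to do this, sketched below, is to re-mount the directory and run `chown` inside a container, where the default user is `root`. The helper name `reclaim_ownership` is hypothetical, `busybox` is used only because it is a small image that ships `chown`, and the directory should be given as an absolute path so that Docker treats it as a bind mount rather than a named volume.

```bash
# Sketch: hand root-owned files in a mounted directory back to the current user.
# reclaim_ownership is a hypothetical helper name.
reclaim_ownership() {
  local dir="$1"
  # id -u and id -g are evaluated on the host, so the numeric IDs are the host user's
  docker run --rm -v "${dir}:/work" busybox chown -R "$(id -u):$(id -g)" /work
}
```

For the example above this would be called as `reclaim_ownership "$(pwd)/data"`.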
### File permissions
As seen above, files generated inside the container on a mounted volume
are owned by `root`. This is because the default user inside a Docker
container is `root`. In Linux, there is typically a `root` user with the
UID and GID of 0; this user exists in the host Linux environment (where
the Docker engine is running) as well as inside the Docker container.
In the example below, the mounted volume is owned by UID 1211 and GID
1211 (in the host environment). This UID and GID do not exist in the
Docker container, so the numeric IDs are shown instead of a name like
`root`. This is important to understand because to circumvent this file
permission issue, we need to create a user that matches the UID and GID
in the host environment.
``` console
ls -lrt
# total 2816
# -rw-r--r-- 1 1211 1211 1000015 Apr 27 02:00 ref.fa
# -rw-r--r-- 1 1211 1211 21478 Apr 27 02:00 l100_n100_d400_31_2.fq
# -rw-r--r-- 1 1211 1211 21478 Apr 27 02:00 l100_n100_d400_31_1.fq
# -rw-r--r-- 1 1211 1211 119 Apr 27 02:01 run.sh
# -rw-r--r-- 1 root root 1000072 Apr 27 02:03 ref.fa.bwt
# -rw-r--r-- 1 root root 250002 Apr 27 02:03 ref.fa.pac
# -rw-r--r-- 1 root root 40 Apr 27 02:03 ref.fa.ann
# -rw-r--r-- 1 root root 12 Apr 27 02:03 ref.fa.amb
# -rw-r--r-- 1 root root 500056 Apr 27 02:03 ref.fa.sa
# -rw-r--r-- 1 root root 56824 Apr 27 02:04 aln.sam
```
As mentioned already, having `root` ownership is problematic because
when we are back in the host environment, we can’t modify these files.
To circumvent this, we can create a user that matches the host user by
passing three environment variables from the host to the container.
``` console
docker run -it \
-v ~/my_data:/data \
-e MYUID=$(id -u) \
-e MYGID=$(id -g) \
-e ME=$(whoami) \
bwa /bin/bash
```
We use the environment variables and the following steps to create an
identical user inside the container.
``` console
adduser --quiet --home /home/san/$ME --no-create-home --gecos "" --shell /bin/bash --disabled-password $ME
# optional: give yourself admin privileges
echo "%$ME ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
# update the IDs to those passed into Docker via environment variable
sed -i -e "s/1000:1000/$MYUID:$MYGID/g" /etc/passwd
sed -i -e "s/$ME:x:1000/$ME:x:$MYGID/" /etc/group
# su - as the user
exec su - $ME
# run BWA again, after you have deleted the old files as root
bwa index ref.fa
bwa mem ref.fa l100_n100_d400_31_1.fq l100_n100_d400_31_2.fq > aln.sam
# check output
ls -lrt
# total 2816
# -rw-r--r-- 1 dtang dtang 1000015 Apr 27 02:00 ref.fa
# -rw-r--r-- 1 dtang dtang 21478 Apr 27 02:00 l100_n100_d400_31_2.fq
# -rw-r--r-- 1 dtang dtang 21478 Apr 27 02:00 l100_n100_d400_31_1.fq
# -rw-r--r-- 1 dtang dtang 119 Apr 27 02:01 run.sh
# -rw-rw-r-- 1 dtang dtang 1000072 Apr 27 02:12 ref.fa.bwt
# -rw-rw-r-- 1 dtang dtang 250002 Apr 27 02:12 ref.fa.pac
# -rw-rw-r-- 1 dtang dtang 40 Apr 27 02:12 ref.fa.ann
# -rw-rw-r-- 1 dtang dtang 12 Apr 27 02:12 ref.fa.amb
# -rw-rw-r-- 1 dtang dtang 500056 Apr 27 02:12 ref.fa.sa
# -rw-rw-r-- 1 dtang dtang 56824 Apr 27 02:12 aln.sam
# exit container
exit
```
This time when you check the file permissions in the host environment,
they should match your username.
``` console
ls -lrt ~/my_data
# total 2816
# -rw-r--r-- 1 dtang dtang 1000015 Apr 27 10:00 ref.fa
# -rw-r--r-- 1 dtang dtang 21478 Apr 27 10:00 l100_n100_d400_31_2.fq
# -rw-r--r-- 1 dtang dtang 21478 Apr 27 10:00 l100_n100_d400_31_1.fq
# -rw-r--r-- 1 dtang dtang 119 Apr 27 10:01 run.sh
# -rw-rw-r-- 1 dtang dtang 1000072 Apr 27 10:12 ref.fa.bwt
# -rw-rw-r-- 1 dtang dtang 250002 Apr 27 10:12 ref.fa.pac
# -rw-rw-r-- 1 dtang dtang 40 Apr 27 10:12 ref.fa.ann
# -rw-rw-r-- 1 dtang dtang 12 Apr 27 10:12 ref.fa.amb
# -rw-rw-r-- 1 dtang dtang 500056 Apr 27 10:12 ref.fa.sa
# -rw-rw-r-- 1 dtang dtang 56824 Apr 27 10:12 aln.sam
```
### File Permissions 2
There is a `-u` or `--user` parameter that can be used with `docker run`
to run a container using a specific user. This is easier than creating a
new user.
In this example we run the `touch` command as `root`.
``` bash
docker run -v $(pwd):/$(pwd) ubuntu:22.10 touch $(pwd)/test_root.txt
ls -lrt $(pwd)/test_root.txt
```
## Unable to find image 'ubuntu:22.10' locally
## 22.10: Pulling from library/ubuntu
## 3ad6ea492c35: Pulling fs layer
## 3ad6ea492c35: Verifying Checksum
## 3ad6ea492c35: Download complete
## 3ad6ea492c35: Pull complete
## Digest: sha256:e322f4808315c387868a9135beeb11435b5b83130a8599fd7d0014452c34f489
## Status: Downloaded newer image for ubuntu:22.10
## -rw-r--r-- 1 root root 0 Sep 20 02:57 /home/runner/work/learning_docker/learning_docker/test_root.txt
In this example, we run the command as a user with the same UID and GID
as the owner of `$HOME` on the host; the `stat` command is used to get
the UID and GID.
``` bash
docker run -v $(pwd):/$(pwd) -u $(stat -c "%u:%g" $HOME) ubuntu:22.10 touch $(pwd)/test_mine.txt
ls -lrt $(pwd)/test_mine.txt
```
## -rw-r--r-- 1 runner runner 0 Sep 20 02:57 /home/runner/work/learning_docker/learning_docker/test_mine.txt
One issue with this method is that you may encounter the following
warning (if running interactively):

    groups: cannot find name for group ID 1000
    I have no name!@ed9e8b6b7622:/$
This is because the user in your host environment does not exist in the
container environment. As far as I am aware, this is not a problem; we
just want to create files/directories with matching user and group IDs.
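An equivalent way to obtain the IDs is `id -u` and `id -g`, which report the current user rather than the owner of `$HOME`. A minimal sketch; `run_as_me` is a hypothetical helper name, and the mount point `/work` is an assumption chosen for illustration.

```bash
# Sketch: run a command in a container as the current host user,
# with the current directory mounted at /work.
# run_as_me is a hypothetical helper name.
run_as_me() {
  # -u sets the UID:GID of the container process; -w sets the working directory
  docker run --rm -u "$(id -u):$(id -g)" -v "$(pwd):/work" -w /work "$@"
}
```

For example, `run_as_me ubuntu:22.10 touch test_mine.txt` should create a file in the current directory that is owned by the host user.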
### Read only
To mount a volume but with read-only permissions, append `:ro` at the
end.
``` bash
docker run --rm -v $(pwd):/work:ro davetang/bwa:0.7.17 touch test.txt
```
## touch: cannot touch 'test.txt': Read-only file system
## Removing the image
Use `docker rmi` to remove an image. You will need to remove any
containers created from the image, including stopped ones, before you
can remove it. Use `docker ps -a` to find stopped containers and
`docker rm` to remove these containers.
Let’s pull the `busybox` image.
``` bash
docker pull busybox
```
## Using default tag: latest
## latest: Pulling from library/busybox
## Digest: sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e
## Status: Image is up to date for busybox:latest
## docker.io/library/busybox:latest
Check out `busybox`.
``` bash
docker images busybox
```
## REPOSITORY TAG IMAGE ID CREATED SIZE
## busybox latest 0ed463b26dae 11 months ago 4.43MB
Remove `busybox`.
``` bash
docker rmi busybox
```
## Untagged: busybox:latest
## Untagged: busybox@sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e
## Deleted: sha256:0ed463b26daee791b094dc3fff25edb3e79f153d37d274e5c2936923c38dac2b
## Deleted: sha256:80e840de630d08a6a1e0ee30e7c8378cf1ed6a424315d7e437f54780aee6bf5a
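If `docker rmi` fails because containers still reference the image, those containers can be found with the `ancestor` filter of `docker ps`. A sketch under that assumption; `remove_image_and_containers` is a hypothetical helper name, and because `docker rm` is called without `-f`, containers that are still running are not removed.

```bash
# Sketch: remove the containers created from an image, then the image itself.
# remove_image_and_containers is a hypothetical helper name.
remove_image_and_containers() {
  local image="$1"
  local ids
  # -a includes stopped containers, -q prints only the container IDs
  ids="$(docker ps -a -q --filter "ancestor=${image}")"
  if [ -n "${ids}" ]; then
    # Left unquoted on purpose so that each ID becomes a separate argument
    docker rm ${ids}
  fi
  docker rmi "${image}"
}
```

For example, `remove_image_and_containers busybox` would first remove any containers created from `busybox` and then remove the image.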
## Committing changes
Generally, it is better to use a Dockerfile to manage your images in a
documented and maintainable way but if you still want to [commit
changes](https://docs.docker.com/engine/reference/commandline/commit/)
to your container (like you would for Git), read on.
When you log out of a container, the changes made are still stored; type
`docker ps -a` to see all containers and the latest changes. Use
`docker commit` to commit your changes.
``` console
docker ps -a
# git style commit
# -a, --author= Author (e.g., "John Hannibal Smith <hannibal@a-team.com>")
# -m, --message= Commit message
docker commit -m 'Made change to blah' -a 'Dave Tang' <CONTAINER ID> <image>
# use docker history <image> to check history
docker history <image>
```
## Access running container
To access a container that is already running, perhaps in the background
(using detached mode: `docker run` with `-d`) use `docker ps` to find
the name of the container and then use `docker exec`.
In the example below, my container name is `rstudio_dtang`.
``` console
docker exec -it rstudio_dtang /bin/bash
```
## Cleaning up exited containers
I typically use the `--rm` flag with `docker run` so that containers are
automatically removed after I exit them. However, if you don’t use
`--rm`, by default a container’s file system persists even after the
container exits. For example:
``` bash
docker run hello-world
```
##
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
##
## To generate this message, Docker took the following steps:
## 1. The Docker client contacted the Docker daemon.
## 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
## (amd64)
## 3. The Docker daemon created a new container from that image which runs the
## executable that produces the output you are currently reading.
## 4. The Docker daemon streamed that output to the Docker client, which sent it
## to your terminal.
##
## To try something more ambitious, you can run an Ubuntu container with:
## $ docker run -it ubuntu bash
##
## Share images, automate workflows, and more with a free Docker ID:
## https://hub.docker.com/
##
## For more examples and ideas, visit:
## https://docs.docker.com/get-started/
Show all containers.
``` bash
docker ps -a
```
## CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
## ad67e22d1167 hello-world "/hello" Less than a second ago Exited (0) Less than a second ago vigilant_panini
## 65f0704bff8e ubuntu:22.10 "touch /home/runner/…" 2 seconds ago Exited (0) 1 second ago naughty_bhaskara
## 25d371e800b0 ubuntu:22.10 "touch /home/runner/…" 2 seconds ago Exited (0) 1 second ago gracious_boyd
We can use a sub-shell to get all (`-a`) container IDs (`-q`) that have
exited (`-f status=exited`) and then remove them (`docker rm -v`).
``` bash
docker rm -v $(docker ps -a -q -f status=exited)
```
## ad67e22d1167
## 65f0704bff8e
## 25d371e800b0
Check to see if the container still exists.
``` bash
docker ps -a
```
## CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
We can set this up as a Bash script so that we can easily remove exited
containers. In the script, `-z` returns true if `$exited` is empty,
i.e. there are no exited containers, so with `! -z` the command only
runs when `$exited` is not empty.
``` bash
cat clean_up_docker.sh
```
## #!/usr/bin/env bash
##
## set -euo pipefail
##
## exited=`docker ps -a -q -f status=exited`
##
## if [[ ! -z ${exited} ]]; then
## docker rm -v $(docker ps -a -q -f status=exited)
## fi
##
## exit 0
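A possible variation, sketched below, reuses the IDs that were already captured instead of running `docker ps` a second time. The function name `remove_exited_containers` is made up for this example.

```bash
# Hypothetical helper: remove exited containers using the captured IDs.
remove_exited_containers() {
  exited=$(docker ps -a -q -f status=exited)
  if [ -n "${exited}" ]; then
    # Unquoted on purpose: each ID becomes a separate argument.
    docker rm -v ${exited}
  else
    echo "No exited containers to remove."
  fi
}
```

`docker container prune` is a built-in alternative; it asks for confirmation and then removes all stopped containers.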
As I have mentioned, you can use the
[`--rm`](https://docs.docker.com/engine/reference/run/#clean-up---rm)
parameter to automatically clean up the container and remove the file
system when the container exits.
``` bash
docker run --rm hello-world
```
##
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
##
## To generate this message, Docker took the following steps:
## 1. The Docker client contacted the Docker daemon.
## 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
## (amd64)
## 3. The Docker daemon created a new container from that image which runs the
## executable that produces the output you are currently reading.
## 4. The Docker daemon streamed that output to the Docker client, which sent it
## to your terminal.
##
## To try something more ambitious, you can run an Ubuntu container with:
## $ docker run -it ubuntu bash
##
## Share images, automate workflows, and more with a free Docker ID:
## https://hub.docker.com/
##
## For more examples and ideas, visit:
## https://docs.docker.com/get-started/
No containers.
``` bash
docker ps -a
```
## CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
## Installing Perl modules
Use `cpanminus`.
``` console
apt-get install -y cpanminus
# install some Perl modules
cpanm Archive::Extract Archive::Zip DBD::mysql
```
## Creating a data container
This [guide on working with Docker data
volumes](https://www.digitalocean.com/community/tutorials/how-to-work-with-docker-data-volumes-on-ubuntu-14-04)
provides a really nice introduction. Use `docker create` to create a
data container; `-v /tmp` creates a volume mounted at `/tmp` inside the
data container; `--name data_container` sets the name of the data
container; and `ubuntu` is the image to be used for the container.
``` console
docker create -v /tmp --name data_container ubuntu
```
If we run a new Ubuntu container with the `--volumes-from` flag, output
written to the `/tmp` directory will be saved to the `/tmp` directory of
the `data_container` container.
``` console
docker run -it --volumes-from data_container ubuntu /bin/bash
```
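To see the sharing in action without an interactive shell, the sketch below writes a file from one throwaway container and reads it back from another. The function name `data_container_demo` and the file `/tmp/hello.txt` are made up for illustration, and the last step deletes the data container, so only try this with a container you do not need.

```bash
# Hypothetical walk-through of sharing /tmp through a data container.
data_container_demo() {
  # Create (but do not start) a container that owns a volume at /tmp.
  docker create -v /tmp --name data_container ubuntu || return 1
  # Write a file into the shared /tmp from a throwaway container.
  docker run --rm --volumes-from data_container ubuntu sh -c 'echo hello > /tmp/hello.txt' || return 1
  # A second throwaway container reads the same file.
  docker run --rm --volumes-from data_container ubuntu cat /tmp/hello.txt || return 1
  # Remove the data container together with its anonymous volume.
  docker rm -v data_container
}
```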
## R
Use images from [The Rocker Project](https://www.rocker-project.org/),
for example `rocker/r-ver:4.3.0`.
``` bash
docker run --rm rocker/r-ver:4.3.0
```
## Unable to find image 'rocker/r-ver:4.3.0' locally
## 4.3.0: Pulling from rocker/r-ver
## 3c645031de29: Pulling fs layer
## eb5ba85ece65: Pulling fs layer
## 336082e130a7: Pulling fs layer
## d6f516f66899: Pulling fs layer
## e7191ae70de7: Pulling fs layer
## d6f516f66899: Waiting
## e7191ae70de7: Waiting
## eb5ba85ece65: Download complete
## d6f516f66899: Verifying Checksum
## d6f516f66899: Download complete
## 3c645031de29: Verifying Checksum
## 3c645031de29: Download complete
## 3c645031de29: Pull complete
## eb5ba85ece65: Pull complete
## e7191ae70de7: Download complete
## 336082e130a7: Verifying Checksum
## 336082e130a7: Download complete
## 336082e130a7: Pull complete
## d6f516f66899: Pull complete
## e7191ae70de7: Pull complete
## Digest: sha256:48fb09f63e1cbcc1b0ce3974a8f206bff0804b6921bb36dfa08eafa264dad542
## Status: Downloaded newer image for rocker/r-ver:4.3.0
##
## R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
## Copyright (C) 2023 The R Foundation for Statistical Computing
## Platform: x86_64-pc-linux-gnu (64-bit)
##
## R is free software and comes with ABSOLUTELY NO WARRANTY.
## You are welcome to redistribute it under certain conditions.
## Type 'license()' or 'licence()' for distribution details.
##
## Natural language support but running in an English locale
##
## R is a collaborative project with many contributors.
## Type 'contributors()' for more information and
## 'citation()' on how to cite R or R packages in publications.
##
## Type 'demo()' for some demos, 'help()' for on-line help, or
## 'help.start()' for an HTML browser interface to help.
## Type 'q()' to quit R.
##
## >
## Saving and transferring a Docker image
You should just share the Dockerfile used to create your image but if
you need another way to save and share an image, see [this
post](http://stackoverflow.com/questions/23935141/how-to-copy-docker-images-from-one-host-to-another-without-via-repository)
on Stack Overflow.
``` console
docker save -o <path for generated tar file> <image name>
docker load -i <path to image tar file>
```
Here’s an example.
``` console
# save on Unix server
docker save -o davebox.tar davebox
# copy file to MacBook Pro
scp davetang@192.168.0.31:/home/davetang/davebox.tar .
docker load -i davebox.tar
93c22f563196: Loading layer [==================================================>] 134.6 MB/134.6 MB
...
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
davebox latest d38f27446445 10 days ago 3.46 GB
docker run davebox samtools
Program: samtools (Tools for alignments in the SAM format)
Version: 1.3 (using htslib 1.3)
Usage: samtools [options]
...
```
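If you would rather not create an intermediate tar file, the image can also be streamed over SSH, since `docker save` writes to standard output and `docker load` reads from standard input when no file is given. The helper below is a sketch; `transfer_image` is a made-up name, and it assumes `gzip` is available on both machines and Docker is installed on the destination.

```bash
# Hypothetical helper: stream an image to another host without a tar file on disk.
transfer_image() {
  # $1: image name, $2: SSH destination such as user@host
  docker save "$1" | gzip | ssh "$2" 'gunzip | docker load'
}
```

For example, `transfer_image davebox user@host` would run from the machine that already has the image.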
## Sharing your image
### Docker Hub
Create an account on [Docker Hub](https://hub.docker.com/); my account
is `davetang`. Use `docker login` to log in and `docker push` to push
to Docker Hub (run `docker tag` first if you didn’t name your image in
the format of `yourhubusername/newrepo`).
``` console
docker login
# create repo on Docker Hub then tag your image
docker tag bb38976d03cf yourhubusername/newrepo
# push
docker push yourhubusername/newrepo
```
### Quay.io
Create an account on [Quay.io](https://quay.io/); you can use Quay.io
for free as stated in their [plans](https://quay.io/plans/):
> Can I use Quay for free? Yes! We offer unlimited storage and serving
> of public repositories. We strongly believe in the open source
> community and will do what we can to help!
Use `docker login` to [log in](https://docs.quay.io/guides/login.html)
with the credentials you set up when you created an account on
Quay.io.
``` console
docker login quay.io
```
Quay.io images are prefixed with `quay.io`, so I used `docker image tag`
to create a new tag of my RStudio Server image. (Unfortunately, the
username `davetang` was taken on Red Hat \[possibly by me a long time
ago\], so I have to use `davetang31` on Quay.io.)
``` console
docker image tag davetang/rstudio:4.2.2 quay.io/davetang31/rstudio:4.2.2
```
Push to Quay.io.
``` console
docker push quay.io/davetang31/rstudio:4.2.2
```
### GitHub Actions
[login-action](https://github.com/docker/login-action) is used to
log in automatically to [Docker
Hub](https://github.com/docker/login-action#docker-hub) when using
GitHub Actions. This allows images to be automatically built and pushed
to Docker Hub. There is also support for
[Quay.io](https://github.com/docker/login-action#quayio).
## Tips
Tip from
:
each `RUN`, `COPY`, and `ADD` instruction in a Dockerfile adds another
layer to the image, thus increasing its size; chain commands into a
single `RUN` instruction and clean up package manager caches in that
same instruction to minimise image size:
``` console
RUN apt-get update \
&& apt-get install -y \
autoconf \
automake \
gcc \
g++ \
python \
python-dev \
&& apt-get clean all \
&& rm -rf /var/lib/apt/lists/*
```
I have found it handy to mount my current directory to the same path
inside a Docker container and to [set it as the working
directory](https://docs.docker.com/engine/reference/commandline/run/#set-working-directory--w);
the directory will be automatically created inside the container if it
does not already exist. When the container starts up, I will
conveniently be in my current directory. In the command below I have
also added the `-u` option, which sets the user to
`<name|uid>[:<group|gid>]`.
``` console
docker run --rm -it -u $(stat -c "%u:%g" ${HOME}) -v $(pwd):$(pwd) -w $(pwd) davetang/build:1.1 /bin/bash
```
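Since this is a lot to type, it could be wrapped in a small shell function. The sketch below is one way to do it; `run_here` is a made-up name, and it uses `id -u` and `id -g` (the current user's IDs) instead of `stat -c` on `${HOME}`, which usually gives the same result and also works where GNU `stat` is not available.

```bash
# Hypothetical wrapper: run an image with the current directory mounted and set as the working directory.
run_here() {
  # $1: image; remaining arguments: command to run inside the container
  image=$1
  shift
  docker run --rm -it \
    -u "$(id -u):$(id -g)" \
    -v "$(pwd):$(pwd)" \
    -w "$(pwd)" \
    "${image}" "$@"
}
```

For example, `run_here davetang/build:1.1 /bin/bash`.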
If you do not want to preface `docker` with `sudo`, create a Unix group
called `docker` and add users to it. On some Linux distributions, the
system automatically creates this group when installing Docker Engine
using a package manager. In that case, there is no need for you to
manually create the group. Check `/etc/group` to see if the `docker`
group exists.
``` console
cat /etc/group | grep docker
```
If the `docker` group does not exist, create the group:
``` console
sudo groupadd docker
```
Add users to the group.
``` console
sudo usermod -aG docker $USER
```
The user will need to log out and log back in before the change takes
effect.
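To confirm that a user was added, something like the following sketch can be used; `in_group` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if the user belongs to the group.
in_group() {
  # $1: user name, $2: group name
  id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}

if in_group "$(id -un)" docker; then
  echo "Current user is in the docker group."
else
  echo "Current user is not in the docker group."
fi
```

Passing a user name to `id` looks the user up in the group database, so the new membership should show up straight away; running `id -nG` with no user name reports the groups of the current session, which only change after logging in again.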
On Linux, Docker stores its data (images, containers, volumes) in
`/var/lib/docker` by default.
``` console
docker info -f '{{ .DockerRootDir }}'
# /var/lib/docker
```
This may not be ideal depending on your partitioning. To change the
default root directory, update the daemon configuration file; the
default location on Linux is `/etc/docker/daemon.json`. If this file
does not exist, create it.
The example below makes `/home/docker` the Docker root directory; you
can use any directory you want but just make sure it exists.
``` console
cat /etc/docker/daemon.json
```
    {
      "data-root": "/home/docker"
    }
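If the file does not exist yet, a helper along these lines could create it. This is a sketch only: `write_data_root_config` is a made-up name, it refuses to touch an existing file so that other settings are not overwritten, and writing to `/etc/docker` requires root.

```bash
# Hypothetical helper: create a daemon.json that only sets "data-root".
write_data_root_config() {
  # $1: new Docker root directory
  # $2: configuration file (defaults to /etc/docker/daemon.json)
  conf=${2:-/etc/docker/daemon.json}
  if [ -e "${conf}" ]; then
    echo "${conf} already exists; add \"data-root\" to it by hand." >&2
    return 1
  fi
  printf '{\n  "data-root": "%s"\n}\n' "$1" > "${conf}"
}
```

For example, `write_data_root_config /home/docker` run from a root shell.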
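Restart the Docker daemon and then check the Docker root directory.
Note that existing images and containers under the old root directory
are not moved to the new location automatically; copy them across while
the daemon is stopped if you want to keep them.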
``` console
sudo systemctl restart docker
docker info -f '{{ .DockerRootDir}}'
```
    /home/docker
Check out the new home!
``` console
sudo ls -1 /home/docker
```
    buildkit
    containers
    engine-id
    image
    network
    overlay2
    plugins
    runtimes
    swarm
    tmp
    volumes
Use `--progress=plain` to show the full output of each build step, which
is useful for debugging!
``` console
docker build --progress=plain -t davetang/scanpy:3.11 .
```
For Apple laptops using the M\[123\] chips, use
`--platform linux/amd64` if that’s the architecture of the image.

    docker run --rm --platform linux/amd64 -p 8787:8787 rocker/verse:4.4.1
## Useful links
- [Post installation
steps](https://docs.docker.com/engine/install/linux-postinstall/)
- [A quick introduction to
Docker](http://blog.scottlowe.org/2014/03/11/a-quick-introduction-to-docker/)
- [The BioDocker project](https://github.com/BioDocker/biodocker); check
out their [Wiki](https://github.com/BioDocker/biodocker/wiki), which
has a lot of useful information
- [The impact of Docker containers on the performance of genomic
pipelines](http://www.ncbi.nlm.nih.gov/pubmed/26421241)
- [Learn enough Docker to be
useful](https://towardsdatascience.com/learn-enough-docker-to-be-useful-b0b44222eef5)
- [10 things to avoid in Docker
containers](http://developers.redhat.com/blog/2016/02/24/10-things-to-avoid-in-docker-containers/)
- The [Play with Docker
classroom](https://training.play-with-docker.com/) brings you labs and
tutorials that help you get hands-on experience using Docker
- [Shifter](https://github.com/NERSC/shifter) enables container images
for HPC
- Run the Docker daemon as a non-root user ([Rootless
mode](https://docs.docker.com/engine/security/rootless/))