Containerization via Docker

(1Q20)


This article covers the configuration and use cases of the official Docker images for eXist-db.

Running exist inside a container

Containers have become a popular means of distributing software without the need to worry about hardware and software requirements, other than the ability to run containers. Containers offer powerful features for continuous deployment of production systems, and convenient ways to test software without interference from external dependencies. How it looks on my computer is how it looks on yours.

The official images

We offer minimal images of eXist-db which are automatically updated as part of the build-test life-cycle. The images are based on Google Cloud Platform's Distroless Docker Images. You can find the source code here.

In addition to fully tagged versions, we offer two rolling release channels:

  1. release: for the latest stable releases based on the master branch.

  2. latest: for the latest commit to the develop branch (updated within minutes of each commit).

For technical details about building your own images and build-time arguments, please consult the README.md of each release. In cases where the information in this document contradicts that in the source-code repository, the latter is authoritative. To inform us of conflicts, please open an issue via the button on the right.

How to get started

First you need to download an image:

docker pull existdb/existdb:latest

Then you can start the container using that image:

docker run -it -d -p 8080:8080 -p 8443:8443 --name exist existdb/existdb:latest

What does this do?

You have just downloaded and started eXist-db without launching an installer or having to provide Java. More specifically:

  • -it: allocates a TTY and keeps STDIN open. This allows you to interact with the running Docker container via your console.

  • -d: detaches the container from the terminal that started it. So your container won't stop when you close the terminal.

  • -p: maps the container's internal ports to ports on the host (we recommend sticking with matching pairs). This allows you to connect to the eXist-db web server running in the Docker container.

  • --name: lets you provide a name for the container (instead of using a randomly generated one).

The only required parts are docker run existdb/existdb.

For a full list of available options see the official Docker documentation. You can now access your running instance by going to localhost:8080 in your browser. To stop the container:

docker stop exist
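
A stopped container keeps its data and can be restarted later; remove it only when you no longer need it:

```shell
# Restart the previously stopped container, keeping its data
docker start exist

# Remove the container entirely (the image itself is kept)
docker rm exist
```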

Interacting with the running container

You can interact with a running container as if it were a regular Linux host.

Important:

GCR base images do not contain a shell by design. You can issue shell-like commands via the Java Admin Client, as we do throughout this article, but you can't open a shell in interactive mode.
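
For example, trying to open an interactive shell will fail, because the distroless image ships no shell binary:

```shell
# This fails on distroless images: there is no sh to execute
docker exec -it exist sh
```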

We'll continue to use exist as the name of our container:

# Using the Java Admin Client on a running eXist-db instance
docker exec exist java org.exist.start.Main client --no-gui --xpath "system:get-version()"

# Interacting with the JVM
docker exec exist java -version

Containers built from this image run a periodic health-check to make sure that eXist-db is operating normally. If docker ps reports unhealthy, you can get a more detailed report with this command:

docker inspect --format='{{json .State.Health}}' exist

To check eXist's logs: docker logs exist
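
To follow the log output continuously, for instance while debugging a deployment, you can combine the standard docker logs flags:

```shell
# Stream new log lines as they arrive, starting from the last 100
docker logs --follow --tail 100 exist
```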

Use as Base Image

A common usage of these images is as a base image for your own applications. We'll take a quick look at three scenarios of increasing complexity to demonstrate how to achieve common tasks from inside a Dockerfile.

A simple base image

The simplest case assumes that you have a .xar app inside a build folder on the same level as your own Dockerfile. To get an image of an eXist-db instance with your app installed and running, you would then:

FROM existdb/existdb:5.0.0

COPY build/*.xar /exist/autodeploy
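
Assuming the two-line Dockerfile above sits next to your build folder, build it as usual (the tag my-app is illustrative):

```shell
# Build the derived image from the current directory
docker build -t my-app .
```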

You should see something like this:

Sending build context to Docker daemon  4.337MB
Step 1/2 : FROM existdb/existdb:5.0.0
 ---> 3f4dbbce9afa
Step 2/2 : COPY build/*.xar /exist/autodeploy
 ---> ace38b0809de

The result is a new image of your app installed into eXist-db. Since you didn't provide further instructions, it will simply reuse the EXPOSE, CMD, HEALTHCHECK, etc. instructions defined by the base image. You can now publish this image to a Docker registry and share it with others.
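
Publishing works with the standard tag-and-push workflow (the registry and tag names below are illustrative):

```shell
# Tag the local image for your registry, then push it
docker tag my-app registry.example.com/my-app:1.0.0
docker push registry.example.com/my-app:1.0.0
```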

Single Stage Image

The following slightly more complex example will install your app, but also modify the underlying eXist-db instance in which your app is running. Instead of a local build directory, we'll download the .xar from the web, and copy a modified conf.xml from a src/ directory alongside your Dockerfile.

To execute any of the docker exec … style commands from this article, use RUN.

FROM existdb/existdb

# NOTE: this is for syntax demo purposes only
RUN [ "java", "org.exist.start.Main", "client", "--no-gui",  "-l", "-u", "admin", "-P", "", "-x", "sm:passwd('admin','123')" ]

# use a modified conf.xml
COPY src/conf.xml /exist/etc

ADD https://github.com/eXist-db/documentation/releases/download/4.0.4/exist-documentation-4.0.4.xar /exist/autodeploy

The above demonstrates the different kinds of operations available to you in a single-stage build. You have just executed the Java Admin Client from inside a Dockerfile, which allows you to run any XQuery code you want when modifying the eXist-db instance that will ship with your image. You can also chain multiple RUN commands.

Warning:

For security reasons more elaborate techniques for not sharing your password in the clear are highly recommended, such as the use of secure variables inside your CI environment.

Multi-stage Images

Lastly, you can eliminate external dependencies even further by using a multi-stage build. To ensure compatibility between different Java engines, we recommend sticking with Debian-based images for the builder stage.

The following two-stage build will download and install Ant and Node.js in a builder stage, which then downloads frontend dependencies before building the .xar file. The second stage (each FROM begins a new stage) is the simple example from before. Such a setup ensures that none of your collaborators has to have Java, Node.js, etc. installed, and is great for fully automated builds and deployments.

# START STAGE 1
FROM openjdk:8-jdk-slim as builder

USER root

ENV ANT_VERSION 1.10.5
ENV ANT_HOME /etc/ant-${ANT_VERSION}

WORKDIR /tmp

RUN wget http://www-us.apache.org/dist/ant/binaries/apache-ant-${ANT_VERSION}-bin.tar.gz \
    && mkdir ant-${ANT_VERSION} \
    && tar -zxvf apache-ant-${ANT_VERSION}-bin.tar.gz \
    && mv apache-ant-${ANT_VERSION} ${ANT_HOME} \
    && rm apache-ant-${ANT_VERSION}-bin.tar.gz \
    && rm -rf ant-${ANT_VERSION} \
    && rm -rf ${ANT_HOME}/manual \
    && unset ANT_VERSION

ENV PATH ${PATH}:${ANT_HOME}/bin

WORKDIR /home/my-app
COPY . .
# NOTE: the builder stage is Debian-based, so use apt-get (not Alpine's apk)
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    nodejs \
    npm \
    git \
 && npm i npm@latest -g \
 && ant \
 && rm -rf /var/lib/apt/lists/*


# START STAGE 2
FROM existdb/existdb:release

COPY --from=builder /home/my-app/build/*.xar /exist/autodeploy

EXPOSE 8080 8443

CMD [ "java", "org.exist.start.Main", "jetty" ]

The basic idea behind multi-staging is that everything you need for building your software should be managed by docker, so that all collaborators can rely on one stable environment. In the end, and after how ever many stages you need, only the files necessary to run your app should go into the final stage. The possibilities are virtually endless, but with this example and the Dockerfile in this repo you should get a pretty good idea of how you might apply this idea to your own projects.
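
When debugging a multi-stage build it can be handy to stop at an intermediate stage; docker build supports this via --target (the image names are illustrative):

```shell
# Build only up to the builder stage and tag it for inspection
docker build --target builder -t my-app-builder .

# Build the complete image; the builder stage is discarded afterwards
docker build -t my-app .
```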

Caveats

  • JVMs inside a container require some special considerations regarding memory allocation. You should familiarize yourself with our images' use of JAVA_TOOL_OPTIONS, and avoid the traditional way of setting the heap size via -Xmx.

  • Containers rely on advanced CPU architecture features to do their magic. You should always consult the official Docker documentation for your operating system to see if containers are supported on your hardware and/or software.
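
As a sketch of the memory caveat above: instead of setting -Xmx, you can override JAVA_TOOL_OPTIONS when starting the container, so the JVM sizes its heap relative to the container's memory limit (the flag values below are illustrative):

```shell
# Let the JVM derive its heap size from the container's memory limit
docker run -it -d -p 8080:8080 -p 8443:8443 \
  -e JAVA_TOOL_OPTIONS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0" \
  --memory=2g --name exist existdb/existdb:latest
```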

Note: macOS

Apple's macOS uses its own hypervisor, which can result in poor I/O performance compared to running on other platforms.