VOOZH about

URL: https://dzone.com/articles/health-checking-your-docker-containers

⇱ Health Checking Your Docker Containers


Related

  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. Health Checking Your Docker Containers

Health Checking Your Docker Containers

Are your containers feeling under the weather? Struggling to get out of bed? See how you can build in health checks to make sure your containers are fighting fit.

By Nov. 25, 16 · Tutorial
Likes
Comment
Save
53.1K Views

Join the DZone community and get the full member experience.

Join For Free

One of the new features in Docker 1.12 is how health check for a container can be baked into the image definition. And this can be overridden at the command line.

Just like the CMD instruction, there can be multiple HEALTHCHECK instructions in Dockerfile but only the last one is effective.

This is a great addition because a container reporting status as Up 1 hour may return errors. The container may be up but there is no way for the application inside the container to provide a status. This instruction fixes that.

The Dockerfile that builds arungupta/couchbase image is:

FROM couchbase:latest

COPY configure-node.sh /opt/couchbase

HEALTHCHECK --interval=5s --timeout=3s CMD curl --fail http://localhost:8091/pools || exit 1

CMD ["/opt/couchbase/configure-node.sh"]


It uses a configure-node.sh script to configure the server using the Couchbase REST API. The new instruction to notice here is HEALTHCHECK.

This instruction can be specified as:

HEALTHCHECK <options> CMD <command>


The <options> can be:

  • --interval=DURATION (default 30s).

  • --timeout=DURATION (default 30s).

  • --retries=N (default 3).

The <command> is the command that runs inside the container to check the health.

If health check is enabled, then the container can have three states:

  • Starting: Initial status when the container is still starting.

  • Healthy: If the command succeeds, then the container is healthy.

  • Unhealthy: If a single run of the <command> takes longer than the specified timeout, then it is considered unhealthy. If a health check fails, then the <command> will run retries number of times and will be declared unhealthy if the <command> still fails.

The commands exit status indicates the health status of the container. The following values are allowed:

  • 0: container is healthy.

  • 1: container is not healthy.

In our instruction, /pools REST API is invoked using curl. If the command fails then an exit status of 1 is returned, and this marks the container unhealthy for that attempt. This command is invoked every 5 seconds. The container is marked unhealthy if the command does not return successfully within 3 seconds.

Run the container as:

docker run -d --name db arungupta/couchbase:latest


Check the status:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
55b14302671e arungupta/couchbase:latest "/entrypoint.sh /opt/" 2 seconds ago Up 1 seconds (health: starting) 8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp db


Notice how health: starting status is reported in the STATUS column. Checking after a few seconds shows the status:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
55b14302671e arungupta/couchbase:latest "/entrypoint.sh /opt/" About a minute ago Up About a minute (healthy) 8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp db


And now it's reported healthy.

More details about this HEALTHCHECK instruction can be found on docs.docker.com.

Now, if you are running an image that does not have HEALTHCHECK instruction then the docker run command can be used to specify similar values. An equivalent runtime command would be:

docker run -d --name db --health-cmd "curl --fail http://localhost:8091/pools || exit 1" --health-interval=5s --timeout=3s arungupta/couchbase


The last five health checks for a container can be obtained using the docker inspect command:

docker inspect --format='{{json .State.Health}}' db


The output is shown as:

{
 "Status": "healthy",
 "FailingStreak": 0,
 "Log": [
 {
 "Start": "2016-11-12T03:23:03.351561Z",
 "End": "2016-11-12T03:23:03.422176171Z",
 "ExitCode": 0,
 "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 768 100 768 0 0 595k 0 --:--:-- --:--:-- --:--:-- 750k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
 },
 {
 "Start": "2016-11-12T03:23:08.423558928Z",
 "End": "2016-11-12T03:23:08.510122392Z",
 "ExitCode": 0,
 "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 768 100 768 0 0 309k 0 --:--:-- --:--:-- --:--:-- 375k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
 },
 {
 "Start": "2016-11-12T03:23:13.511446818Z",
 "End": "2016-11-12T03:23:13.58141325Z",
 "ExitCode": 0,
 "Output": " {\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}} % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 768 100 768 0 0 248k 0 --:--:-- --:--:-- --:--:-- 375k\n"
 },
 {
 "Start": "2016-11-12T03:23:18.583512367Z",
 "End": "2016-11-12T03:23:18.677727356Z",
 "ExitCode": 0,
 "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dlo{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}ad Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 768 100 768 0 0 307k 0 --:--:-- --:--:-- --:--:-- 375k\n"
 },
 {
 "Start": "2016-11-12T03:23:23.679661467Z",
 "End": "2016-11-12T03:23:23.782372291Z",
 "ExitCode": 0,
 "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}} Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 768 100 768 0 0 439k 0 --:--:-- --:--:-- --:--:-- 750k\n"
 }
 ]
}


Related Refcard:

Docker (software) Health (Apple)

Published at DZone with permission of Arun Gupta. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • AI Agents in Java: Architecting Intelligent Health Data Systems
  • Smart Deployment Strategies for Modern Applications
  • Solving the Mystery: Why Java RSS Grows in Docker on M1 Macs
  • How We Diagnosed a Hidden Scheduler Failure in a Docker Swarm Cluster Serving 2 Million Users

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

Let's be friends: