Monitor Containers with Prometheus: Docker Engine & cAdvisor


We can run Mule Runtimes in containers using Runtime Fabric or directly containerizing the runtime. Containers give us portability, fast deployments, and consistent environments. But containers also hide what is happening inside them. We cannot see CPU usage, memory pressure, or network traffic without the right tools.

That is where Prometheus comes in. In a previous post, we configured Prometheus to collect metrics from Linux hosts using the Node Exporter. Now we go one level deeper. We collect metrics from containerized environments. Prometheus pulls these metrics from two sources:
  • Docker Engine — reports daemon-level metrics about the Docker host
  • cAdvisor — reports per-container metrics for CPU, memory, disk, and network
By the end of this post, we’ll have both sources feeding Prometheus. We’ll also understand when to use each one. This foundation will prepare us for the next posts in this series, where we monitor Kubernetes and Runtime Fabric.


What Problem Are We Solving?

A container is an isolated process. The Linux kernel runs it. But from outside the container, we see very little by default.

We face three gaps:

| Gap | Without monitoring | With Prometheus + cAdvisor |
|---|---|---|
| Container health | We cannot tell if a container is consuming too much memory | We see memory usage per container in real time |
| Resource contention | We cannot tell which container is starving others | We see CPU and network usage per container |
| Daemon health | We cannot tell if the Docker engine itself is under stress | We see connection counts, image build times, and daemon errors |
Prometheus closes all three gaps. We just need to expose the right endpoints.


Two Sources, Two Scopes

Before we write any configuration, we need to understand what each source provides and why both matter.
  • Docker Engine metrics come from the Docker daemon itself. The daemon is the background service that manages all containers on the host. Its metrics tell us about the engine — how many containers are running, how many images exist, how many failed operations occurred. These metrics describe the platform, not the workloads.
  • cAdvisor metrics come from inside each container. cAdvisor (short for Container Advisor) was built by Google. It runs as a container itself and reads resource usage data directly from the Linux cgroups subsystem. Its metrics describe the workloads — CPU, memory, disk I/O, and network per container.
We need both. Docker Engine tells us if the platform is healthy. cAdvisor tells us if our applications are healthy.


Part 1 — Enable Docker Engine Metrics

Docker Engine exposes a Prometheus-compatible metrics endpoint. It is disabled by default. We turn it on by editing the Docker daemon configuration file.


Step 1 — Edit the Docker Daemon Config

Open the daemon configuration file:

sudo nano /etc/docker/daemon.json

Add the following configuration:

{
  "metrics-addr": "0.0.0.0:9323",
  "experimental": true
}
Security note: Binding to 0.0.0.0 exposes the metrics endpoint on all network interfaces. In production, bind to 127.0.0.1 and use Prometheus on the same host, or use a firewall rule to restrict access to the Prometheus server IP.
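Before restarting the daemon, it is worth confirming the file is valid JSON — a stray comma or missing quote will prevent Docker from starting. A minimal sketch of that check (the config text is inlined here for illustration; on the host, `python3 -m json.tool /etc/docker/daemon.json` performs the same validation against the real file):

```python
import json

# The daemon.json content from the step above, inlined for illustration.
config_text = """
{
  "metrics-addr": "0.0.0.0:9323",
  "experimental": true
}
"""

# json.loads raises a ValueError on malformed JSON, so a clean parse
# means the daemon will at least be able to read the file.
config = json.loads(config_text)
assert config["metrics-addr"].endswith(":9323")
assert config["experimental"] is True
print("daemon.json is valid JSON with metrics enabled")
```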

Step 2 — Restart Docker

We’ll apply the change by restarting the Docker daemon:

sudo systemctl restart docker


Step 3 — Verify the Endpoint

We can confirm the endpoint is active with the following command from the Docker host:

curl http://localhost:9323/metrics

We'll see output like this:

# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
...
engine_daemon_container_states_containers{state="paused"} 0
engine_daemon_container_states_containers{state="running"} 3
engine_daemon_container_states_containers{state="stopped"} 1

The endpoint is live. Prometheus can now scrape it.
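The exposition format is plain text, so the values are easy to extract outside Prometheus too. A simplified sketch that parses the sample output above (it skips HELP/TYPE comments and ignores escaping inside label values, which a real parser would handle):

```python
# Parse Prometheus exposition-format lines into a {series: value} dict.
# Simplified: skips comment lines and does not handle label-value escaping.
def parse_metrics(text):
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples

sample = """
# TYPE engine_daemon_container_states_containers gauge
engine_daemon_container_states_containers{state="paused"} 0
engine_daemon_container_states_containers{state="running"} 3
engine_daemon_container_states_containers{state="stopped"} 1
"""

metrics = parse_metrics(sample)
running = metrics['engine_daemon_container_states_containers{state="running"}']
print(f"running containers: {running:.0f}")  # prints: running containers: 3
```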


Step 4 — Add Docker Engine to Prometheus

Go to the host where Prometheus is running and open the Prometheus configuration file:

sudo nano /etc/prometheus/prometheus.yml

We need to add a new scrape job under scrape_configs:

scrape_configs:
  - job_name: 'docker'
    static_configs:
      - targets: ['[DOCKER_HOST_IP]:9323']

Restart Prometheus:

sudo systemctl restart prometheus

Open the Prometheus UI at http://<your-host>:9090. Navigate to Status → Target Health and confirm the docker job shows UP.
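The same check can be automated against the Prometheus HTTP API: the `/api/v1/targets` endpoint returns JSON with a `health` field per active target. A sketch using an abbreviated, hypothetical response payload (a real script would fetch it from `http://<your-host>:9090/api/v1/targets`):

```python
import json

# Abbreviated, hypothetical /api/v1/targets response for illustration.
response_text = json.dumps({
    "status": "success",
    "data": {"activeTargets": [
        {"labels": {"job": "docker"}, "health": "up"},
        {"labels": {"job": "node"}, "health": "up"},
    ]},
})

def down_jobs(body):
    """Return the job names of any active targets not reporting 'up'."""
    targets = json.loads(body)["data"]["activeTargets"]
    return [t["labels"]["job"] for t in targets if t["health"] != "up"]

print(down_jobs(response_text))  # prints: []
```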



Part 2 — Enable cAdvisor for Container Metrics

cAdvisor (Container Advisor) is an open-source tool built by Google. It runs as a container on the Docker host and reads resource usage data directly from the Linux cgroups subsystem — the kernel feature that enforces resource limits on containers. Because cAdvisor reads from cgroups, it sees exactly what the kernel sees: raw CPU time, memory allocation, disk I/O, and network traffic for every container on the host.

Docker Engine metrics tell us the platform is running. cAdvisor tells us what our applications are doing inside it. For MuleSoft deployments, this distinction matters. We need to know if a Mule Runtime container is consuming too much heap, saturating a network interface, or hitting disk limits — before those conditions cause failures. cAdvisor exposes all of that data as a Prometheus-compatible metrics endpoint, ready to scrape.

Let’s see how to enable it.


Step 1 — Run cAdvisor

cAdvisor runs as a container. It mounts host directories to read cgroup data. We start it with a single docker run command:

docker run -d \
  --name cadvisor \
  --restart unless-stopped \
  --volume /:/rootfs:ro \
  --volume /var/run:/var/run:ro \
  --volume /sys:/sys:ro \
  --volume /var/lib/docker/:/var/lib/docker:ro \
  --volume /dev/disk/:/dev/disk:ro \
  --publish 8080:8080 \
  --privileged \
  --device /dev/kmsg \
  gcr.io/cadvisor/cadvisor:latest

We break down the key flags:

| Flag | Purpose |
|---|---|
| --volume /:/rootfs:ro | Mounts the host root filesystem read-only. cAdvisor reads cgroup data from here. |
| --volume /sys:/sys:ro | Mounts kernel system data. Required for CPU and memory metrics. |
| --volume /var/lib/docker/:/var/lib/docker:ro | Mounts Docker storage. Required to inspect container filesystems. |
| --privileged | Grants elevated access to read kernel-level resource data. |
| --device /dev/kmsg | Provides access to kernel message logs. Required on some Linux kernels. |
| --publish 8080:8080 | Exposes the cAdvisor UI and metrics endpoint on port 8080. |
| --restart unless-stopped | Restarts cAdvisor automatically unless we stop it manually. |


Step 2 — Verify cAdvisor

Check that cAdvisor is running:

docker ps --filter name=cadvisor

Test the metrics endpoint:

curl http://localhost:8080/metrics | head -30


Also, open the cAdvisor UI in a browser at http://<your-host>:8080. It will show a live dashboard of container resource usage.



Step 3 — Add cAdvisor to Prometheus

We open the Prometheus configuration file again:

sudo nano /etc/prometheus/prometheus.yml

We add the cAdvisor scrape job:

scrape_configs:
  - job_name: 'docker'
    static_configs:
      - targets: ['[DOCKER_HOST_IP]:9323']

  - job_name: 'cadvisor'
    static_configs:
      - targets: ['[DOCKER_HOST_IP]:8080']

We reload Prometheus:

sudo systemctl restart prometheus

We return to Status → Target Health in the Prometheus UI. Both docker and cadvisor jobs now show UP.


Docker Engine vs. cAdvisor: Key Metrics Compared

Now that both sources are active, we see a large volume of metrics. Here are the most useful ones from each source.


Docker Engine Metrics (port 9323)

These metrics describe the Docker daemon and the host-level container state.

| Metric | Description |
|---|---|
| engine_daemon_container_states_containers | Count of containers by state: running, stopped, paused |
| engine_daemon_health_checks_total | Total health check executions across all containers |
| engine_daemon_health_checks_failed_total | Total failed health checks |
| builder_builds_failed_total | Failed image build attempts, broken down by reason |
| engine_daemon_network_actions_seconds | Time spent on network operations |

Use these metrics to answer: Is the Docker engine healthy? How many containers are running right now? Are image builds failing?


cAdvisor Metrics (port 8080)

These metrics describe individual containers. Every metric carries a name label that identifies the container (Compose-managed containers also get labels such as container_label_com_docker_compose_service).

| Metric | Description |
|---|---|
| container_cpu_usage_seconds_total | Total CPU time consumed by a container |
| container_memory_usage_bytes | Current memory usage including cache |
| container_memory_working_set_bytes | Active memory usage, excluding cache. This is the most accurate memory metric. |
| container_network_receive_bytes_total | Total bytes received on a container's network interface |
| container_network_transmit_bytes_total | Total bytes transmitted from a container |
| container_fs_reads_bytes_total | Total bytes read from disk by a container |
| container_fs_writes_bytes_total | Total bytes written to disk by a container |
| container_last_seen | Timestamp of last cAdvisor observation. Useful for detecting stopped containers. |

Use these metrics to answer: Which container consumes the most CPU? Is a Mule Runtime application running out of memory? Is a container generating abnormal disk I/O?
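The memory question, for instance, reduces to simple arithmetic on container_memory_working_set_bytes. A rough sketch with invented sample values (the 1 GiB limit and 90% threshold are assumptions for illustration; in practice the values come from cAdvisor and the limit from the container's configuration):

```python
# Flag containers whose working-set memory exceeds a threshold.
LIMIT_BYTES = 1 * 1024**3  # assumed 1 GiB container memory limit
THRESHOLD = 0.9            # alert at 90% of the limit

# Invented container_memory_working_set_bytes samples.
working_set = {
    "mule-app-1": 980 * 1024**2,  # ~980 MiB
    "mule-app-2": 410 * 1024**2,  # ~410 MiB
}

for name, used in working_set.items():
    pct = used / LIMIT_BYTES
    if pct > THRESHOLD:
        print(f"{name}: {used / 1024**2:.0f} MiB ({pct:.0%} of limit)")
# prints: mule-app-1: 980 MiB (96% of limit)
```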


Summary: When to Use Each Source


| Question | Source |
|---|---|
| Is the Docker daemon running normally? | Docker Engine |
| How many containers are running? | Docker Engine |
| Are health checks passing? | Docker Engine |
| How much CPU does this Mule app use? | cAdvisor |
| Is this container close to its memory limit? | cAdvisor |
| Which container is generating the most network traffic? | cAdvisor |


Useful PromQL Queries

Here are some useful queries we can use in the Prometheus UI or in Grafana dashboards.


Count running containers:

engine_daemon_container_states_containers{state="running"}


CPU usage rate per container (last 5 minutes):

rate(container_cpu_usage_seconds_total[5m])


Memory working set per container (in MB):

container_memory_working_set_bytes / 1024 / 1024


Network receive rate per container (bytes per second):

rate(container_network_receive_bytes_total[5m])


Containers with failed health checks:

increase(engine_daemon_health_checks_failed_total[10m]) > 0
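Conceptually, rate() on a counter is just the per-second increase over the window. A rough sketch of that calculation in plain code (the timestamps and counter values are invented, and this ignores counter resets and the extrapolation that real PromQL performs):

```python
# (timestamp_seconds, counter_value) samples of a counter such as
# container_cpu_usage_seconds_total over a 5-minute window.
samples = [(0, 100.0), (60, 112.0), (120, 130.0), (300, 160.0)]

def simple_rate(points):
    """Per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = points[0], points[-1]
    return (v1 - v0) / (t1 - t0)

print(f"{simple_rate(samples):.2f} CPU-seconds/second")  # prints: 0.20 CPU-seconds/second
```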


Complete Prometheus Configuration

Here is our full prometheus.yml with both scrape jobs:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:

  # Linux host metrics (from previous post)
  - job_name: 'node'
    static_configs:
      - targets: ['[EXTERNAL_SERVER_IP]:9100']

  # Docker Engine daemon metrics
  - job_name: 'docker'
    static_configs:
      - targets: ['[DOCKER_HOST_IP]:9323']

  # Per-container metrics via cAdvisor
  - job_name: 'cadvisor'
    scrape_interval: 10s
    static_configs:
      - targets: ['[DOCKER_HOST_IP]:8080']

We set scrape_interval: 10s on the cAdvisor job. Container metrics change quickly. A shorter interval gives us finer-grained data for alerting.
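The trade-off is storage: halving the interval roughly doubles the samples stored per series. Quick arithmetic:

```python
# Samples per series per day at each scrape interval.
SECONDS_PER_DAY = 86_400

for interval in (15, 10):
    print(f"{interval}s interval -> {SECONDS_PER_DAY // interval} samples per series per day")
# prints: 15s interval -> 5760 samples per series per day
# prints: 10s interval -> 8640 samples per series per day
```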


What’s Next?

We now collect metrics from Linux hosts, Docker Engine, and individual containers. That covers the single-host case.

But Runtime Fabric and production MuleSoft deployments run on Kubernetes. Kubernetes adds layers: nodes, pods, namespaces, and the control plane. Each layer produces its own metrics.

In the next posts in this series, we’ll go deeper and see How to Monitor Kubernetes with Prometheus. That will be a solid foundation to start monitoring Runtime Fabric deployments.

Summary

We covered the full picture of container monitoring with Prometheus.
  • Docker Engine metrics expose daemon health on port 9323. We enable them by editing /etc/docker/daemon.json.
  • cAdvisor exposes per-container resource metrics on port 8080. We run it as a privileged container with host mounts.
  • Both sources serve different purposes. Docker Engine answers platform questions. cAdvisor answers application questions.
  • We add both as scrape targets in prometheus.yml and query them with PromQL.
Container visibility is not optional for production MuleSoft deployments. We cannot fix what we cannot see.