PromQL Operators — Part 2 (Aggregation)


In our previous post, we learned how Prometheus matches two instant vectors when an operator sits between them. We built a CPU activity percentage query step by step and previewed sum without() to complete it.

In this post, we will learn aggregation operators in full. Aggregation collapses many series into fewer series by computing a single value across a group. We will use sum(), avg(), min(), max(), and count() — and we will control exactly how they group with the by and without clauses.

Our setup is a Prometheus server scraping a Linux host via Node Exporter. The target has these labels:
  • instance="172.31.33.131:9100"
  • job="node_exporter"
All examples will run in the Prometheus UI at http://<our-server>:9090.


Why We Need Aggregation

Node Exporter reports node_cpu_seconds_total with one series per CPU core per mode. On a four-core host, a query for the user mode returns four series — one per core. On a sixteen-core host, sixteen series.


Most of the time, we do not want that level of detail. We want one number: total CPU time in user mode across all cores. Aggregation is how we collapse those four or sixteen series into one.
Without aggregation we read individual series. With aggregation we read the group.
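To make the contrast concrete, the raw query below returns one series per core, while a sum collapses them into one (the cpu label values shown are typical for a four-core host, not guaranteed):

node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}

returns one series per core (cpu="0", cpu="1", and so on), while

sum without(cpu) (node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"})

returns a single series with the cpu label removed.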


The by and without Clauses

Every aggregation operator accepts either a by clause or a without clause. These control which labels survive the aggregation.

by(l1, l2) keeps only the listed labels in the result. All other labels are dropped. Series that share the same values for the listed labels are collapsed into one.

without(l1, l2) drops the listed labels and keeps everything else. Series that become identical after removing those labels are collapsed into one.

We will use one or the other in every aggregation we write. Omitting both collapses all series into a single result — which is sometimes what we want, but usually not.
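As a quick illustration of that last point, an aggregation with neither clause strips every label:

sum (node_cpu_seconds_total{instance="172.31.33.131:9100"})

This adds user, system, idle, iowait and every other mode together across all cores into one unlabeled value, which is rarely a number we can act on.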

sum() — Add values across series

sum() adds the values of all series in a group. It is the operator we will reach for most often.
Let's start with the problem from the previous post. We have node_cpu_seconds_total with one series per core per mode. We want total CPU time in user mode across all cores:
sum by(instance, job) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
)

We'll run this. Instead of one series per core, we get one series total — the sum of user-mode CPU time across all cores. The result carries only the instance and job labels we listed in by.


We can write the same thing using without to drop the cpu and mode labels instead:
sum without(cpu, mode) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
)

Both expressions produce the same result. The by version is more explicit about what we keep. The without version is more explicit about what we discard. On a metric with many labels, without is often shorter to write.


Now let's build a real dashboard query. We want the available memory on the host. Node Exporter exposes several memory metrics — MemAvailable, MemFree, Buffers, Cached. Of these, node_memory_MemAvailable_bytes is the kernel's own estimate of how much memory is usable without swapping, so that is the single figure we aggregate:
sum by(instance) (
  node_memory_MemAvailable_bytes{instance="172.31.33.131:9100"}
)

This collapses to one number per host — exactly what a memory gauge panel on a Grafana dashboard needs.
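If we drop the instance filter, the same query scales to a whole fleet with one series per host. This is a sketch that assumes additional node_exporter targets are being scraped under the same job:

sum by(instance) (
  node_memory_MemAvailable_bytes{job="node_exporter"}
)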


avg() — Average values across series

avg() computes the arithmetic mean across all series in a group. We will use it when we want a representative value rather than a total.

Let's compute the average CPU idle percentage across all cores. First we calculate the total CPU seconds per core, as a sum of all modes:

sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})

Next we divide the idle-mode series by that per-core total to get the idle ratio per core. The idle series still carries a mode label and the total does not, so we match them with ignoring(mode). Finally we average the per-core ratios across all cores:

avg by(instance) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
  / ignoring(mode)
  sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})
) * 100

We'll run this. We get one number — the average idle percentage across all CPU cores on our host. If core 0 is at 95% idle and core 1 is at 85% idle, this returns 90%.


We can also average load over time. Let's compute the average 1-minute load average across the last hour. We will use avg_over_time() for this — a function we will cover in detail in the functions post:

avg_over_time(
  node_load1{instance="172.31.33.131:9100"}[1h]
)

We'll run this. We get the mean value of node_load1 sampled over the last hour — a smoothed view of load that filters out brief spikes.
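Since avg_over_time() leaves one smoothed series per host, we can combine it with an aggregation operator to smooth across a fleet as well. A sketch, assuming several hosts are scraped under the node_exporter job:

avg by(job) (
  avg_over_time(node_load1{job="node_exporter"}[1h])
)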


min() and max() — Extremes across series

min() returns the lowest value and max() returns the highest value across all series in a group.
Let's find the filesystem with the least free space on our host — the one most at risk of filling up.

min by(instance) (
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  } / (1024 * 1024 * 1024)
)

We'll run this. We get the single smallest free space value across all real filesystems, expressed in GB. This is useful for a summary panel that shows the worst-case disk situation at a glance.


This tells us the worst value, but not which filesystem it belongs to: min() discards the mountpoint label along with every other label not listed in by. We will recover the full label set with topk() and bottomk() later in this post.

Let's find the most loaded CPU core. Because the idle series carries a mode label and the per-core total does not, we match them with ignoring(mode):

max by(instance) (
  1 - (
    node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
    / ignoring(mode)
    sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})
  )
) * 100

We'll run this. We get the busiest CPU core's activity percentage. On a MuleSoft host under load, this tells us the worst-case core utilization — a better signal for CPU pressure than the average.
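A query like this slots directly into an alert rule. A PromQL comparison keeps only the series above the threshold, so the expression below returns a result only while the busiest core is over the limit (the 90 here is an example threshold, not a recommendation):

max by(instance) (
  1 - (
    node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
    / ignoring(mode)
    sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})
  )
) * 100 > 90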


count() — Count series in a group

count() returns the number of series in a group rather than computing on their values. We will use it to count resources, active alerts, or running processes.

Let's count how many real filesystems our host has:
count by(instance) (
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  }
)

We'll run this. We get the number of real filesystems Prometheus is monitoring on our host. The value of each metric does not matter — only the number of series.
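count() also pairs naturally with the up metric, which Prometheus records automatically for every scrape target. A sketch, assuming several targets under the node_exporter job, counting only the ones whose last scrape succeeded:

count by(job) (up{job="node_exporter"} == 1)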


Let's count how many CPU cores the host has:
count by(instance) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
)

We'll run this. Each core produces one idle series, so the count equals the number of cores. On our host we will see 2 for a dual-core VM.
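Knowing the core count lets us normalize other metrics. For example, dividing the 1-minute load average by the core count gives load per core; the count result carries only the instance label, so we match on it explicitly:

node_load1{instance="172.31.33.131:9100"}
/ on(instance)
count by(instance) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
)

A value near 1 means the host is fully loaded; above 1 means work is queuing for CPU.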


topk() and bottomk() — Top or bottom N series

topk(n, expr) returns the N series with the highest values. bottomk(n, expr) returns the N series with the lowest values. These are the operators we use when we want to identify the worst offenders rather than just the extreme value.

Let's find the two filesystems with the least free space:

bottomk(2,
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  } / (1024 * 1024 * 1024)
)

We'll run this. We get the two filesystems with the lowest free space in GB — their full label sets intact, including mountpoint. This is more useful than min() alone because we can see exactly which mounts are at risk.
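This also answers the question we deferred in the min() section: bottomk(1, ...) returns the single worst filesystem with its mountpoint label still attached:

bottomk(1,
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  } / (1024 * 1024 * 1024)
)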


Let's find the CPU modes consuming the most time across all cores:
topk(3,
  sum by(mode) (
    node_cpu_seconds_total{instance="172.31.33.131:9100"}
  )
)

We'll run this. We get the three CPU modes with the highest total accumulated time — typically idle, user, and system in that order on a healthy host. If iowait appears near the top, our host has a disk bottleneck.


Putting It All Together

Let's build a query we will actually use on a MuleSoft host dashboard. We want the active CPU usage percentage — not idle, not iowait — averaged across all cores, expressed as a percentage.

We will build it step by step.

Step 1 — Get active CPU seconds per core, excluding idle and iowait:
node_cpu_seconds_total{
  instance="172.31.33.131:9100",
  mode!~"idle|iowait"
}


Step 2 — Sum active seconds across modes, keeping the cpu label:
sum by(instance, cpu) (
  node_cpu_seconds_total{
    instance="172.31.33.131:9100",
    mode!~"idle|iowait"
  }
)


Step 3 — Sum total seconds across all modes per core:
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)


Step 4 — Divide active by total to get the ratio per core:
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}
)
/
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)


Step 5 — Average the per-core ratios across all cores and convert to percentage:
avg by(instance) (
  sum by(instance, cpu) (
    node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}
  )
  /
  sum by(instance, cpu) (
    node_cpu_seconds_total{instance="172.31.33.131:9100"}
  )
) * 100

We'll run this final query. We get one number — the average active CPU usage across all cores as a percentage. This is the query we will use as the CPU utilization panel on our MuleSoft host dashboard.


Aggregation Operator Quick Reference

Operator           What it computes
sum()              Total of all values in the group
avg()              Arithmetic mean of all values in the group
min()              Lowest value in the group
max()              Highest value in the group
count()            Number of series in the group
topk(n, expr)      N series with the highest values
bottomk(n, expr)   N series with the lowest values

Clause             What it does
by(l1, l2)         Keep only the listed labels — collapse everything else
without(l1, l2)    Drop the listed labels — keep everything else


Summary

Aggregation operators collapse many series into fewer series. sum() totals values across a group. avg() averages them. min() and max() find the extremes. count() counts the number of series. topk() and bottomk() return the N highest or lowest series with their labels intact. The by clause controls which labels survive by keeping only the ones we list. The without clause controls them by dropping only the ones we list. Every aggregation we write in a production dashboard or alert rule will use one of these two clauses explicitly.

In the next post, we will learn about PromQL functions: rate(), irate(), increase(), histogram_quantile(), and more. These are the tools we use to work with counters, histograms, and time-windowed computations.
