In our previous post, we learned how Prometheus matches two instant vectors when an operator sits between them. We built a CPU activity percentage query step by step and previewed sum without() to complete it. In this post, we will learn aggregation operators in full. Aggregation collapses many series into fewer series by computing a single value across a group. We will use sum(), avg(), min(), max(), and count() — and we will control exactly how they group with the by and without clauses.

Our setup is a Prometheus server scraping a Linux host via Node Exporter. The target has these labels:

instance="172.31.33.131:9100"
job="node_exporter"

We run all queries in the Prometheus web UI at http://<our-server>:9090.

Why We Need Aggregation
Node Exporter reports node_cpu_seconds_total with one series per CPU core per mode. On a four-core host, a query for the user mode returns four series — one per core. On a sixteen-core host, sixteen series.

Most of the time, we do not want that level of detail. We want one number: total CPU time in user mode across all cores. Aggregation is how we collapse those four or sixteen series into one.
Without aggregation we read individual series. With aggregation we read the group.
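To make the collapse concrete, here is the idea in miniature: a hypothetical four-core host's user-mode counters reduced to one total. This is plain illustrative Python with made-up values, just to show the shape of the operation.

```python
# Hypothetical per-core user-mode CPU seconds, one entry per series.
per_core_user_seconds = {
    "cpu=0": 4312.7,
    "cpu=1": 4298.1,
    "cpu=2": 4401.3,
    "cpu=3": 4355.9,
}

# Without aggregation we read four individual series.
print(len(per_core_user_seconds))  # 4

# With aggregation we read the group: one number for the whole host.
total = sum(per_core_user_seconds.values())
print(round(total, 1))  # 17368.0
```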
The by and without Clauses
Every aggregation operator accepts either a by clause or a without clause. These control which labels survive the aggregation.

by(l1, l2) keeps only the listed labels in the result. All other labels are dropped. Series that share the same values for the listed labels are collapsed into one.

without(l1, l2) drops the listed labels and keeps everything else. Series that become identical after removing those labels are collapsed into one.

We will use one or the other in every aggregation we write. Omitting both collapses all series into a single result — which is sometimes what we want, but usually not.
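A sketch of what by and without do to label sets, using a toy in-memory model of series. This is illustrative Python, not Prometheus internals; the label values mimic our node_cpu_seconds_total target.

```python
from collections import defaultdict

# Toy series: (labels, value) pairs for a hypothetical two-core host.
series = [
    ({"cpu": "0", "mode": "user", "instance": "172.31.33.131:9100", "job": "node_exporter"}, 100.0),
    ({"cpu": "1", "mode": "user", "instance": "172.31.33.131:9100", "job": "node_exporter"}, 120.0),
]

def sum_by(series, keep):
    """sum by(keep...): group on the listed labels, drop the rest."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k in keep))
        groups[key] += value
    return dict(groups)

def sum_without(series, drop):
    """sum without(drop...): drop the listed labels, group on the rest."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop))
        groups[key] += value
    return dict(groups)

# On this label set, the two spellings collapse to the same single group.
by_result = sum_by(series, {"instance", "job"})
without_result = sum_without(series, {"cpu", "mode"})
assert by_result == without_result
print(by_result)  # one group keyed by instance and job, value 220.0
```

Both functions collapse the two per-core series into one group worth 220.0, differing only in whether we name the labels to keep or the labels to discard.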
sum() — Add values across series
sum() adds the values of all series in a group. It is the operator we will reach for most often.

Let's start with the problem from Part 4.5. We have node_cpu_seconds_total with one series per core per mode. We want total CPU time in user mode across all cores:

```
sum by(instance, job) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
)
```

When we run this, instead of one series per core, we get one series in total: the sum of user-mode CPU time across all cores. The result carries only the instance and job labels we listed in by.

We can write the same thing using without to drop the cpu and mode labels instead:

```
sum without(cpu, mode) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
)
```

Both expressions produce the same result. The by version is more explicit about what we keep. The without version is more explicit about what we discard. On a metric with many labels, without is often shorter to write.

Now let's build a real dashboard query: total available memory on the host. Node Exporter exposes several memory metrics — MemAvailable, MemFree, Buffers, Cached — and we want the single available figure:

```
sum by(instance) (
  node_memory_MemAvailable_bytes{instance="172.31.33.131:9100"}
)
```

This collapses to one number per host — exactly what a memory gauge panel on a Grafana dashboard needs.

avg() — Average values across series
avg() computes the arithmetic mean across all series in a group. We will use it when we want a representative value rather than a total.

Let's compute the average CPU idle percentage across all cores. First we take the idle seconds per core, then we divide by the total CPU seconds per core (the sum across all modes), and finally we average the per-core ratios. Because the idle series carries a mode label and the per-core totals do not, we match the two sides with ignoring(mode):

```
avg by(instance) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
  / ignoring(mode)
  sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})
) * 100
```

We can also average over time. Let's compute the average 1-minute load across the last hour. We will use avg_over_time() for this — a function we will cover in detail in the functions post:

```
avg_over_time(
  node_load1{instance="172.31.33.131:9100"}[1h]
)
```

The result is node_load1 averaged over the last hour — a smoothed view of load that filters out brief spikes.

min() and max() — Extremes across series
min() returns the lowest value and max() returns the highest value across all series in a group.

Let's find the filesystem with the least free space on our host — the one most at risk of filling up. We divide by 1024 * 1024 * 1024 to convert bytes to GiB:

```
min by(instance) (
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  } / (1024 * 1024 * 1024)
)
```

To find which filesystem that is, we drop the by clause and keep all labels — but that returns all series, not just the minimum. We will handle that with topk() and bottomk() later in this post.

Let's find the most loaded CPU core (we match with ignoring(mode) because the idle series carries a mode label and the per-core totals do not):

```
max by(instance) (
  1 - (
    node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
    / ignoring(mode)
    sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})
  )
) * 100
```

count() — Count series in a group
count() returns the number of series in a group rather than computing on their values. We will use it to count resources, active alerts, or running processes.

Let's count how many real filesystems our host has:

```
count by(instance) (
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  }
)
```

Let's count how many CPU cores the host has:

```
count by(instance) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode="idle"}
)
```

When we run this, each core produces one idle series, so the count equals the number of cores. On our host we will see 2 for a dual-core VM.

topk() and bottomk() — Top or bottom N series
topk(n, expr) returns the N series with the highest values. bottomk(n, expr) returns the N series with the lowest values. These are the operators we use when we want to identify the worst offenders rather than just the extreme value.

Let's find the two filesystems with the least free space:

```
bottomk(2,
  node_filesystem_avail_bytes{
    instance="172.31.33.131:9100",
    fstype!~"tmpfs|squashfs|devtmpfs"
  } / (1024 * 1024 * 1024)
)
```

Each series in the result keeps all of its labels, including mountpoint. This is more useful than min() alone because we can see exactly which mounts are at risk.

Let's find the CPU mode consuming the most time across all cores:

```
topk(3,
  sum by(mode) (
    node_cpu_seconds_total{instance="172.31.33.131:9100"}
  )
)
```

When we run this, we get the three CPU modes with the highest total accumulated time — typically idle, user, and system in that order on a healthy host. If iowait appears near the top, our host has a disk bottleneck.

Putting It All Together
Let's build a query we will actually use on a MuleSoft host dashboard. We want the active CPU usage — not idle, not iowait — averaged across all cores and expressed as a percentage. We will build it step by step.
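Before writing the PromQL, we can preview the arithmetic of the steps below on made-up numbers for a hypothetical two-core host. This is illustrative Python only; the real values come from node_cpu_seconds_total.

```python
# Hypothetical cumulative CPU seconds per core, broken down by mode.
cores = {
    "0": {"user": 300.0, "system": 100.0, "idle": 1500.0, "iowait": 100.0},
    "1": {"user": 250.0, "system": 150.0, "idle": 1550.0, "iowait": 50.0},
}

ratios = []
for modes in cores.values():
    # Steps 1-2: sum the active modes (everything but idle and iowait) per core.
    active = sum(v for m, v in modes.items() if m not in ("idle", "iowait"))
    # Step 3: total seconds across all modes per core.
    total = sum(modes.values())
    # Step 4: active-to-total ratio per core.
    ratios.append(active / total)

# Step 5: average the per-core ratios and convert to a percentage.
cpu_active_pct = sum(ratios) / len(ratios) * 100
print(round(cpu_active_pct, 1))  # 20.0
```

Each core spent 400 of 2000 seconds in active modes, so the host averages 20% active.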
Step 1 — Get active CPU seconds per core, excluding idle and iowait:

```
node_cpu_seconds_total{
  instance="172.31.33.131:9100",
  mode!~"idle|iowait"
}
```

Step 2 — Sum active seconds across modes, keeping the cpu label:

```
sum by(instance, cpu) (
  node_cpu_seconds_total{
    instance="172.31.33.131:9100",
    mode!~"idle|iowait"
  }
)
```

Step 3 — Sum total seconds across all modes per core:

```
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)
```

Step 4 — Divide active by total to get the ratio per core:

```
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}
)
/
sum by(instance, cpu) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)
```

Step 5 — Average the per-core ratios across all cores and convert to a percentage:
```
avg by(instance) (
  sum by(instance, cpu) (
    node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}
  )
  /
  sum by(instance, cpu) (
    node_cpu_seconds_total{instance="172.31.33.131:9100"}
  )
) * 100
```

When we run this final query, we get one number: the average active CPU usage across all cores as a percentage. This is the query we will use as the CPU utilization panel on our MuleSoft host dashboard.

Aggregation Operator Quick Reference
| Operator | What it computes |
|---|---|
| sum() | Total of all values in the group |
| avg() | Arithmetic mean of all values in the group |
| min() | Lowest value in the group |
| max() | Highest value in the group |
| count() | Number of series in the group |
| topk(n, expr) | N series with the highest values |
| bottomk(n, expr) | N series with the lowest values |

| Clause | What it does |
|---|---|
| by(l1, l2) | Keep only the listed labels — collapse everything else |
| without(l1, l2) | Drop the listed labels — keep everything else |
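topk() and bottomk() are the only operators in the tables above that keep series labels intact, so here is their behavior sketched on toy data. This is an illustrative Python model with made-up mountpoints and values, not Prometheus code.

```python
# Toy series for a hypothetical host: free space in GiB per mountpoint.
free_gib = {"/": 12.4, "/var": 3.1, "/home": 48.9, "/boot": 0.7}

def bottomk(n, series):
    """Return the n (labels, value) pairs with the lowest values."""
    return sorted(series.items(), key=lambda kv: kv[1])[:n]

def topk(n, series):
    """Return the n (labels, value) pairs with the highest values."""
    return sorted(series.items(), key=lambda kv: kv[1], reverse=True)[:n]

# The two most at-risk mountpoints, labels preserved.
print(bottomk(2, free_gib))  # [('/boot', 0.7), ('/var', 3.1)]
print(topk(1, free_gib))     # [('/home', 48.9)]
```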
Summary
Aggregation operators collapse many series into fewer series. sum() totals values across a group. avg() averages them. min() and max() find the extremes. count() counts the number of series. topk() and bottomk() return the N highest or lowest series with their labels intact. The by clause controls which labels survive by keeping only the ones we list. The without clause controls them by dropping only the ones we list. Every aggregation we write in a production dashboard or alert rule will use one of these two clauses explicitly.

In the next post, we will learn about PromQL functions — rate(), irate(), increase(), histogram_quantile(), and more. These are the tools we use to work with counters, histograms, and time-windowed computations.
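One last practical note: everything we ran in the web UI is also available programmatically through Prometheus's HTTP API at /api/v1/query, which is handy for feeding reports. The sketch below parses a hand-written example response in the documented format (not captured from our server), showing that an aggregated result carries only the labels we kept with by():

```python
import json

# Hand-written example response in the documented /api/v1/query format,
# as if for: sum by(instance) (node_memory_MemAvailable_bytes{...})
response_body = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"instance": "172.31.33.131:9100"},
        "value": [1700000000, "2147483648"]
      }
    ]
  }
}
"""

payload = json.loads(response_body)
assert payload["status"] == "success"
for sample in payload["data"]["result"]:
    labels = sample["metric"]                # only the labels kept by the aggregation
    timestamp, value = sample["value"]       # sample values arrive as strings
    print(labels, float(value))
```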