PromQL - Vector Matching

In our last post, we learned how arithmetic, comparison, and logical operators work. We wrote expressions like dividing available memory by total memory to get a percentage.

What we did not explain is how Prometheus knows which series to pair together when two instant vectors meet an operator. On our single-host setup, it seems obvious — there is only one series on each side. But as soon as we have multiple CPUs, multiple filesystems, or multiple hosts, Prometheus needs explicit rules to decide which series on the left matches which series on the right.

Those rules are called vector matching. Understanding them is what separates queries that work by accident from queries that work by design.

Our setup is a Prometheus server scraping a Linux host via Node Exporter. The target has these labels:

instance="172.31.33.131:9100"
job="node_exporter"

All examples will run in the Prometheus UI at http://<our-server>:9090.

How Prometheus Matches Two Vectors

When an operator sits between two instant vectors, Prometheus tries to pair each series on the left with exactly one series on the right. It does this by comparing labels.

By default, Prometheus matches on all labels. Two series match if they carry exactly the same set of labels with the same values; the metric name itself is not part of the comparison. If a pair is found, the operator is applied and the result carries those shared labels, without the metric name. If a series has no partner on the other side, it is dropped from the result. This is called one-to-one matching.

Let's see it in action. We'll divide available filesystem bytes by total filesystem bytes:

node_filesystem_avail_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
/
node_filesystem_size_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}

We'll run this in the Prometheus UI. Prometheus pairs each node_filesystem_avail_bytes series with the node_filesystem_size_bytes series that carries the exact same device, fstype, instance, job, and mountpoint labels. One result per filesystem — the ratio of free space to total space.

This works because both metrics share the same label set. The default one-to-one matching handles it cleanly.
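To make the default rule concrete, the same division can be written with the match labels spelled out using the on keyword, which a later section of this post covers. This is only a sketch: it assumes the two metrics carry exactly the five labels listed above and nothing else, which may not hold if relabeling adds more.

```promql
# Sketch: default matching written out explicitly with on().
# Assumes both metrics carry exactly these five labels.
node_filesystem_avail_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
/
on(device, fstype, instance, job, mountpoint)
node_filesystem_size_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
```

Under that assumption, this should pair the same series as the query without on.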

When Default Matching Breaks

Default matching fails when the two sides of an operator do not share the same label set.

Node Exporter exposes node_cpu_seconds_total with a cpu label and a mode label. It also exposes node_cpu_info with a cpu label but no mode label. If we try to divide one by the other with default matching, Prometheus finds no pairs, because the label sets differ, and returns an empty result.

The same happens whenever a label value differs or one side carries a label the other lacks.
For example, these two series will not match, because their instance values differ:

node_filesystem_avail_bytes{instance="node1", job="node", mountpoint="/home"} 512
node_filesystem_size_bytes{instance="node2", job="node", mountpoint="/home"} 1024

Extra labels, such as an additional `device` label on one metric, will also cause the series to fail to match.
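The node_cpu_info case described above can be reproduced with a query like the following. It is a sketch that assumes node_cpu_info is actually exposed on the target; depending on the Node Exporter version and flags, that metric may need to be enabled first.

```promql
# Sketch: default matching between two metrics with different label sets.
# The left side carries a mode label that node_cpu_info lacks, and
# node_cpu_info carries labels the left side lacks, so no series pair up.
node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
/
node_cpu_info{instance="172.31.33.131:9100"}
```

We would expect an empty result rather than an error, because unmatched series are dropped silently.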

We need a way to tell Prometheus: ignore some labels when matching, and only match on the ones that matter. PromQL gives us two keywords for this: on and ignoring.

`on` — Match only on specified labels

on tells Prometheus to match series using only the labels we list. All other labels are ignored during the pairing step.

Let's compute the fraction of total memory used by the kernel's buffer cache. We need node_memory_Buffers_bytes divided by node_memory_MemTotal_bytes. Both metrics carry the instance and job labels and nothing else, so those two labels are enough to pair them, and listing them in on makes the pairing rule explicit:

node_memory_Buffers_bytes{instance="172.31.33.131:9100"}
/
on(instance, job)
node_memory_MemTotal_bytes{instance="172.31.33.131:9100"}

We'll run this. Prometheus matches the two series on instance and job only and divides their values. We get the fraction of total memory occupied by buffers. The result carries only the instance and job labels — the on clause drops all others from the output.
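The same pattern applies to the memory ratio mentioned at the start of this post. As a sketch, assuming node_memory_MemAvailable_bytes is exposed on the target, the percentage of available memory could be written as:

```promql
# Sketch: available memory as a percentage of total memory.
# / and * share a precedence level and evaluate left to right,
# so the division happens before the multiplication by 100.
node_memory_MemAvailable_bytes{instance="172.31.33.131:9100"}
/
on(instance, job)
node_memory_MemTotal_bytes{instance="172.31.33.131:9100"}
* 100
```

The on clause belongs to the division; multiplying the resulting vector by the scalar 100 involves no matching at all.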

`ignoring` — Match on all labels except the ones listed

ignoring tells Prometheus to match series on all labels except the ones we list. It is the inverse of on.

node_filesystem_avail_bytes and node_filesystem_size_bytes normally carry the same labels. Suppose that in some environment they differ on one label we do not care about, for example job, because the two metrics are scraped by different jobs. We'll use ignoring to exclude that label from the match:

node_filesystem_avail_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
/
ignoring(job)
node_filesystem_size_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}

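We'll run this. Prometheus matches on every label except job and performs the division; the ignored job label is also absent from the result. On our single-job setup the pairs are the same as with default matching, but the pattern is useful when two related metrics are scraped from different jobs and represent the same resource.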
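A case where ignoring changes the outcome is comparing two modes of the same CPU metric. The sketch below is an illustration, not part of the original setup: both sides carry cpu, instance, job, and mode, and only the mode value differs.

```promql
# Sketch: ratio of user time to system time per CPU core.
# Default matching would find no pairs because mode differs
# ("user" on the left, "system" on the right). ignoring(mode) removes
# that label from the comparison, so series pair up per core.
node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
/
ignoring(mode)
node_cpu_seconds_total{instance="172.31.33.131:9100", mode="system"}
```

The result should contain one series per core, without the mode label.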

on vs ignoring — which to use? If the two metrics share only a few meaningful labels, use on and list exactly those labels. If they share almost all labels and differ on just one or two, use ignoring and list only the exceptions. The goal is always to make the pairing rule explicit and unambiguous.

Many-to-One and One-to-Many Matching

One-to-one matching requires each series on the left to pair with at most one series on the right, and vice versa. When that is not the case, and one series on one side must match multiple series on the other, we need group_left or group_right. Without one of them, Prometheus rejects the query with an error instead of guessing.

This is called many-to-one or one-to-many matching, depending on which side holds the many.

`group_left` — The left side has many series per match

group_left tells Prometheus that one series on the right will match many series on the left. The result carries the labels of the left side.

The most common use case with Node Exporter: we want to attach a label from a metadata metric to a per-mode or per-device metric. Let's attach the nodename label from node_uname_info to our CPU metrics:

node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
* on(instance, job) group_left(nodename)
node_uname_info{instance="172.31.33.131:9100"}

We'll run this. node_uname_info has one series per host. It carries nodename, release, version, and other system labels, and its value is always 1, so multiplying by it leaves the CPU values unchanged.

node_cpu_seconds_total has one series per CPU core per mode.

The group_left keyword allows the single node_uname_info series to match all the CPU series that share the same instance and job. The result carries all the CPU series labels plus the nodename label we listed in group_left().

We will use this pattern in Grafana to display the hostname alongside metric values without joining on a separate query.
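group_left accepts more than one label. As a sketch, the following variant also copies the release label, which is listed above among the labels of node_uname_info:

```promql
# Sketch: copy two labels from node_uname_info onto every user-mode CPU series.
# node_uname_info has the value 1, so the multiplication leaves values unchanged.
node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}
* on(instance, job) group_left(nodename, release)
node_uname_info{instance="172.31.33.131:9100"}
```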

`group_right` — The right side has many series per match

group_right is the mirror of group_left. The right side has many series and the left side has one. The result carries the labels of the right side.

node_uname_info{instance="172.31.33.131:9100"}
* on(instance, job) group_right(nodename)
node_cpu_seconds_total{instance="172.31.33.131:9100", mode="user"}

This produces the same result as the group_left example above. The choice between group_left and group_right depends only on which side holds the "one" and which holds the "many". If the many series are on the left, use group_left. If they are on the right, use group_right.

Operator Precedence

When we chain multiple operators in a single expression, Prometheus evaluates them in a fixed order. Higher precedence operators evaluate first.

From highest to lowest precedence:

Precedence	Operators
1 (highest)	`^`
2	`*` `/` `%`
3	`+` `-`
4	`==` `!=` `<` `>` `<=` `>=`
5	`and` `unless`
6 (lowest)	`or`

Operators at the same level evaluate left to right, except ^, which evaluates right to left.

Let's see why this matters. Our disk usage percentage query is:

(
  node_filesystem_size_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
  -
  node_filesystem_avail_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
)
/
node_filesystem_size_bytes{instance="172.31.33.131:9100", fstype!~"tmpfs|squashfs|devtmpfs"}
* 100

Without the parentheses around the subtraction, Prometheus would evaluate / and * before - because they have higher precedence. It would divide avail_bytes by size_bytes first, then multiply by 100, and only then subtract that percentage from size_bytes: a meaningless mix of bytes and percent.

Parentheses override precedence. We should use them liberally in any expression with more than two operators. An expression that is correct and hard to read is worse than one that is correct and obvious.
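Precedence and associativity can be checked with plain numbers, since PromQL also evaluates scalar expressions. Each line below is meant to be run as a separate query; the values in the comments follow from the rules in the table.

```promql
# Run each expression as its own query.
2 + 3 * 4      # 14: * evaluates before +
(2 + 3) * 4    # 20: parentheses override precedence
2 ^ 3 ^ 2      # 512: ^ is right-associative, so this is 2 ^ (3 ^ 2)
(2 ^ 3) ^ 2    # 64
```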

Building a Query Step by Step

Let's use everything from this post to build a production query from scratch. We want to know what percentage of each CPU core's time goes to active work, excluding idle and iowait time.

We will build this step by step in the Prometheus UI.

Step 1 — Start with the raw metric and inspect the labels:

node_cpu_seconds_total{instance="172.31.33.131:9100"}

We will see every series — all modes, all cores. We note the label names: cpu, mode, instance, job.

Step 2 — Exclude idle and iowait modes:

node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}

We now see only the active modes: user, system, irq, softirq, steal, nice.

Step 3 — Get total CPU seconds across all modes for each core:
We need the denominator — total CPU time per core regardless of mode. We will cover sum() in detail in the next post, but let's use it briefly here to complete the query:

sum without(mode) (node_cpu_seconds_total{instance="172.31.33.131:9100"})

This collapses all modes into one total per CPU core.

Step 4 — Divide active time by total time and express as a percentage:

sum without(mode) (
  node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}
)
/
sum without(mode) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)
* 100

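We'll run this. We will see one result per CPU core: the percentage of time each core has spent on active work. Because node_cpu_seconds_total is a cumulative counter, this is an average over the whole time since the counter started (normally since boot), so it reacts slowly to load changes. On a host that has been mostly idle, we will see values close to zero. Under a sustained heavy MuleSoft workload, we will see them climb toward 100. To track current utilization, we would apply rate() to both selectors before dividing.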
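For recent utilization rather than an average over the counter's lifetime, a sketch of the same query with rate() applied to both selectors follows. The 5m window is an assumption and should be several times the scrape interval.

```promql
# Sketch: active CPU percentage per core over the last 5 minutes.
# rate() turns each counter into per-second growth over the window.
sum without(mode) (
  rate(node_cpu_seconds_total{instance="172.31.33.131:9100", mode!~"idle|iowait"}[5m])
)
/
sum without(mode) (
  rate(node_cpu_seconds_total{instance="172.31.33.131:9100"}[5m])
)
* 100
```

rate() keeps the labels of its input, so both sides still line up on cpu, instance, and job.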

The division here is a one-to-one match on the cpu, instance, and job labels — both sides carry the same set after the sum without(mode) collapses the mode label. No on or ignoring is needed.
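If we keep the mode label on the left side, the match becomes many-to-one and needs group_left. The following sketch, an extension that is not part of the walkthrough above, shows each mode's share of a core's total time:

```promql
# Sketch: percentage of each core's total time spent in each mode.
# Left: one series per core and mode. Right: one total per core.
# ignoring(mode) pairs them per core; group_left allows many left series
# to share one right series, and the result keeps the left labels,
# including mode.
node_cpu_seconds_total{instance="172.31.33.131:9100"}
/
ignoring(mode) group_left
sum without(mode) (
  node_cpu_seconds_total{instance="172.31.33.131:9100"}
)
* 100
```

For any one core, the per-mode values should add up to 100.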

Vector Matching Quick Reference

Keyword	Purpose
(default)	Match on all labels — one-to-one
`on(l1, l2)`	Match only on the listed labels
`ignoring(l1)`	Match on all labels except the listed ones
`group_left(l1)`	Many left series per right series — carry left labels, optionally import listed labels from right
`group_right(l1)`	Many right series per left series — carry right labels, optionally import listed labels from left

Summary

By default, Prometheus matches two instant vectors one-to-one on all labels. We use on to restrict matching to specific labels and ignoring to exclude specific labels from the match. We use group_left and group_right when one series on one side must match many series on the other. Operator precedence controls evaluation order: multiplication and division evaluate before addition and subtraction, which evaluate before comparisons, which evaluate before logical operators. Parentheses override precedence and we should use them freely to make intent explicit.

In the next post, we will learn about aggregation operators (sum(), avg(), max(), min(), and count()) and how to control their output with the by and without clauses.
