In the previous post, How to install Node Exporter for Prometheus on Ubuntu Server, we ran Node Exporter as a foreground process. That approach works for testing, but it has three hard problems in production:
- No automatic start. If the server reboots, Node Exporter does not restart. We lose metrics visibility until someone manually restarts it.
- No crash recovery. If the process dies, nothing brings it back.
- No process supervision. We have no standard way to start, stop, reload, or check the status of the exporter across environments.
We need a process manager — a system-level tool that owns the lifecycle of Node Exporter. On modern Ubuntu systems, that tool is systemd.
Why systemd Is the Right Choice for Production
systemd is the default init system and service manager on Ubuntu 20.04 and 22.04. Every service we want to run reliably in production — Prometheus, Alertmanager, Mule runtime — should have a systemd unit file.
Here is what systemd gives us:
| Capability | What It Means for Us |
|---|---|
| Auto-start on boot | Node Exporter starts automatically after every reboot |
| Restart on failure | If the process crashes, systemd restarts it automatically |
| Dependency ordering | We can declare that Node Exporter starts after the network is up |
| Centralized logging | All output goes to journald, queryable with journalctl |
| Standard controls | systemctl start, stop, restart, status work the same everywhere |
| Security hardening | systemd unit files support sandboxing and privilege restrictions |
In a Mule observability stack, we need Node Exporter running at all times. A missed scrape interval is a gap in our metrics. A gap can hide the exact moment a Mule worker runs out of memory or a disk fills up. systemd closes that gap.
Production Node Exporter Installation with systemd
This post is the second part of How to install Node Exporter for Prometheus on Ubuntu Server. We assume the Node Exporter binary is already installed at /usr/local/bin/node_exporter and the node_exporter system user exists. If not, complete Steps 1–3 from the previous post first, then return here.
Prerequisites
- Ubuntu 20.04 or 22.04 server
- Node Exporter binary at `/usr/local/bin/node_exporter`
- System user `node_exporter` created (no home directory, no login shell)
- `sudo` privileges
Step 1 — Verify the Binary and User
Before we write the service file, confirm both pieces are in place:
ls -l /usr/local/bin/node_exporter
Expected output: a long listing showing an executable file, for example `-rwxr-xr-x ... /usr/local/bin/node_exporter` (owner, size, and date will vary).
Confirm the user exists:
id node_exporter
Expected output:
uid=1001(node_exporter) gid=1001(node_exporter) groups=1001(node_exporter)
If either check fails, go back to Steps 1–3 of the previous post.
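The two checks above can be combined into a one-shot pre-flight script. This is a sketch that assumes the binary path and user name used throughout this series:

```shell
#!/usr/bin/env bash
# Pre-flight check before writing the unit file.
# Assumes the binary path and user name from this series.
BIN=/usr/local/bin/node_exporter
ok=1

# The binary must exist and be executable.
if [ ! -x "$BIN" ]; then
    echo "missing or not executable: $BIN"
    ok=0
fi

# The dedicated system user must exist.
if ! id node_exporter >/dev/null 2>&1; then
    echo "user node_exporter does not exist"
    ok=0
fi

if [ "$ok" -eq 1 ]; then
    echo "pre-flight OK"
else
    echo "pre-flight FAILED -- complete Steps 1-3 of the previous post first"
fi
```

Run it once before Step 2; it fails loudly instead of letting a half-finished install produce a broken service.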
Step 2 — Create the systemd Unit File
A systemd unit file describes how to run a service. We create one for Node Exporter at /etc/systemd/system/node_exporter.service.
sudo vi /etc/systemd/system/node_exporter.service
Paste the following content:
[Unit]
Description=Prometheus Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/usr/local/bin/node_exporter \
    --web.listen-address=:9100 \
    --collector.disable-defaults \
    --collector.cpu \
    --collector.meminfo \
    --collector.diskstats \
    --collector.filesystem \
    --collector.netdev \
    --collector.loadavg \
    --collector.uname
# Security hardening
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=
PrivateTmp=yes
[Install]
WantedBy=multi-user.target
Save and close the file.
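Before loading the unit, we can ask systemd to lint it. `systemd-analyze verify` reports syntax errors and unknown directives; the guard in this sketch simply skips the check on hosts where systemd is not available:

```shell
# Lint the unit file before loading it (skips cleanly where systemd is absent).
UNIT=/etc/systemd/system/node_exporter.service
if command -v systemd-analyze >/dev/null 2>&1 && [ -f "$UNIT" ]; then
    if systemd-analyze verify "$UNIT"; then
        echo "unit file looks valid"
    else
        echo "unit file has problems; fix them before continuing"
    fi
else
    echo "skipping: systemd-analyze or $UNIT not available"
fi
```

Catching a typo here is much cheaper than debugging a failed start later.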
Understanding the Unit File
Let us walk through the important parts.
[Unit] section
- `Wants=network-online.target` and `After=network-online.target` — Node Exporter needs a network interface before it starts. Prometheus scrapes it over the network. We tell systemd to wait until the network is up before launching the process.
- `Documentation` — a direct link to the Node Exporter docs. This helps the next engineer who reads this file.
[Service] section
- `User=node_exporter` and `Group=node_exporter` — the process runs as the least-privileged user we created. It has no ability to write to sensitive paths or escalate privileges.
- `Type=simple` — systemd treats the process as ready as soon as it starts. This is correct for Node Exporter, which does not fork or use sockets managed by systemd.
- `Restart=on-failure` — if Node Exporter exits with a non-zero code, systemd restarts it automatically.
- `RestartSec=5s` — wait 5 seconds before restarting. This prevents a rapid restart loop if the process is crashing immediately.
- `ExecStart` — the command to launch Node Exporter.
Collector flags in ExecStart
Node Exporter ships with many collectors enabled by default. In production, we enable only what we need. This reduces memory usage, shortens scrape time, and limits attack surface.
| Flag | Metrics It Enables |
|---|---|
| `--collector.cpu` | CPU time per mode per core |
| `--collector.meminfo` | RAM and swap usage |
| `--collector.diskstats` | Disk read/write throughput and IOPS |
| `--collector.filesystem` | Filesystem capacity and usage |
| `--collector.netdev` | Network interface bytes and packets |
| `--collector.loadavg` | 1, 5, and 15-minute load averages |
| `--collector.uname` | OS version and kernel info |
Add or remove collectors based on our monitoring requirements. For a Mule host, these seven collectors cover the metrics that matter most.
Security hardening directives
- `NoNewPrivileges=yes` — the process cannot gain elevated privileges after start, even via setuid binaries.
- `ProtectSystem=strict` — the entire filesystem is read-only from the process's perspective, except paths we explicitly allow.
- `ProtectHome=yes` — the process cannot read /home, /root, or /run/user.
- `PrivateTmp=yes` — the process gets its own isolated /tmp. It cannot read temp files from other processes.
These are standard systemd hardening options. They cost nothing at runtime and substantially reduce the blast radius if Node Exporter is ever exploited.

[Install] section
- `WantedBy=multi-user.target` — when we enable the service, systemd adds it to the normal multi-user boot target. It starts automatically on every reboot.
Step 3 — Reload systemd and Enable the Service
After creating a new unit file, we must reload the systemd daemon so it reads our new file:
sudo systemctl daemon-reload
Enable the service so it starts at boot:
sudo systemctl enable node_exporter
Start it now:
sudo systemctl start node_exporter
Step 4 — Verify the Service Is Running
Check the service status:
sudo systemctl status node_exporter
In the output, confirm:
- `Active: active (running)` — the process is running.
- `enabled` in the `Loaded` line — the service will start at boot.
- `Main PID` is populated — systemd owns the process.
If the status shows failed, read the logs:
sudo journalctl -u node_exporter -n 50 --no-pager
This command pulls the last 50 log lines for the node_exporter unit. journald captures everything the process writes to stdout and stderr.
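Beyond pulling the last N lines, journalctl can filter by time window and by priority. The queries below are ones we reach for most often; the sketch is guarded so it skips on hosts without systemd:

```shell
# Common log queries for the node_exporter unit
# (skips cleanly where journalctl is absent).
jc=$(command -v journalctl || true)
if [ -n "$jc" ]; then
    # Everything logged in the last hour
    journalctl -u node_exporter --since "1 hour ago" --no-pager
    # Only warnings and errors
    journalctl -u node_exporter -p warning --no-pager
else
    echo "journalctl not available"
fi
```

Priority filtering (`-p warning`) is the fastest way to separate real failures from routine startup chatter.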
Step 5 — Test the Metrics Endpoint
Confirm Node Exporter is serving metrics:
curl http://localhost:9100/metrics | head -20
We will see the first 20 lines of the metrics output. Since we used --collector.disable-defaults and enabled only specific collectors, the output is leaner than the default installation.
To count the number of metrics currently exposed:
curl -s http://localhost:9100/metrics | grep -v '^#' | wc -l
A lean production configuration typically exposes between 150 and 300 metrics, compared to 800+ with all defaults enabled.
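To confirm that every collector we enabled is actually reporting, we can probe for each expected metric prefix. This is a sketch that assumes the endpoint configured in this post; on a host where the service is not running, it prints "missing" for every prefix instead of failing:

```shell
# Check that each enabled collector is exporting metrics.
# Assumes Node Exporter listens on localhost:9100 as configured above.
METRICS=$(curl -s http://localhost:9100/metrics 2>/dev/null || true)
for prefix in node_cpu node_memory node_disk node_filesystem \
              node_network node_load node_uname; do
    if printf '%s\n' "$METRICS" | grep -q "^${prefix}"; then
        echo "found:   ${prefix}*"
    else
        echo "missing: ${prefix}* (is the service running?)"
    fi
done
```

The prefixes map one-to-one onto the seven collector flags in our `ExecStart` line, so a "missing" entry points directly at a flag to investigate.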
Step 6 — Test Automatic Restart
Let us prove that systemd restarts Node Exporter after a crash. Find the process ID:
sudo systemctl status node_exporter | grep "Main PID"
Kill the process with a hard signal:
sudo kill -9 <MAIN_PID>
Wait about 6 seconds (our `RestartSec=5s` plus a one-second buffer), then check the status again:
sudo systemctl status node_exporter
The Main PID will be a new number: systemd detected the crash and brought the process back automatically. To see how many times systemd has restarted the unit, query its `NRestarts` property with `systemctl show node_exporter -p NRestarts`.
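Instead of sleeping a fixed six seconds, we can poll until the unit reports active again. A small sketch (it skips on hosts without systemd):

```shell
# Poll until systemd has restarted the unit, for up to ~10 seconds.
state=unknown
if command -v systemctl >/dev/null 2>&1; then
    for i in 1 2 3 4 5 6 7 8 9 10; do
        state=$(systemctl is-active node_exporter 2>/dev/null || true)
        state=${state:-unknown}
        [ "$state" = "active" ] && break
        sleep 1
    done
else
    echo "systemctl not available; skipping"
fi
echo "state after restart window: $state"
```

If the final state is anything other than `active`, the restart logic needs investigation before we rely on it in production.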
Step 7 — Test Startup After Reboot
Enable the service if we have not done so already:
sudo systemctl is-enabled node_exporter
This should return enabled. Now reboot the machine:
sudo reboot
After the machine comes back online, SSH in and check the status:
sudo systemctl status node_exporter
Node Exporter starts automatically with the system. We do not need to start it manually.
Step 8 — Add the Target to Prometheus
This step is identical to Post 3 Step 7. On the Prometheus server, edit /etc/prometheus/prometheus.yml and confirm the node_exporter job is present:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['<NODE_EXPORTER_HOST_IP>:9100']
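Before reloading, it is worth validating the edited file. `promtool`, which ships in the Prometheus release tarball, checks the configuration for syntax and semantic errors; the guard makes this sketch skip where promtool is not installed:

```shell
# Validate the Prometheus configuration before reloading the service.
CFG=/etc/prometheus/prometheus.yml
if command -v promtool >/dev/null 2>&1 && [ -f "$CFG" ]; then
    if promtool check config "$CFG"; then
        echo "config OK -- safe to reload"
    else
        echo "config has errors -- do not reload yet"
    fi
else
    echo "skipping: promtool or $CFG not found"
fi
```

A reload with a broken config is rejected by Prometheus, but catching the error before reloading keeps the failure out of the service logs.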
Reload Prometheus:
sudo systemctl reload prometheus
Open the Prometheus UI at http://<PROMETHEUS_SERVER_IP>:9090/targets and confirm the target shows UP.
Step 9 — Common Operational Commands
Here are the commands we will use most often when managing Node Exporter in production:
# Check current status
sudo systemctl status node_exporter
# Stop the service
sudo systemctl stop node_exporter
# Restart the service (after a config change)
sudo systemctl restart node_exporter
# View live logs
sudo journalctl -u node_exporter -f
# View logs from the last boot
sudo journalctl -u node_exporter -b
# Disable the service (prevents start at boot)
sudo systemctl disable node_exporter
All of these commands work the same way for Prometheus, Alertmanager, and every other systemd-managed service in our stack. We use one consistent interface for the entire observability platform.
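The same commands compose into a minimal health probe we can run from cron or a CI pipeline. A sketch using the unit name and port from this post; it degrades to "unhealthy" rather than erroring on hosts without systemd or curl:

```shell
# Minimal health probe: service active AND metrics endpoint answering.
unit=node_exporter
url=http://localhost:9100/metrics

svc=$(systemctl is-active "$unit" 2>/dev/null || true)
svc=${svc:-unknown}
http=$(curl -s -o /dev/null -w '%{http_code}' "$url" 2>/dev/null || echo 000)

if [ "$svc" = "active" ] && [ "$http" = "200" ]; then
    echo "healthy: service=$svc http=$http"
else
    echo "unhealthy: service=$svc http=$http"
fi
```

Checking both the unit state and the HTTP endpoint matters: a process can be running yet wedged, and only the endpoint check catches that.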
Summary
We turned a simple binary process into a production-grade service. Our Node Exporter now:
- Starts automatically on every boot
- Restarts automatically after any failure
- Runs as an unprivileged user with a hardened filesystem sandbox
- Exposes only the collectors our Mule monitoring stack needs
- Logs to `journald` with standard tooling for inspection
This is the right baseline for any exporter we deploy in a Mule production environment. In the next post, we will use the data Node Exporter provides to build our first Prometheus alerting rules for host-level resource saturation.