FreeBSD System Monitoring with Prometheus: Complete Review
Prometheus is an open-source monitoring and alerting toolkit originally built at SoundCloud in 2012 and now a graduated project of the Cloud Native Computing Foundation (CNCF). It uses a pull-based model to scrape metrics from instrumented targets at defined intervals, stores them in a local time-series database, and provides a powerful query language called PromQL for analysis. For FreeBSD administrators, Prometheus is one of the best options for monitoring system health, ZFS pool status, network throughput, and service availability across fleets of servers.
This review covers the Prometheus architecture, installation on FreeBSD via pkg, configuring node_exporter with FreeBSD-specific and ZFS collectors, writing PromQL queries, setting up alerting rules, and comparing Prometheus with Zabbix and Netdata.
Prometheus Architecture
Prometheus follows a simple but effective architecture. Understanding its components is essential before installation.
Prometheus Server is the core. It scrapes metrics from configured targets over HTTP, stores them in a time-series database on local disk, evaluates alerting rules, and serves the PromQL API. It does not rely on external databases -- everything runs from a single binary with local storage.
Exporters are lightweight agents that expose metrics in Prometheus format on HTTP endpoints. The most important for FreeBSD is node_exporter, which exposes hardware and OS-level metrics. Other exporters exist for databases (postgres_exporter, mysqld_exporter), web servers (nginx-prometheus-exporter), and hundreds of other services.
Alertmanager receives firing alerts from Prometheus and routes them to notification channels -- email, Slack, PagerDuty, webhooks, or custom integrations. It handles deduplication, grouping, silencing, and inhibition.
Pushgateway is an optional component for short-lived jobs that cannot be scraped. The job pushes metrics to the Pushgateway, and Prometheus scrapes the gateway. Use this sparingly; the pull model is the default and preferred approach.
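A push from a short-lived job is just plain Prometheus text exposition POSTed to the gateway under a job (and optionally instance) label. A minimal sketch, assuming a Pushgateway at the hypothetical host pushgateway.example.com and illustrative metric names:

```sh
# Build a text-exposition payload for a one-shot backup job.
# Metric names and the Pushgateway host are assumptions, not from this setup.
payload="backup_last_success_timestamp_seconds $(date +%s)
backup_duration_seconds 42"
echo "$payload"

# Push it (uncomment once a Pushgateway is actually reachable):
# echo "$payload" | curl --data-binary @- \
#     http://pushgateway.example.com:9091/metrics/job/nightly_backup/instance/$(hostname)
```

Remember that pushed metrics persist in the gateway until deleted, which is one reason the pull model remains preferable for anything long-lived.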
Service Discovery allows Prometheus to find targets dynamically through DNS, Consul, file-based discovery, or cloud provider APIs. For static FreeBSD infrastructure, file-based discovery or static configuration in prometheus.yml is typical.
The data flow is straightforward: exporters expose /metrics endpoints, Prometheus scrapes them on a schedule, stores the data, evaluates rules, and fires alerts to Alertmanager when thresholds are breached.
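What Prometheus scrapes is plain text in the Prometheus exposition format: `# HELP` and `# TYPE` comment lines followed by samples. A response from a node_exporter `/metrics` endpoint looks roughly like this (values are illustrative):

```text
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_network_receive_bytes_total Network device statistic receive_bytes.
# TYPE node_network_receive_bytes_total counter
node_network_receive_bytes_total{device="em0"} 1.234567e+09
```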
Installing Prometheus on FreeBSD
Prometheus Server
```sh
pkg install prometheus
```
This installs the Prometheus server binary at /usr/local/bin/prometheus and the default configuration at /usr/local/etc/prometheus.yml.
Enable and start the service:
```sh
sysrc prometheus_enable="YES"
service prometheus start
```
Prometheus listens on port 9090 by default. Verify it is running:
```sh
fetch -qo - http://localhost:9090/-/healthy
```
You should see the response "Prometheus Server is Healthy."
Node Exporter
```sh
pkg install node_exporter
sysrc node_exporter_enable="YES"
service node_exporter start
```
Node exporter listens on port 9100. Verify metrics are being exposed:
```sh
fetch -qo - http://localhost:9100/metrics | head -20
```
Alertmanager
```sh
pkg install alertmanager
sysrc alertmanager_enable="YES"
service alertmanager start
```
Alertmanager listens on port 9093.
Configuring Prometheus
The main configuration file is /usr/local/etc/prometheus.yml. Here is a production-oriented starting configuration:
```yaml
# /usr/local/etc/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"

rule_files:
  - "/usr/local/etc/prometheus/rules/*.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "freebsd-nodes"
    static_configs:
      - targets:
          - "localhost:9100"
          - "server2.example.com:9100"
          - "server3.example.com:9100"
    scrape_interval: 30s

  - job_name: "freebsd-nodes-file"
    file_sd_configs:
      - files:
          - "/usr/local/etc/prometheus/targets/*.json"
        refresh_interval: 5m
```
File-based service discovery uses JSON files like this:
```json
[
  {
    "targets": ["db1.example.com:9100", "db2.example.com:9100"],
    "labels": {
      "role": "database",
      "datacenter": "us-east"
    }
  }
]
```
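These target files can also be generated from an inventory rather than edited by hand, and Prometheus picks up changes without a reload. A minimal sketch with a hypothetical hard-coded host list:

```sh
# Emit a file_sd stanza for a fixed host list (hypothetical inventory).
hosts="db1.example.com db2.example.com"
out=$(mktemp)
{
    printf '[\n  {\n    "targets": ['
    sep=""
    for h in $hosts; do
        printf '%s"%s:9100"' "$sep" "$h"
        sep=", "
    done
    printf '],\n    "labels": { "role": "database", "datacenter": "us-east" }\n  }\n]\n'
} > "$out"
cat "$out"

# Move into place atomically so Prometheus never reads a half-written file:
# mv "$out" /usr/local/etc/prometheus/targets/freebsd-servers.json
```

Writing to a temporary file and renaming it matters here: file_sd re-reads the file on its refresh interval, and a partially written JSON file would be rejected.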
After editing the config, validate and reload:
```sh
promtool check config /usr/local/etc/prometheus.yml
service prometheus reload
```
Node Exporter with ZFS Collectors
The default node_exporter on FreeBSD enables a set of OS-specific collectors. To get full ZFS visibility, you need to confirm the right collectors are active.
Enabling ZFS Collectors
On FreeBSD, pass collector flags via sysrc:
```sh
sysrc node_exporter_args="--collector.zfs --collector.cpu --collector.diskstats \
  --collector.filesystem --collector.loadavg --collector.meminfo \
  --collector.netdev --collector.uname --web.listen-address=:9100"
service node_exporter restart
```
Key ZFS Metrics
With the ZFS collector enabled, node_exporter exposes metrics including:
- `node_zfs_arc_size` -- current ARC size in bytes
- `node_zfs_arc_hits_total` -- ARC cache hit count
- `node_zfs_arc_misses_total` -- ARC cache miss count
- `node_zfs_arc_mfu_size` -- MFU (Most Frequently Used) cache size
- `node_zfs_arc_mru_size` -- MRU (Most Recently Used) cache size
- `node_zfs_abd_linear_cnt` -- ABD linear buffer count
- `node_zfs_abd_scatter_cnt` -- ABD scatter buffer count
For zpool-level metrics, you may need a dedicated ZFS exporter or a textfile collector. One effective approach is a cron job that converts zpool list output into Prometheus text format:
```sh
#!/bin/sh
# /usr/local/etc/prometheus/textfile/zpool_status.sh
OUTPUT="/var/tmp/node_exporter/zpool.prom"
mkdir -p /var/tmp/node_exporter

# Select columns explicitly so the read list matches the output; -Hp gives
# script-friendly, tab-separated, exact numbers. Note: $'\t' is a bashism
# and does not work in /bin/sh; default IFS splitting is sufficient here.
zpool list -Hp -o name,size,alloc,free,frag,cap,health | \
while read -r name size alloc free frag cap health; do
    health_val=0
    [ "$health" = "ONLINE" ] && health_val=1
    echo "zpool_health{pool=\"$name\"} $health_val"
    echo "zpool_size_bytes{pool=\"$name\"} $size"
    echo "zpool_allocated_bytes{pool=\"$name\"} $alloc"
    echo "zpool_free_bytes{pool=\"$name\"} $free"
    echo "zpool_fragmentation_percent{pool=\"$name\"} $frag"
done > "$OUTPUT.tmp"
mv "$OUTPUT.tmp" "$OUTPUT"
```
Add the textfile collector path to node_exporter:
```sh
sysrc node_exporter_args="--collector.zfs --collector.textfile \
  --collector.textfile.directory=/var/tmp/node_exporter --web.listen-address=:9100"
service node_exporter restart
```
Schedule the script:
```
# Add to /etc/crontab
*/2  *  *  *  *  root  /usr/local/etc/prometheus/textfile/zpool_status.sh
```
PromQL: Querying FreeBSD Metrics
PromQL is Prometheus's query language. It is functional, expressive, and central to building dashboards and alerts. Here are queries relevant to FreeBSD monitoring.
CPU Usage
```promql
# Average CPU utilization per host (all modes except idle)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```
Memory
```promql
# Memory utilization percentage
# (the FreeBSD meminfo collector exposes total memory as node_memory_size_bytes)
(1 - (node_memory_free_bytes + node_memory_inactive_bytes) / node_memory_size_bytes) * 100
```
ZFS ARC Hit Rate
```promql
# ARC hit ratio over 5 minutes, as a percentage
rate(node_zfs_arc_hits_total[5m])
  / (rate(node_zfs_arc_hits_total[5m]) + rate(node_zfs_arc_misses_total[5m])) * 100
```
Disk I/O
```promql
# Read throughput per device, bytes/second
rate(node_disk_read_bytes_total[5m])
```
Network Traffic
```promql
# Inbound traffic per interface in Mbps (excluding loopback)
rate(node_network_receive_bytes_total{device!="lo0"}[5m]) * 8 / 1000000
```
Filesystem Usage
```promql
# Filesystems with less than 20% available (i.e., more than 80% used)
node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.2
```
Uptime
```promql
# System uptime in days
(time() - node_boot_time_seconds) / 86400
```
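Expressions you dashboard or alert on frequently, such as the ARC hit ratio above, can be precomputed with recording rules in the same rule_files directory. A sketch, with illustrative file and rule names:

```yaml
# /usr/local/etc/prometheus/rules/recording.yml (illustrative)
groups:
  - name: freebsd-recording
    interval: 1m
    rules:
      - record: instance:zfs_arc_hit_ratio:rate5m
        expr: >
          rate(node_zfs_arc_hits_total[5m])
          / (rate(node_zfs_arc_hits_total[5m]) + rate(node_zfs_arc_misses_total[5m]))
```

Dashboards and alerts can then query `instance:zfs_arc_hit_ratio:rate5m` directly instead of re-evaluating the full expression on every refresh.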
Alerting Rules
Create alerting rules in /usr/local/etc/prometheus/rules/freebsd.yml:
```yaml
# /usr/local/etc/prometheus/rules/freebsd.yml
groups:
  - name: freebsd-system
    rules:
      - alert: HighCPU
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 90% for more than 10 minutes."

      - alert: HighMemory
        expr: (1 - (node_memory_free_bytes + node_memory_inactive_bytes) / node_memory_size_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: ZpoolDegraded
        expr: zpool_health == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "ZFS pool {{ $labels.pool }} is degraded on {{ $labels.instance }}"

      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Less than 10% disk space on {{ $labels.instance }} mount {{ $labels.mountpoint }}"

      - alert: InstanceDown
        expr: up == 0
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is unreachable"
```
Validate the rules:
```sh
promtool check rules /usr/local/etc/prometheus/rules/freebsd.yml
service prometheus reload
```
Alertmanager Configuration
Configure Alertmanager at /usr/local/etc/alertmanager/alertmanager.yml:
```yaml
# /usr/local/etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: "mail.example.com:587"
  smtp_from: "prometheus@example.com"
  smtp_auth_username: "prometheus@example.com"
  smtp_auth_password: "secret"

route:
  group_by: ["alertname", "instance"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: "email-team"
  routes:
    - match:
        severity: critical
      receiver: "email-oncall"

receivers:
  - name: "email-team"
    email_configs:
      - to: "team@example.com"
  - name: "email-oncall"
    email_configs:
      - to: "oncall@example.com"
```
Reload after changes:
```sh
service alertmanager reload
```
Storage and Retention
Prometheus stores data on local disk using its custom TSDB format. Default retention is 15 days. Adjust with startup flags:
```sh
sysrc prometheus_args="--storage.tsdb.retention.time=90d --storage.tsdb.path=/var/db/prometheus"
service prometheus restart
```
On FreeBSD with ZFS, place the Prometheus data directory on a dedicated dataset with appropriate recordsize:
```sh
zfs create -o recordsize=1M -o compression=lz4 -o atime=off zroot/prometheus
```
The 1M recordsize aligns well with TSDB block files. Compression with lz4 is beneficial since metric data compresses well.
For long-term storage beyond what local disk provides, consider Thanos or Cortex as remote storage backends. Both integrate with Prometheus's remote write API.
Security Considerations
Prometheus has no built-in authentication. On FreeBSD, secure it with PF rules and reverse proxy authentication.
Restrict access at the firewall level:
```
# In /etc/pf.conf
# Block the Prometheus ports by default, then allow only the monitoring
# network. pf uses last-match semantics, so the more specific pass rule
# must come after the block rule.
block in on egress proto tcp to any port { 9090 9093 9100 }
pass in on egress proto tcp from 10.0.1.0/24 to any port { 9090 9093 9100 }
```
For web authentication, place nginx in front of Prometheus with basic auth:
```sh
pkg install nginx
```
Configure a reverse proxy with authentication in /usr/local/etc/nginx/nginx.conf:
```nginx
server {
    listen 443 ssl;
    server_name prometheus.example.com;

    ssl_certificate     /usr/local/etc/ssl/prometheus.crt;
    ssl_certificate_key /usr/local/etc/ssl/prometheus.key;

    auth_basic           "Prometheus";
    auth_basic_user_file /usr/local/etc/nginx/.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:9090;
    }
}
```
Prometheus also supports native TLS and basic auth (since version 2.24) via a web configuration file:
```yaml
# /usr/local/etc/prometheus/web.yml
tls_server_config:
  cert_file: /usr/local/etc/ssl/prometheus.crt
  key_file: /usr/local/etc/ssl/prometheus.key

basic_auth_users:
  admin: $2y$10$hashed_bcrypt_password_here
```
Enable it:
```sh
sysrc prometheus_args="--web.config.file=/usr/local/etc/prometheus/web.yml --storage.tsdb.retention.time=90d"
service prometheus restart
```
Prometheus vs Zabbix vs Netdata
Zabbix
Zabbix is a traditional enterprise monitoring platform with an agent-based architecture, SQL database backend (PostgreSQL or MySQL), and a PHP web frontend. It has been around since 2001 and is widely used in enterprise environments.
Architecture: Zabbix uses a push model where agents send data to a central server. This is fundamentally different from Prometheus's pull model. Zabbix requires a relational database, which adds operational complexity but supports historical data natively.
Strengths over Prometheus: Zabbix has better out-of-box auto-discovery, built-in authentication and RBAC, native agent-based monitoring without external exporters, and excellent template support for network equipment (SNMP). Its web UI is self-contained.
Weaknesses: Zabbix is heavier to run, its query language is limited compared to PromQL, scaling requires database optimization, and it does not integrate naturally with the Grafana/cloud-native ecosystem. On FreeBSD, the Zabbix server package exists but the web frontend requires a full PHP/Apache or nginx stack.
Netdata
Netdata is a real-time monitoring agent that emphasizes per-second granularity and zero-configuration auto-detection of metrics. It runs a web dashboard on each monitored node.
Architecture: Netdata runs an agent per node that collects metrics every second and serves a local dashboard. Netdata Cloud provides a centralized view. It is designed for real-time troubleshooting rather than historical analysis.
Strengths over Prometheus: Netdata provides per-second resolution out of the box, auto-detects most services without configuration, and has a polished built-in dashboard. Installation on FreeBSD is straightforward with pkg install netdata.
Weaknesses: Netdata's per-second collection consumes more CPU and memory. Its long-term storage capabilities are limited compared to Prometheus TSDB. It lacks the mature alerting pipeline of Prometheus + Alertmanager, and PromQL has no equivalent in Netdata.
When to Use Each
- Prometheus: Best for infrastructure with many nodes, when you need PromQL for complex queries, when integrating with Grafana, or when running Kubernetes alongside FreeBSD.
- Zabbix: Best for traditional enterprise environments with SNMP devices, when built-in RBAC is required, or when templates for network equipment are needed.
- Netdata: Best for real-time troubleshooting on individual servers, quick deployment without configuration, or when per-second visibility is critical.
For most FreeBSD server fleets, Prometheus paired with Grafana offers the best balance of flexibility, query power, and ecosystem integration.
FAQ
How much disk space does Prometheus use on FreeBSD?
Prometheus uses roughly 1-2 bytes per sample in its TSDB. A typical FreeBSD server with node_exporter generates around 500-1000 time series. At a 15-second scrape interval, that is about 5-10 MB per day per server. For 100 servers over 90 days, expect approximately 50-90 GB of storage.
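The arithmetic behind that estimate can be sketched as follows, using the upper end of the 1-2 bytes-per-sample figure (actual usage depends on series churn and how well the data compresses):

```sh
# Rough TSDB sizing: servers * series * samples/day * bytes/sample * days
servers=100
series_per_server=1000
scrape_interval=15          # seconds
bytes_per_sample=2          # upper end of the 1-2 byte estimate
days=90

samples_per_day=$(( 86400 / scrape_interval ))
total_bytes=$(( servers * series_per_server * samples_per_day * bytes_per_sample * days ))
echo "$(( total_bytes / 1024 / 1024 / 1024 )) GiB"
# prints: 96 GiB
```

To check the real number for a running server, query its actual ingestion rate with `rate(prometheus_tsdb_head_samples_appended_total[5m])`.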
Can Prometheus monitor FreeBSD jails individually?
Yes. Run node_exporter inside each jail with a unique port, or use a single node_exporter on the host with labels to distinguish jails. For jail-specific metrics, the textfile collector approach works well -- run a script inside the jail that writes metrics to a shared directory.
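A minimal sketch of such an in-jail script follows. The metric names and the shared directory are assumptions; the directory would typically be nullfs-mounted into the jail and pointed at by the host node_exporter's --collector.textfile.directory flag:

```sh
#!/bin/sh
# Write per-jail metrics to a directory shared with the host's node_exporter
# textfile collector. TEXTFILE_DIR and all metric names here are illustrative.
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/tmp/node_exporter}"
OUT="$TEXTFILE_DIR/jail_$(hostname).prom"
mkdir -p "$TEXTFILE_DIR"

{
    echo "# HELP jail_processes Number of processes visible in this jail"
    echo "# TYPE jail_processes gauge"
    echo "jail_processes $(ps ax | tail -n +2 | wc -l | tr -d ' ')"
    echo "# HELP jail_exporter_last_run_seconds Unix time this script last ran"
    echo "# TYPE jail_exporter_last_run_seconds gauge"
    echo "jail_exporter_last_run_seconds $(date +%s)"
} > "$OUT.tmp"
mv "$OUT.tmp" "$OUT"
```

As with the zpool script, writing to a temporary file and renaming keeps the collector from ever scraping a half-written file.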
Does node_exporter work with ZFS on FreeBSD?
Yes. The ZFS collector is available on FreeBSD builds of node_exporter. It exposes ARC statistics, L2ARC metrics, and dataset-level information. For zpool health and capacity, supplement with a textfile collector script as shown above.
How do I upgrade Prometheus on FreeBSD?
```sh
pkg upgrade prometheus node_exporter alertmanager
service prometheus restart
service node_exporter restart
service alertmanager restart
```
Prometheus supports backward-compatible TSDB storage, so upgrades rarely require data migration.
Can Prometheus scrape targets behind a NAT or firewall?
Prometheus requires network access to each target's /metrics endpoint. If targets are behind NAT, options include: running Prometheus inside the private network, using a VPN or SSH tunnel, or deploying Pushgateway for targets that cannot be scraped directly. On FreeBSD, a WireGuard tunnel between Prometheus and remote targets is a clean solution.
How do I back up Prometheus data on FreeBSD?
Use the Prometheus snapshot API, which requires starting Prometheus with the --web.enable-admin-api flag:
```sh
curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot
```
This creates a snapshot in the data directory. On ZFS, you can also use ZFS snapshots for instant, consistent backups:
```sh
zfs snapshot zroot/prometheus@backup-$(date +%Y%m%d)
```
Is Prometheus suitable for monitoring 1000+ FreeBSD servers?
Yes, but at that scale, consider federation or Thanos. A single Prometheus instance can handle several hundred targets with default scrape intervals. For larger deployments, use hierarchical federation where regional Prometheus servers aggregate into a global instance, or deploy Thanos for unified querying across multiple Prometheus instances.