FreeBSD System Monitoring with Prometheus: Complete Review
Prometheus is an open-source monitoring and alerting toolkit originally built at SoundCloud in 2012 and now a graduated project of the Cloud Native Computing Foundation (CNCF). It uses a pull-based model to scrape metrics from instrumented targets at defined intervals, stores them in a local time-series database, and provides a powerful query language called PromQL for analysis. For FreeBSD administrators, Prometheus is one of the best options for monitoring system health, ZFS pool status, network throughput, and service availability across fleets of servers.
This review covers the Prometheus architecture, installation on FreeBSD via pkg, configuring node_exporter with FreeBSD-specific and ZFS collectors, writing PromQL queries, setting up alerting rules, and comparing Prometheus with Zabbix and Netdata.
Prometheus Architecture
Prometheus follows a simple but effective architecture. Understanding its components is essential before installation.
Prometheus Server is the core. It scrapes metrics from configured targets over HTTP, stores them in a time-series database on local disk, evaluates alerting rules, and serves the PromQL API. It does not rely on external databases -- everything runs from a single binary with local storage.
Exporters are lightweight agents that expose metrics in Prometheus format on HTTP endpoints. The most important for FreeBSD is node_exporter, which exposes hardware and OS-level metrics. Other exporters exist for databases (postgres_exporter, mysqld_exporter), web servers (nginx-prometheus-exporter), and hundreds of other services.
Alertmanager receives firing alerts from Prometheus and routes them to notification channels -- email, Slack, PagerDuty, webhooks, or custom integrations. It handles deduplication, grouping, silencing, and inhibition.
Pushgateway is an optional component for short-lived jobs that cannot be scraped. The job pushes metrics to the Pushgateway, and Prometheus scrapes the gateway. Use this sparingly; the pull model is the default and preferred approach.
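A push from a short-lived job is just plain Prometheus text exposition POSTed to the gateway under a job (and optionally instance) label. A minimal sketch, assuming a Pushgateway at the hypothetical host pushgateway.example.com and illustrative metric names:

```sh
# Build a text-exposition payload for a one-shot backup job.
# Metric names and the Pushgateway host are assumptions, not from this setup.
payload="backup_last_success_timestamp_seconds $(date +%s)
backup_duration_seconds 42"
echo "$payload"

# Push it (uncomment once a Pushgateway is actually reachable):
# echo "$payload" | curl --data-binary @- \
#     http://pushgateway.example.com:9091/metrics/job/nightly_backup/instance/$(hostname)
```

Remember that pushed metrics persist in the gateway until deleted, which is one reason the pull model remains preferable for anything long-lived.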
Service Discovery allows Prometheus to find targets dynamically through DNS, Consul, file-based discovery, or cloud provider APIs. For static FreeBSD infrastructure, file-based discovery or static configuration in prometheus.yml is typical.
The data flow is straightforward: exporters expose /metrics endpoints, Prometheus scrapes them on a schedule, stores the data, evaluates rules, and fires alerts to Alertmanager when thresholds are breached.
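What Prometheus scrapes is plain text in the Prometheus exposition format: `# HELP` and `# TYPE` comment lines followed by samples. A response from a node_exporter `/metrics` endpoint looks roughly like this (values are illustrative):

```text
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_network_receive_bytes_total Network device statistic receive_bytes.
# TYPE node_network_receive_bytes_total counter
node_network_receive_bytes_total{device="em0"} 1.234567e+09
```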
Installing Prometheus on FreeBSD
Prometheus Server
```sh
pkg install prometheus
```
This installs the Prometheus server binary at /usr/local/bin/prometheus and the default configuration at /usr/local/etc/prometheus.yml.
Enable and start the service:
```sh
sysrc prometheus_enable="YES"
service prometheus start
```
Prometheus listens on port 9090 by default. Verify it is running:
```sh
fetch -qo - http://localhost:9090/-/healthy
```
You should see the response "Prometheus Server is Healthy."
Node Exporter
```sh
pkg install node_exporter
sysrc node_exporter_enable="YES"
service node_exporter start
```
Node exporter listens on port 9100. Verify metrics are being exposed:
```sh
fetch -qo - http://localhost:9100/metrics | head -20
```
Alertmanager
```sh
pkg install alertmanager
sysrc alertmanager_enable="YES"
service alertmanager start
```
Alertmanager listens on port 9093.
Configuring Prometheus
The main configuration file is /usr/local/etc/prometheus.yml. Here is a production-oriented starting configuration:
```yaml
# /usr/local/etc/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"

rule_files:
  - "/usr/local/etc/prometheus/rules/*.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "freebsd-nodes"
    static_configs:
      - targets:
          - "localhost:9100"
          - "server2.example.com:9100"
          - "server3.example.com:9100"
    scrape_interval: 30s

  - job_name: "freebsd-nodes-file"
    file_sd_configs:
      - files:
          - "/usr/local/etc/prometheus/targets/*.json"
        refresh_interval: 5m
```
File-based service discovery uses JSON files like this:
```json
[
  {
    "targets": ["db1.example.com:9100", "db2.example.com:9100"],
    "labels": {
      "role": "database",
      "datacenter": "us-east"
    }
  }
]
```
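These target files can also be generated from an inventory rather than edited by hand, and Prometheus picks up changes without a reload. A minimal sketch with a hypothetical hard-coded host list:

```sh
# Emit a file_sd stanza for a fixed host list (hypothetical inventory).
hosts="db1.example.com db2.example.com"
out=$(mktemp)
{
    printf '[\n  {\n    "targets": ['
    sep=""
    for h in $hosts; do
        printf '%s"%s:9100"' "$sep" "$h"
        sep=", "
    done
    printf '],\n    "labels": { "role": "database", "datacenter": "us-east" }\n  }\n]\n'
} > "$out"
cat "$out"

# Move into place atomically so Prometheus never reads a half-written file:
# mv "$out" /usr/local/etc/prometheus/targets/freebsd-servers.json
```

Writing to a temporary file and renaming it matters here: file_sd re-reads the file on its refresh interval, and a partially written JSON file would be rejected.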
After editing the config, validate and reload:
```sh
promtool check config /usr/local/etc/prometheus.yml
service prometheus reload
```
Node Exporter with ZFS Collectors
The default node_exporter on FreeBSD enables a set of OS-specific collectors. To get full ZFS visibility, you need to confirm the right collectors are active.
Enabling ZFS Collectors
On FreeBSD, pass collector flags via sysrc:
```sh
sysrc node_exporter_args="--collector.zfs --collector.cpu --collector.diskstats \
  --collector.filesystem --collector.loadavg --collector.meminfo \
  --collector.netdev --collector.uname --web.listen-address=:9100"
service node_exporter restart
```
Key ZFS Metrics
With the ZFS collector enabled, node_exporter exposes metrics including:
- `node_zfs_arc_size` -- current ARC size in bytes
- `node_zfs_arc_hits_total` -- ARC cache hit count
- `node_zfs_arc_misses_total` -- ARC cache miss count
- `node_zfs_arc_mfu_size` -- MFU (Most Frequently Used) cache size
- `node_zfs_arc_mru_size` -- MRU (Most Recently Used) cache size
- `node_zfs_abd_linear_cnt` -- ABD linear buffer count
- `node_zfs_abd_scatter_cnt` -- ABD scatter buffer count
For zpool-level metrics, you may need a dedicated ZFS exporter or a textfile collector. One effective approach is a cron job that converts zpool list output into Prometheus text format:
```sh
#!/bin/sh
# /usr/local/etc/prometheus/textfile/zpool_status.sh
OUTPUT="/var/tmp/node_exporter/zpool.prom"
mkdir -p /var/tmp/node_exporter

# Select columns explicitly so the read list matches the output; -Hp gives
# script-friendly, tab-separated, exact numbers. Note: $'\t' is a bashism
# and does not work in /bin/sh; default IFS splitting is sufficient here.
zpool list -Hp -o name,size,alloc,free,frag,cap,health | \
while read -r name size alloc free frag cap health; do
    health_val=0
    [ "$health" = "ONLINE" ] && health_val=1
    echo "zpool_health{pool=\"$name\"} $health_val"
    echo "zpool_size_bytes{pool=\"$name\"} $size"
    echo "zpool_allocated_bytes{pool=\"$name\"} $alloc"
    echo "zpool_free_bytes{pool=\"$name\"} $free"
    echo "zpool_fragmentation_percent{pool=\"$name\"} $frag"
done > "$OUTPUT.tmp"
mv "$OUTPUT.tmp" "$OUTPUT"
```
Add the textfile collector path to node_exporter:
```sh
sysrc node_exporter_args="--collector.zfs --collector.textfile \
  --collector.textfile.directory=/var/tmp/node_exporter --web.listen-address=:9100"
service node_exporter restart
```
Schedule the script:
```
# Add to /etc/crontab
*/2  *  *  *  *  root  /usr/local/etc/prometheus/textfile/zpool_status.sh
```
PromQL: Querying FreeBSD Metrics
PromQL is Prometheus's query language. It is functional, expressive, and central to building dashboards and alerts. Here are queries relevant to FreeBSD monitoring.
CPU Usage
```promql
# Average CPU utilization per host (all modes except idle)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```
Memory
```promql
# Memory utilization percentage
# (the FreeBSD meminfo collector exposes total memory as node_memory_size_bytes)
(1 - (node_memory_free_bytes + node_memory_inactive_bytes) / node_memory_size_bytes) * 100
```
ZFS ARC Hit Rate
```promql
# ARC hit ratio over 5 minutes, as a percentage
rate(node_zfs_arc_hits_total[5m])
  / (rate(node_zfs_arc_hits_total[5m]) + rate(node_zfs_arc_misses_total[5m])) * 100
```
Disk I/O
```promql
# Read throughput per device, bytes/second
rate(node_disk_read_bytes_total[5m])
```
Network Traffic
```promql
# Inbound traffic per interface in Mbps (excluding loopback)
rate(node_network_receive_bytes_total{device!="lo0"}[5m]) * 8 / 1000000
```
Filesystem Usage
```promql
# Filesystems with less than 20% available (i.e., more than 80% used)
node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.2
```
Uptime
```promql
# System uptime in days
(time() - node_boot_time_seconds) / 86400
```
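Expressions you dashboard or alert on frequently, such as the ARC hit ratio above, can be precomputed with recording rules in the same rule_files directory. A sketch, with illustrative file and rule names:

```yaml
# /usr/local/etc/prometheus/rules/recording.yml (illustrative)
groups:
  - name: freebsd-recording
    interval: 1m
    rules:
      - record: instance:zfs_arc_hit_ratio:rate5m
        expr: >
          rate(node_zfs_arc_hits_total[5m])
          / (rate(node_zfs_arc_hits_total[5m]) + rate(node_zfs_arc_misses_total[5m]))
```

Dashboards and alerts can then query `instance:zfs_arc_hit_ratio:rate5m` directly instead of re-evaluating the full expression on every refresh.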
Alerting Rules
Create alerting rules in /usr/local/etc/prometheus/rules/freebsd.yml:
```yaml
# /usr/local/etc/prometheus/rules/freebsd.yml
groups:
  - name: freebsd-system
    rules:
      - alert: HighCPU
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 90% for more than 10 minutes."

      - alert: HighMemory
        expr: (1 - (node_memory_free_bytes + node_memory_inactive_bytes) / node_memory_size_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: ZpoolDegraded
        expr: zpool_health == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "ZFS pool {{ $labels.pool }} is degraded on {{ $labels.instance }}"

      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Less than 10% disk space on {{ $labels.instance }} mount {{ $labels.mountpoint }}"

      - alert: InstanceDown
        expr: up == 0
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is unreachable"
```
Validate the rules:
```sh
promtool check rules /usr/local/etc/prometheus/rules/freebsd.yml
service prometheus reload
```
Alertmanager Configuration
Configure Alertmanager at /usr/local/etc/alertmanager/alertmanager.yml:
```yaml
# /usr/local/etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: "mail.example.com:587"
  smtp_from: "prometheus@example.com"
  smtp_auth_username: "prometheus@example.com"
  smtp_auth_password: "secret"

route:
  group_by: ["alertname", "instance"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: "email-team"
  routes:
    - match:
        severity: critical
      receiver: "email-oncall"

receivers:
  - name: "email-team"
    email_configs:
      - to: "team@example.com"
  - name: "email-oncall"
    email_configs:
      - to: "oncall@example.com"
```
Reload after changes:
```sh
service alertmanager reload
```
Storage and Retention
Prometheus stores data on local disk using its custom TSDB format. Default retention is 15 days. Adjust with startup flags:
```sh
sysrc prometheus_args="--storage.tsdb.retention.time=90d --storage.tsdb.path=/var/db/prometheus"
service prometheus restart
```
On FreeBSD with ZFS, place the Prometheus data directory on a dedicated dataset with appropriate recordsize:
```sh
zfs create -o recordsize=1M -o compression=lz4 -o atime=off zroot/prometheus
```
The 1M recordsize aligns well with TSDB block files. Compression with lz4 is beneficial since metric data compresses well.
For long-term storage beyond what local disk provides, consider Thanos or Cortex as remote storage backends. Both integrate with Prometheus's remote write API.
Security Considerations
Prometheus has no built-in authentication. On FreeBSD, secure it with PF rules and reverse proxy authentication.
Restrict access at the firewall level:
```
# In /etc/pf.conf
# Block the Prometheus ports by default, then allow only the monitoring
# network. pf uses last-match semantics, so the more specific pass rule
# must come after the block rule.
block in on egress proto tcp to any port { 9090 9093 9100 }
pass in on egress proto tcp from 10.0.1.0/24 to any port { 9090 9093 9100 }
```
For web authentication, place nginx in front of Prometheus with basic auth:
```sh
pkg install nginx
```
Configure a reverse proxy with authentication in /usr/local/etc/nginx/nginx.conf:
```nginx
server {
    listen 443 ssl;
    server_name prometheus.example.com;

    ssl_certificate     /usr/local/etc/ssl/prometheus.crt;
    ssl_certificate_key /usr/local/etc/ssl/prometheus.key;

    auth_basic           "Prometheus";
    auth_basic_user_file /usr/local/etc/nginx/.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:9090;
    }
}
```
Prometheus also supports native TLS and basic auth (since version 2.24) via a web configuration file:
```yaml
# /usr/local/etc/prometheus/web.yml
tls_server_config:
  cert_file: /usr/local/etc/ssl/prometheus.crt
  key_file: /usr/local/etc/ssl/prometheus.key

basic_auth_users:
  admin: $2y$10$hashed_bcrypt_password_here
```
Enable it:
```sh
sysrc prometheus_args="--web.config.file=/usr/local/etc/prometheus/web.yml --storage.tsdb.retention.time=90d"
service prometheus restart
```
Prometheus vs Zabbix vs Netdata
Zabbix
Zabbix is a traditional enterprise monitoring platform with an agent-based architecture, SQL database backend (PostgreSQL or MySQL), and a PHP web frontend. It has been around since 2001 and is widely used in enterprise environments.
Architecture: Zabbix uses a push model where agents send data to a central server. This is fundamentally different from Prometheus's pull model. Zabbix requires a relational database, which adds operational complexity but supports historical data natively.
Strengths over Prometheus: Zabbix has better out-of-box auto-discovery, built-in authentication and RBAC, native agent-based monitoring without external exporters, and excellent template support for network equipment (SNMP). Its web UI is self-contained.
Weaknesses: Zabbix is heavier to run, its query language is limited compared to PromQL, scaling requires database optimization, and it does not integrate naturally with the Grafana/cloud-native ecosystem. On FreeBSD, the Zabbix server package exists but the web frontend requires a full PHP/Apache or nginx stack.
Netdata
Netdata is a real-time monitoring agent that emphasizes per-second granularity and zero-configuration auto-detection of metrics. It runs a web dashboard on each monitored node.
Architecture: Netdata runs an agent per node that collects metrics every second and serves a local dashboard. Netdata Cloud provides a centralized view. It is designed for real-time troubleshooting rather than historical analysis.
Strengths over Prometheus: Netdata provides per-second resolution out of the box, auto-detects most services without configuration, and has a polished built-in dashboard. Installation on FreeBSD is straightforward with pkg install netdata.
Weaknesses: Netdata's per-second collection consumes more CPU and memory. Its long-term storage capabilities are limited compared to Prometheus TSDB. It lacks the mature alerting pipeline of Prometheus + Alertmanager, and PromQL has no equivalent in Netdata.
When to Use Each
- Prometheus: Best for infrastructure with many nodes, when you need PromQL for complex queries, when integrating with Grafana, or when running Kubernetes alongside FreeBSD.
- Zabbix: Best for traditional enterprise environments with SNMP devices, when built-in RBAC is required, or when templates for network equipment are needed.
- Netdata: Best for real-time troubleshooting on individual servers, quick deployment without configuration, or when per-second visibility is critical.
For most FreeBSD server fleets, Prometheus paired with Grafana offers the best balance of flexibility, query power, and ecosystem integration.
FAQ
How much disk space does Prometheus use on FreeBSD?
Prometheus uses roughly 1-2 bytes per sample in its TSDB. A typical FreeBSD server with node_exporter generates around 500-1000 time series. At a 15-second scrape interval, that is about 5-10 MB per day per server. For 100 servers over 90 days, expect approximately 50-90 GB of storage.
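The arithmetic behind that estimate can be sketched as follows, using the upper end of the 1-2 bytes-per-sample figure (actual usage depends on series churn and how well the data compresses):

```sh
# Rough TSDB sizing: servers * series * samples/day * bytes/sample * days
servers=100
series_per_server=1000
scrape_interval=15          # seconds
bytes_per_sample=2          # upper end of the 1-2 byte estimate
days=90

samples_per_day=$(( 86400 / scrape_interval ))
total_bytes=$(( servers * series_per_server * samples_per_day * bytes_per_sample * days ))
echo "$(( total_bytes / 1024 / 1024 / 1024 )) GiB"
# prints: 96 GiB
```

To check the real number for a running server, query its actual ingestion rate with `rate(prometheus_tsdb_head_samples_appended_total[5m])`.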
Can Prometheus monitor FreeBSD jails individually?
Yes. Run node_exporter inside each jail with a unique port, or use a single node_exporter on the host with labels to distinguish jails. For jail-specific metrics, the textfile collector approach works well -- run a script inside the jail that writes metrics to a shared directory.
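A minimal sketch of such an in-jail script follows. The metric names and the shared directory are assumptions; the directory would typically be nullfs-mounted into the jail and pointed at by the host node_exporter's --collector.textfile.directory flag:

```sh
#!/bin/sh
# Write per-jail metrics to a directory shared with the host's node_exporter
# textfile collector. TEXTFILE_DIR and all metric names here are illustrative.
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/tmp/node_exporter}"
OUT="$TEXTFILE_DIR/jail_$(hostname).prom"
mkdir -p "$TEXTFILE_DIR"

{
    echo "# HELP jail_processes Number of processes visible in this jail"
    echo "# TYPE jail_processes gauge"
    echo "jail_processes $(ps ax | tail -n +2 | wc -l | tr -d ' ')"
    echo "# HELP jail_exporter_last_run_seconds Unix time this script last ran"
    echo "# TYPE jail_exporter_last_run_seconds gauge"
    echo "jail_exporter_last_run_seconds $(date +%s)"
} > "$OUT.tmp"
mv "$OUT.tmp" "$OUT"
```

As with the zpool script, writing to a temporary file and renaming keeps the collector from ever scraping a half-written file.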
Does node_exporter work with ZFS on FreeBSD?
Yes. The ZFS collector is available on FreeBSD builds of node_exporter. It exposes ARC statistics, L2ARC metrics, and dataset-level information. For zpool health and capacity, supplement with a textfile collector script as shown above.
How do I upgrade Prometheus on FreeBSD?
```sh
pkg upgrade prometheus node_exporter alertmanager
service prometheus restart
service node_exporter restart
service alertmanager restart
```
Prometheus supports backward-compatible TSDB storage, so upgrades rarely require data migration.
Can Prometheus scrape targets behind a NAT or firewall?
Prometheus requires network access to each target's /metrics endpoint. If targets are behind NAT, options include: running Prometheus inside the private network, using a VPN or SSH tunnel, or deploying Pushgateway for targets that cannot be scraped directly. On FreeBSD, a WireGuard tunnel between Prometheus and remote targets is a clean solution.
How do I back up Prometheus data on FreeBSD?
Use the Prometheus snapshot API, which requires starting Prometheus with the --web.enable-admin-api flag:
```sh
curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot
```
This creates a snapshot in the data directory. On ZFS, you can also use ZFS snapshots for instant, consistent backups:
```sh
zfs snapshot zroot/prometheus@backup-$(date +%Y%m%d)
```
Is Prometheus suitable for monitoring 1000+ FreeBSD servers?
Yes, but at that scale, consider federation or Thanos. A single Prometheus instance can handle several hundred targets with default scrape intervals. For larger deployments, use hierarchical federation where regional Prometheus servers aggregate into a global instance, or deploy Thanos for unified querying across multiple Prometheus instances.