tutorial·2026-03-29·15 min read

How to Use DTrace on FreeBSD for Performance Analysis

Practical guide to using DTrace on FreeBSD. Covers enabling DTrace, probes and providers, D language basics, one-liners for common tasks, and real-world performance analysis examples.

DTrace is the most powerful observability tool available on FreeBSD. It lets you instrument a running kernel and userland processes in real time, with negligible overhead when not active, and without rebooting or recompiling anything. If you have ever stared at top output wondering why a process is slow, DTrace gives you the answer.

This tutorial walks through enabling DTrace on FreeBSD, understanding its core concepts, running practical one-liners, writing D scripts, and applying the tool to real-world performance problems.

What Is DTrace

DTrace -- Dynamic Tracing -- is a comprehensive tracing framework originally developed by Sun Microsystems for Solaris 10 in 2005. Bryan Cantrill, Mike Shapiro, and Adam Leventhal created it to solve a fundamental problem: production systems break in ways that are impossible to reproduce in development, and traditional debugging tools either cannot observe the kernel safely or impose too much overhead.

FreeBSD adopted DTrace starting with version 7.1, and it has been a first-class subsystem ever since. The FreeBSD implementation covers both kernel and userland tracing and shares the same D language syntax as the Solaris and illumos versions. On FreeBSD 13 and later, DTrace support is mature, stable, and used in production by hosting providers, storage vendors, and anyone who runs serious infrastructure on FreeBSD.

DTrace works by dynamically inserting instrumentation points (probes) into running code. When a probe fires, it executes a small, safe program you define. The key properties that make DTrace suitable for production use:

  • Zero overhead when disabled. Probes that are not actively enabled impose no measurable cost.
  • Safety guarantees. The D language is intentionally limited. You cannot write infinite loops, dereference arbitrary pointers, or corrupt kernel state.
  • Dynamic scope. You can trace any kernel function, any system call, any userland function in a running process -- without restarting anything.

Enabling DTrace on FreeBSD

DTrace requires kernel module support. On a stock FreeBSD installation, the modules are present but not loaded by default.

Load DTrace Modules Immediately

To enable DTrace right now without rebooting:

bash
kldload dtraceall

This single command loads the DTrace framework and all available providers. Verify it worked:

bash
kldstat | grep dtrace

Expected output:

shell
 7    1 0xffffffff82a00000    68a80 dtrace.ko
 8    1 0xffffffff82a70000    11680 dtraceall.ko
 9    1 0xffffffff82a90000     5200 dtrace_test.ko

Make DTrace Persistent Across Reboots

Add the following to /boot/loader.conf:

shell
dtraceall_load="YES"

After a reboot, the modules load automatically. You can confirm with:

bash
dtrace -l | wc -l

On a typical FreeBSD 14 system, this returns upward of 50,000 available probes.

Permissions

DTrace requires root privileges. All commands in this tutorial assume you are running as root or via sudo.

DTrace Core Concepts

Before running commands, you need to understand four concepts: probes, providers, predicates, and actions.

Probes

A probe is a named instrumentation point. Every probe has a four-part name:

shell
provider:module:function:name

For example, syscall::open:entry means: the syscall provider, any module, the open function, at the entry point. You can use wildcards -- syscall:::entry matches the entry point of every system call.

Providers

A provider is a library of related probes. FreeBSD ships several providers:

| Provider | What It Traces |
|----------|---------------|
| syscall | System call entry and return points |
| fbt | Function Boundary Tracing -- every kernel function |
| profile | Timed sampling at fixed intervals |
| io | Disk I/O start and completion |
| proc | Process creation, execution, exit |
| tcp | TCP connection state changes |
| udp | UDP send and receive |
| sched | CPU scheduler events |
| lockstat | Kernel lock contention |
| vfs | Virtual filesystem operations |

Predicates

A predicate is a filter enclosed in /slashes/ that controls when the action fires:

d
syscall::read:entry
/execname == "nginx"/
{
    /* only traces reads by nginx */
}

Actions and Aggregations

Actions are the code inside { } that runs when a probe fires. The most important action is aggregation using the @ operator. Aggregations efficiently summarize data in-kernel without copying every event to userland. Common aggregating functions: count(), sum(), avg(), min(), max(), quantize(), lquantize().

Your First DTrace One-Liners

Here are practical one-liners you can run immediately. Each solves a real operational question.

1. Count System Calls by Process

Which processes are making the most system calls?

bash
dtrace -n 'syscall:::entry { @[execname] = count(); }'

Let it run for a few seconds, then press Ctrl-C. Output:

shell
  sshd                  42
  cron                  87
  devd                 134
  nginx               3841
  postgres           12407

2. Trace File Opens

See every file being opened system-wide:

bash
dtrace -n 'syscall::openat:entry { printf("%s %s", execname, copyinstr(arg1)); }'

Output:

shell
  0  23451  openat:entry  nginx /usr/local/etc/nginx/mime.types
  0  23451  openat:entry  nginx /var/log/nginx/access.log
  0  23890  openat:entry  postgres /var/db/postgres/data16/base/16384/1259

3. Count System Calls by Type

Which system calls are most frequent?

bash
dtrace -n 'syscall:::entry { @[probefunc] = count(); }'

Output:

shell
  sigprocmask          412
  clock_gettime       1088
  read                2341
  write               2899
  select              4510

4. Profile CPU Usage by Function

What kernel functions are consuming the most CPU?

bash
dtrace -n 'profile-997 /arg0/ { @[func(arg0)] = count(); }'

The profile-997 probe fires 997 times per second (a prime number avoids aliasing with periodic tasks). arg0 is the kernel program counter. Output:

shell
  kernel`spinlock_enter        12
  kernel`uma_zalloc_arg        34
  kernel`vm_fault_hold         89
  kernel`bcopy                201

5. Track New TCP Connections

Monitor TCP connections as they are established:

bash
dtrace -n 'tcp:::accept-established { printf("%s:%d <- %s:%d", args[3]->tcps_laddr, args[3]->tcps_lport, args[3]->tcps_raddr, args[3]->tcps_rport); }'

Output:

shell
  203.0.113.10:443 <- 198.51.100.23:49182
  203.0.113.10:443 <- 203.0.113.45:51907
  192.168.1.10:22 <- 192.168.1.5:62340

6. Measure Disk I/O Latency Distribution

How long do disk I/O operations take?

bash
dtrace -n 'io:::start { ts[arg0] = timestamp; } io:::done /ts[arg0]/ { @["usec"] = quantize((timestamp - ts[arg0]) / 1000); ts[arg0] = 0; }'

Output (a power-of-two histogram in microseconds):

shell
  usec
           value  ------------- Distribution ------------- count
              32 |                                         0
              64 |@@@@@@@@@@@@                             312
             128 |@@@@@@@@@@@@@@@@@@                       467
             256 |@@@@@@                                   158
             512 |@@                                       53
            1024 |@                                        19
            2048 |                                         3

7. Watch Process Creation

Log every new process as it spawns:

bash
dtrace -n 'proc:::exec-success { printf("%d %s -> %s", pid, curthread->td_proc->p_pptr->p_comm, execname); }'

Output:

shell
  48231 cron -> sh
  48232 sh -> php
  48240 sshd -> sshd

8. Trace DNS Lookups (UDP Port 53)

bash
dtrace -n 'udp:::send /args[4]->udp_dport == 53/ { printf("%s -> DNS query", execname); }'

9. Count Read Sizes by Process

What read sizes are processes requesting?

bash
dtrace -n 'syscall::read:entry { @[execname] = quantize(arg2); }'

10. Measure System Call Latency

How long does each read() call take?

bash
dtrace -n 'syscall::read:entry { self->ts = timestamp; } syscall::read:return /self->ts/ { @[execname] = avg(timestamp - self->ts); self->ts = 0; }'

Output shows average nanoseconds per read, grouped by process:

shell
  nginx               4820
  postgres           18340
  rsync             231000

11. Find Processes Writing to Disk Most

bash
dtrace -n 'io:::start /!(args[0]->b_flags & B_READ)/ { @[execname] = sum(args[0]->b_bcount); }'

In the io provider, a write is indicated by the absence of the B_READ flag, so the predicate filters on that rather than on a dedicated write flag.

D Language Basics

The D language is purpose-built for DTrace. It resembles C but with critical restrictions that guarantee safety.

Variables

D supports three variable scopes:

  • Clause-local (this->var): scoped to a single probe clause. Cheapest.
  • Thread-local (self->var): scoped to a thread. Use for timing measurements across entry/return pairs.
  • Global (var): shared across all probes. Use sparingly.
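The three scopes can appear side by side in one script. A minimal D sketch (the probe choice and variable names are illustrative):

d
syscall::openat:entry
{
    /* clause-local: valid only within this clause */
    this->path = copyinstr(arg1);
    /* thread-local: carried to the matching return probe on this thread */
    self->ts = timestamp;
    /* global: shared by every probe and thread */
    total_opens++;
}

syscall::openat:return
/self->ts/
{
    @["openat_ns"] = avg(timestamp - self->ts);
    self->ts = 0;
}

Note the self->ts = 0 in the return clause: thread-local variables must be cleared once used, or they accumulate.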

Predicates

Predicates filter probe firings. They appear between slashes before the action block:

d
syscall::write:entry /execname == "nginx" && arg2 > 4096/ { printf("Large write: %d bytes", arg2); }

Aggregations

Aggregations are the most important D feature. They use the @ syntax and compute summaries in-kernel:

d
@counts[execname] = count();
@sizes[execname] = quantize(arg2);
@totals[execname, probefunc] = sum(arg2);

Built-in Functions

| Function | Purpose |
|----------|---------|
| printf() | Formatted output |
| trace() | Print a single value |
| stack() | Print kernel stack trace |
| ustack() | Print userland stack trace |
| copyinstr() | Copy string from userland |
| timestamp | Nanosecond timestamp |
| walltimestamp | Wall clock in nanoseconds |
| execname | Current process name |
| pid | Current process ID |
| tid | Current thread ID |
| curpsinfo | Process state info struct |
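Several of these combine naturally in a single clause. A sketch that timestamps every exec with the wall clock, process and thread IDs, and the path being executed (the output format is illustrative):

d
syscall::execve:entry
{
    /* %Y formats walltimestamp as a human-readable date */
    printf("%Y %d/%d %s exec %s\n",
        walltimestamp, pid, tid, execname, copyinstr(arg0));
}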

Writing DTrace Scripts

For anything beyond a one-liner, save your D program to a .d file. Here is a script that tracks the top system calls for a specific process, with latency:

Create /root/syscall_latency.d:

d
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=5hz

dtrace:::BEGIN
{
    printf("Tracing syscall latency for PID %d... Ctrl-C to stop.\n", $target);
}

syscall:::entry
/pid == $target/
{
    self->ts = timestamp;
}

syscall:::return
/self->ts/
{
    @time[probefunc] = avg(timestamp - self->ts);
    @calls[probefunc] = count();
    self->ts = 0;
}

dtrace:::END
{
    printf("\n%-24s %12s %12s\n", "SYSCALL", "COUNT", "AVG(ns)");
    printa("%-24s %@12d %@12d\n", @calls, @time);
}

Run it against a specific process:

bash
chmod +x /root/syscall_latency.d
dtrace -s /root/syscall_latency.d -p $(pgrep -o nginx)

Output after Ctrl-C:

shell
Tracing syscall latency for PID 1023... Ctrl-C to stop.

SYSCALL                         COUNT      AVG(ns)
writev                            127        18402
recvfrom                          340         4210
kevent                             89         1870
accept4                            42        29500
sendfile                           38       182300

Real-World Example: Finding a Performance Bottleneck

Scenario: A web application on FreeBSD is responding slowly. top shows the server is not CPU-bound and has free memory. Something else is wrong.

Step 1: Identify which system calls the application spends time in.

bash
dtrace -n 'syscall:::entry /execname == "python3.11"/ { self->ts = timestamp; } syscall:::return /self->ts/ { @[probefunc] = sum(timestamp - self->ts); self->ts = 0; }' -c 'sleep 10'

Result shows fdatasync consuming 89% of traced time.

Step 2: See how often fdatasync is called and its latency distribution.

bash
dtrace -n 'syscall::fdatasync:entry /execname == "python3.11"/ { self->ts = timestamp; } syscall::fdatasync:return /self->ts/ { @["fdatasync_usec"] = quantize((timestamp - self->ts) / 1000); self->ts = 0; }' -c 'sleep 10'

Output:

shell
  fdatasync_usec
           value  ------------- Distribution ------------- count
            1024 |                                         0
            2048 |@@                                       14
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@                 187
            8192 |@@@@@@@@@@@@                             93
           16384 |@@                                       11
           32768 |                                         2

Each fdatasync takes 4-8ms. The application calls it on every request because it writes session data to disk synchronously.

Step 3: Find the stack trace causing the sync.

bash
dtrace -n 'syscall::fdatasync:entry /execname == "python3.11"/ { ustack(20); }'

This reveals the exact code path. The fix: switch session storage from file-backed to either tmpfs or a memory-based session store. After the change, request latency drops from 12ms to 1.8ms.

This kind of root-cause analysis is nearly impossible with traditional tools. top shows CPU and memory. truss shows system calls but with massive overhead. DTrace shows both with production-safe overhead. For more on system-level tuning, see our guide on FreeBSD performance tuning.

Real-World Example: Tracing Network Issues

Scenario: Intermittent connection timeouts to a backend service running on localhost.

Step 1: Watch TCP retransmits.

bash
dtrace -n 'tcp:::send /args[4]->tcp_flags & TH_SYN/ { printf("%s:%d -> %s:%d (retransmit: %d)", args[3]->tcps_laddr, args[3]->tcps_lport, args[3]->tcps_raddr, args[3]->tcps_rport, args[3]->tcps_retransmit); }'

Step 2: Measure TCP connection establishment time.

d
#!/usr/sbin/dtrace -s

#pragma D option quiet

tcp:::connect-request
{
    self->start = timestamp;
}

tcp:::connect-established
/self->start/
{
    @["connect_usec"] = quantize((timestamp - self->start) / 1000);
    self->start = 0;
}

tcp:::connect-refused
/self->start/
{
    @refused[args[3]->tcps_raddr, args[3]->tcps_rport] = count();
    self->start = 0;
}

dtrace:::END
{
    printf("\nConnection latency distribution:\n");
    printa(@);
    printf("\nRefused connections:\n");
    printa("  %s:%d  %@d\n", @refused);
}

The histogram reveals a bimodal distribution: most connections complete in under 200 microseconds, but a cluster takes 1-3 seconds -- exactly the retransmit timeout. Cross-referencing with lockstat reveals listen backlog overflow under load. The fix: increase kern.ipc.soacceptqueue and the application's listen backlog.

For more on monitoring network and system health, see our FreeBSD server monitoring guide.

DTrace vs Other FreeBSD Diagnostic Tools

DTrace is not the only tool available. Here is how it compares.

| Tool | Scope | Overhead | Dynamic | Aggregation |
|------|-------|----------|---------|-------------|
| DTrace | Kernel + userland | Near-zero when idle, low when active | Yes | In-kernel |
| top/htop | Process summary | Negligible | No | N/A |
| truss | Syscalls for one process | High (ptrace-based) | No | None |
| ktrace | Syscalls, namei, I/O | Moderate (writes trace file) | No | None |
| pmcstat | Hardware performance counters | Low | No | Post-processing |
| systat | System statistics summary | Negligible | No | N/A |

When to use DTrace over alternatives:

  • You need to trace across multiple processes or the kernel simultaneously.
  • You need aggregated statistics, not raw event logs.
  • The system is in production and you cannot afford truss-level overhead.
  • You need to correlate events across subsystems (disk I/O with network, syscalls with scheduler).

When other tools are sufficient:

  • top is enough to check CPU and memory at a glance.
  • truss -p PID works fine for quick debugging of a single process in development.
  • pmcstat is better for hardware-level profiling (cache misses, branch mispredictions).

For general troubleshooting workflows that combine these tools, see our FreeBSD troubleshooting guide.

Safety and Overhead Considerations

DTrace was designed for production use, but you should understand its characteristics:

When DTrace is safe:

  • Probes that are not enabled have zero overhead. The probe sites are no-ops until you activate them.
  • Aggregations are lock-free and per-CPU. Even high-frequency probes like profile-997 have minimal impact.
  • The D language prevents infinite loops, unbounded memory allocation, and arbitrary memory access.

When to be careful:

  • High-frequency probes with printf. If you trace every syscall:::entry with printf, you generate enormous output. Use aggregations instead.
  • The fbt provider. Function Boundary Tracing can instrument tens of thousands of kernel functions. Enabling all of them (fbt:::entry) on a busy system generates significant data. Be specific.
  • copyinstr() on hot paths. Copying strings from userland on every system call entry adds latency. Limit with predicates.
  • Thread-local variables. If a probe fires but the return probe never does (process killed, error path), thread-local variables leak. Use self->ts = 0 in all exit paths.

Practical guidelines:

  1. Start with aggregations, not printf.
  2. Always use predicates to limit scope (specific PID, process name, or condition).
  3. Test one-liners on a staging system before running on production.
  4. Run with a time limit: dtrace -n '...' -c 'sleep 30' automatically stops after 30 seconds.
  5. Watch for dtrace:::ERROR firings -- when a clause hits a runtime error (for example, a bad pointer dereference), that probe firing is aborted, so frequent errors mean your script is silently losing data.

Frequently Asked Questions

Does DTrace work on FreeBSD jails?

DTrace can trace processes inside jails, but you must run dtrace from the host. The jail itself cannot load kernel modules or access /dev/dtrace. From the host, filter by jail ID or process name to isolate jail-specific activity.
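As a sketch, assuming your FreeBSD release provides the jid and jailname D variables (present in recent releases; check dtrace(1) on your system), you can scope a one-liner to jailed processes only. The jail name "www1" is a placeholder:

bash
dtrace -n 'syscall:::entry /jailname == "www1"/ { @[execname, probefunc] = count(); }'

A predicate of /jid != 0/ matches any jailed process, while /jid == 0/ restricts tracing to the host.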

Can I trace userland applications with DTrace on FreeBSD?

Yes. The pid provider lets you trace any function in a userland process. For example, to trace calls to malloc in a specific process:

bash
dtrace -n 'pid$target::malloc:entry { @["malloc calls"] = count(); }' -p 12345

Applications compiled with -fno-omit-frame-pointer produce better stack traces. Some languages (Python, Ruby, Node.js) have USDT providers that expose application-level probes.

How does DTrace impact system performance?

When no probes are enabled, the impact is zero -- DTrace instruments code dynamically only when you activate probes. When probes are active, overhead depends on probe frequency and the complexity of your D program. Aggregation-based scripts on moderate-frequency probes (thousands of events per second) typically add less than 1% overhead. High-frequency printf-based tracing can add noticeable latency.

Can I use DTrace to trace ZFS internals?

Yes, and this is one of the most powerful combinations on FreeBSD. The fbt provider can instrument every function in the ZFS kernel module. For example, to measure zfs_read latency:

bash
dtrace -n 'fbt::zfs_read:entry { self->ts = timestamp; } fbt::zfs_read:return /self->ts/ { @["zfs_read_ns"] = quantize(timestamp - self->ts); self->ts = 0; }'

You can trace ARC hits and misses, ZIO pipeline stages, and transaction group commits. This is invaluable for tuning ZFS on FreeBSD.
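For example, a rough sketch of counting ARC read requests via the arc_read function (ZFS internal function names vary across OpenZFS versions, so confirm the probe exists first with dtrace -l -f arc_read):

bash
dtrace -n 'fbt::arc_read:entry { @["arc_read calls"] = count(); }'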

What is the difference between DTrace on FreeBSD and Linux BPF/bpftrace?

Linux adopted eBPF (extended Berkeley Packet Filter) as its tracing framework instead of DTrace. bpftrace on Linux uses similar syntax to DTrace and covers much of the same ground. The key differences: FreeBSD DTrace is more mature and battle-tested, has a cleaner provider model, and does not require a separate compilation toolchain. Linux eBPF has a larger community and more tooling (BCC, libbpf). If you know DTrace, you can pick up bpftrace quickly -- the concepts are nearly identical.

How do I list all available probes?

List every probe on the system:

bash
dtrace -l | wc -l

List probes for a specific provider:

bash
dtrace -l -P syscall

Search for probes matching a pattern:

bash
dtrace -l -n 'tcp:::*'

Summary

DTrace transforms FreeBSD from a system you monitor to a system you understand. The one-liners in this guide solve the problems you encounter daily -- identifying which process is hammering the disk, why a service is slow, where network latency comes from. The D scripting language lets you build targeted analysis tools for problems that no pre-built monitoring solution can address.

Start with the one-liners. Graduate to .d scripts when you need structured analysis. Use aggregations instead of printf for anything in production. The combination of DTrace with FreeBSD's stable kernel and ZFS makes for an operating system where performance problems do not stay mysteries for long.
