What Is ZFS Scrubbing?

ZFS data scrubbing is a process that verifies the integrity of all data stored in a ZFS pool. It checks every block against its checksum and, if redundancy is available (such as mirrors or RAID-Z), automatically repairs corrupted data using healthy copies.

Purpose: To detect and correct silent data corruption (also known as bit rot) before it causes application failures or data loss.


Why Is Scrubbing Necessary in ZFS?

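Unlike traditional file systems that assume storage devices return valid data, ZFS assumes nothing. On every read, it uses checksums to verify that the data returned matches what was originally written. That check only covers blocks that are actually read, so rarely accessed data can decay unnoticed. A scrub closes the gap by forcing a read of everything.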

Data Corruption Can Happen Because Of:

  • Bit rot on hard drives or SSDs
  • Controller or cable failures
  • Firmware bugs
  • RAM errors
  • Cosmic rays (yes, seriously!)

ZFS scrubbing is like a background health scan for your data, catching corruption early and fixing it while a healthy copy still exists.


How ZFS Scrubbing Works

  1. Checksum Verification
    • ZFS stores a checksum for every data and metadata block.
    • Scrubbing reads all blocks and recalculates the checksum to compare against the stored value.
  2. Automatic Repair
    • If corruption is found and redundancy exists, ZFS reads a good copy from the other side of the mirror or reconstructs the block from RAID-Z parity.
    • The corrupted copy is automatically repaired in place. Without redundancy, the scrub can only report the damage.
  3. Non-Disruptive
    • Unlike fsck in traditional systems, ZFS scrubs do not stop services or require unmounting.
    • Scrubbing is performed while the pool is online.
  4. Progress Reporting
    • A scrub runs in the background and can take hours to days, depending on how much data the pool holds and how fast the disks are.
    • Progress and estimated time remaining are visible via zpool status.
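
The verify-and-repair steps above can be illustrated with a toy model. This is only a sketch, not how ZFS is implemented: two ordinary files stand in for the two mirror copies of one block, sha256sum (assumed to be available, as on most Linux systems) stands in for the block checksum, and the function names are made up for this example.

```shell
# Toy model of a scrub on a two-way mirror. Plain files stand in for the
# two on-disk copies of a single block. Illustration only, not real ZFS code.

# Print the SHA-256 hex digest of a file.
block_sum() {
    sha256sum "$1" | cut -d' ' -f1
}

# scrub_block EXPECTED_SUM COPY_A COPY_B
# Verifies both copies against the stored checksum and overwrites a bad
# copy with the good one. Prints "ok", "repaired <file>", or "unrecoverable".
scrub_block() {
    local expected="$1" a="$2" b="$3"
    local a_ok=0 b_ok=0
    [ "$(block_sum "$a")" = "$expected" ] && a_ok=1
    [ "$(block_sum "$b")" = "$expected" ] && b_ok=1

    if [ "$a_ok" = 1 ] && [ "$b_ok" = 1 ]; then
        echo "ok"
    elif [ "$a_ok" = 1 ]; then
        cp "$a" "$b" && echo "repaired $b"
    elif [ "$b_ok" = 1 ]; then
        cp "$b" "$a" && echo "repaired $a"
    else
        # Both copies fail the checksum: nothing left to repair from.
        echo "unrecoverable"
        return 1
    fi
}
```

The last branch is the case scrubbing is meant to prevent: once every copy is bad, the checksum can still detect the damage but nothing can fix it.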

Running a ZFS Scrub Manually

zpool scrub <poolname>

Example:

zpool scrub tank

Check progress:

zpool status

Example output:

  pool: tank
 state: ONLINE
  scan: scrub in progress since Sun Jun 29 10:00:00 2025
        1.23T scanned at 200M/s, 800G issued at 130M/s, 2.40T total
        0B repaired, 32.55% done, 03:37:37 to go

To stop a scrub:

zpool scrub -s <poolname>
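
If you want a script to block until the scrub is done and then show the result, something like the following may work. It is a sketch that assumes OpenZFS 2.0 or later, where zpool scrub accepts -w to wait for completion. The function name scrub_and_report is made up for this example.

```shell
# scrub_and_report POOL
# Runs a scrub, waits for it to finish, then prints the final scan summary.
# Assumes OpenZFS 2.0+ ("zpool scrub -w"); on older versions, poll
# "zpool status" in a loop instead.
scrub_and_report() {
    local pool="$1"
    zpool scrub -w "$pool" || return 1
    # Strip the leading "scan:" label and keep the rest of that line.
    zpool status "$pool" | sed -n 's/^ *scan: //p'
}
```

Usage: scrub_and_report tank. To pause a running scrub rather than stop it, zpool scrub -p <poolname> keeps the progress made so far, and running zpool scrub <poolname> again resumes it.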

Scheduling Automatic Scrubs

It’s recommended to scrub monthly, or more frequently for:

  • Critical systems
  • High-availability environments
  • Pools with older or large-capacity disks

Example cron job (run once a month):

Edit the crontab for root:

crontab -e

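Add the following line. Cron runs with a minimal PATH, so use the full path to zpool (check it with command -v zpool):

@monthly /sbin/zpool scrub tank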

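Or use systemd timer units if your distro uses systemd. Recent OpenZFS releases ship zfs-scrub-weekly@.timer and zfs-scrub-monthly@.timer templates, so check whether your package includes them. Also check whether your distro already schedules scrubs: the Debian and Ubuntu packages, for example, install a monthly scrub cron job of their own.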
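
On hosts with several pools, a small wrapper can be scheduled instead of one cron line per pool. The sketch below uses only zpool list, zpool status, and zpool scrub. The function name and the script path in the usage note are hypothetical.

```shell
# scrub_all_pools
# Starts a scrub on every imported pool that is not already being
# scrubbed or resilvered.
scrub_all_pools() {
    local pool
    # -H drops the header line, -o name prints only the pool name.
    zpool list -H -o name | while read -r pool; do
        if zpool status "$pool" | grep -Eq "(scrub|resilver) in progress"; then
            echo "skipping $pool: scan already in progress"
        else
            zpool scrub "$pool" && echo "scrub started on $pool"
        fi
    done
}
```

Saved as a script that ends with a call to scrub_all_pools (for example at the hypothetical path /usr/local/sbin/scrub-all-pools.sh), it can replace the pool-specific command in the crontab entry. Note that this starts all scrubs at once. If the pools share disks or controllers, schedule them on separate days instead.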


Scrubbing vs. Resilvering vs. fsck

Action      Description
Scrub       Verifies all data in a pool and repairs it where redundancy allows
Resilver    Rebuilds data on a new or replaced disk
fsck        Traditional file system checker (not used with ZFS)

Scrubbing is proactive. Resilvering is reactive.


Best Practices for ZFS Scrubbing

1. Schedule Regular Scrubs

Use cron/systemd to scrub monthly or weekly based on usage.

2. Monitor Results

Always check the output of zpool status after a scrub to:

  • Verify that no errors occurred
  • Identify silent corruption early
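
For a quick scripted check, zpool status -x prints the single line "all pools are healthy" when nothing needs attention. A minimal sketch built on that behavior (the function name check_pools is made up here):

```shell
# check_pools
# Returns 0 if every pool is healthy. Otherwise prints the status of the
# pools that need attention and returns 1.
check_pools() {
    local out
    out=$(zpool status -x)
    if [ "$out" = "all pools are healthy" ]; then
        return 0
    fi
    printf '%s\n' "$out"
    return 1
}
```

Usage: check_pools || echo "ZFS needs attention". Run it after a scrub completes, or from a daily cron job.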

3. Enable Alerts

Set up alerting via email or Proxmox notifications so you hear about it whenever a scrub repairs data or reports errors.
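
For email alerts, the ZFS Event Daemon (ZED) reads its settings from a shell-style file, typically /etc/zfs/zed.d/zed.rc. The excerpt below is a sketch of the relevant variables from the stock zed.rc. The address is a placeholder, and defaults may differ between OpenZFS versions, so compare it against the file your package installed.

```shell
# /etc/zfs/zed.d/zed.rc (excerpt; the path may differ on your system)

# Where to send notifications. "admin@example.com" is a placeholder.
ZED_EMAIL_ADDR="admin@example.com"

# Minimum number of seconds between notifications for similar events.
ZED_NOTIFY_INTERVAL_SECS=3600

# 1 = also notify when a scrub finishes cleanly.
# 0 = only notify when the pool is unhealthy.
ZED_NOTIFY_VERBOSE=1
```

ZED needs a working mail program on the host, and the daemon has to be restarted after the file changes (on systemd distros the unit is usually called zfs-zed).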

4. Scrub During Idle Hours

Scrubs are I/O-intensive. Run them during low-usage hours to avoid performance impact.

5. Check for Aging Drives

Frequent checksum errors may indicate failing drives even if S.M.A.R.T. shows no issues.
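
The per-device counters live in the CKSUM column of the zpool status device table. As a sketch, the filter below reads zpool status output on standard input and prints every device whose CKSUM counter is not zero. The function name cksum_errors is made up, and the parsing assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout.

```shell
# cksum_errors
# Reads "zpool status" output on stdin and prints "<device> <count>" for
# each row of the device table whose CKSUM counter is not 0.
cksum_errors() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $5 == "CKSUM" { in_config = 1; next }
        # A blank line ends the table.
        in_config && NF == 0 { in_config = 0 }
        # Compare as a string: large counters are abbreviated (e.g. 1.2K).
        in_config && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}
```

Usage: zpool status | cksum_errors. A device that shows up here repeatedly is worth replacing even if its S.M.A.R.T. data looks clean. Once you have dealt with the cause, zpool clear <poolname> resets the counters.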


Performance Impact of Scrubbing

Scrubbing is I/O-heavy but CPU-light:

  • It reads every block in the pool, so it can impact disk performance.
  • It doesn’t block other operations, but may slow down read/write workloads.

Tips to Mitigate Performance Issues:

  • Run scrubs during off-peak hours.
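  • ZFS’s own I/O scheduler already gives scrub I/O a lower priority than application I/O. On Linux, module parameters such as zfs_vdev_scrub_max_active can throttle it further (advanced tuning).
  • If you have multiple pools, stagger their scrubs across different days.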
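
On Linux, one concrete form of advanced tuning is the set of OpenZFS module parameters that cap how many scrub I/Os are queued per device. The sketch below only reads the current values. The function name is made up, and the default directory assumes the usual /sys/module/zfs/parameters location.

```shell
# show_scrub_tunables [PARAM_DIR]
# Prints the current per-vdev scrub queue limits, if the files exist.
show_scrub_tunables() {
    local dir="${1:-/sys/module/zfs/parameters}"
    local p
    for p in zfs_vdev_scrub_min_active zfs_vdev_scrub_max_active; do
        if [ -r "$dir/$p" ]; then
            printf '%s=%s\n' "$p" "$(cat "$dir/$p")"
        fi
    done
}
```

Lowering zfs_vdev_scrub_max_active (as root, by writing a smaller number to the file) makes the scrub gentler on other workloads but slower to finish. The change is lost on reboot unless it is also set as a module option, for example in /etc/modprobe.d/zfs.conf. Defaults vary between versions, so note the original values before changing anything.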

Scrub Reporting Tools & Automation

If you’re managing ZFS on a fleet of servers or want visual reports:

Use:

  • zfs-zed (ZFS Event Daemon) for automatic alerts
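  • zpool status -x and zpool events for a quick health summary and the raw event log, plus a wrapper script around them (such as a site-specific zpool-status-report.sh) for human-readable reports. Note that arc_summary reports ARC cache statistics, not scrub results.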
  • Proxmox VE’s GUI — shows scrub status directly in the dashboard

Case Study Example: Silent Corruption Repair

Let’s say a pool has mirrored vdevs. During a scrub, ZFS finds that one block on disk A has a checksum mismatch. Disk B has the correct version.

  • ZFS logs the event
  • Reads the correct data from Disk B
  • Overwrites the corrupted block on Disk A
  • Reports the repair in zpool status (for example, “scrub repaired 128K ... with 0 errors”) and counts the mismatch in Disk A’s CKSUM column

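Without the scrub, the bad block on Disk A would sit unnoticed until something happened to read it. ZFS would still catch and heal it on that read, but if Disk B failed in the meantime, the only good copy would be gone and the data would be unrecoverable.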


ZFS Scrub Metrics (from zpool status)

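Field       Meaning
scanned     Amount of data the scrub has located so far by walking the pool’s metadata
issued      Amount of data actually read from disk and verified
errors      Errors detected during the scrub (should be 0)
repaired    Amount of data fixed from a redundant copy (should be 0)
to go       Estimated time remaining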
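
For dashboards or scripts, the progress figure can be pulled out of the status text. The filter below is a sketch that assumes the progress line contains the text "% done", as in the example output earlier. The function name scrub_progress is made up.

```shell
# scrub_progress
# Reads "zpool status" output on stdin and prints the scrub's percent-done
# figure. Prints nothing if no scrub is in progress.
scrub_progress() {
    sed -nE 's/.* ([0-9.]+)% done.*/\1/p'
}
```

Usage: zpool status tank | scrub_progress. The percentage tracks the issued figure relative to the total, which is why it can lag well behind scanned.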

Summary

Feature                        ZFS Scrubbing Advantage
Detects Silent Corruption      ✅ Yes
Runs While Online              ✅ Yes
Auto-Repair with Redundancy    ✅ Yes
Manual or Scheduled            ✅ Both supported
Alerts/Logs Supported          ✅ Via zfs-zed, systemd, Proxmox, etc.
Frequency Recommendation       ✅ Monthly (minimum), weekly for critical data

Final Thoughts

ZFS scrubbing is one of the key differentiators of ZFS over traditional file systems. It’s a simple yet powerful way to protect your data from the silent corruption that can ruin backups, databases, and user trust.

“Backups protect against disasters. Scrubs protect against decay.”