Understanding the Issue: ZFS Failed but VM Didn’t Migrate
You might encounter this situation in a Proxmox VE cluster using local ZFS storage on each node:
A ZFS pool (for example `/tank` or `/rpool`) fails or goes offline.
The node itself stays powered on and reachable.
But your VMs remain stuck, and Proxmox HA doesn’t migrate them to another node.
It’s a confusing scenario — especially if you’ve configured ZFS replication between nodes and expected automatic failover. So why doesn’t Proxmox move those VMs automatically?
Proxmox HA Works at Node Level, Not Storage Level
The key to understanding this lies in how Proxmox HA Manager operates.
Proxmox HA monitors node health, not the storage layer.
This means:
If a node goes offline, HA migrates or restarts VMs on another node.
If the ZFS storage pool fails but the node is still running, HA sees the node as healthy and takes no action.
In short:
Proxmox HA cannot detect or act on local ZFS storage failure.
So even if the storage under a VM becomes unreadable, HA won’t migrate it — because the host node didn’t actually “fail.”
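You can see this split from the shell on the affected node: `ha-manager` reports node and HA-resource state, while pool health has to be checked separately:
```
ha-manager status   # HA view: node membership and managed resources
zpool status -x     # storage view: only reports pools with problems
```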
Local ZFS Is Not Shared Storage
Most Proxmox setups using ZFS are configured with local pools per node.
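A typical `/etc/pve/storage.cfg` entry for such a pool looks roughly like this; the storage ID, dataset, and node name are illustrative:
```
zfspool: local-zfs
        pool tank/vmdata
        content images,rootdir
        sparse 1
        nodes pve1
```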
Even with ZFS replication enabled, these are still separate local datasets. Replication only creates periodic snapshots on other nodes — not live, accessible copies.
So when a ZFS pool on Node A fails:
The replicated copy on Node B exists, but
It’s a snapshot, not a live disk that HA can boot from automatically.
Why ZFS Replication Doesn’t Trigger Automatic Failover
Proxmox ZFS replication (`pve-zsync` or the built-in replication jobs) copies datasets between nodes at scheduled intervals.
However, the replicated dataset:
Remains in snapshot form.
Is not automatically promoted or activated as a live volume.
Therefore, after a failure, you must manually start the replicated VM on the standby node. Until that step happens, the Proxmox HA system cannot automatically restart the VM from the replica.
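You can confirm this on the standby node: the replicated dataset and its snapshots exist, but nothing registers or starts the VM there (the dataset name below is illustrative):
```
# On the standby node: the replica is just a dataset plus its snapshots
zfs list -t all -r tank/vm-102-disk-0
```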
Recovery Steps After ZFS Failure
If your ZFS pool failed but replication was enabled, here is how to bring the VM up from the latest replica on another node. Assume the ZFS storage on node PVE2 has failed, PVE1 is up and running, and replication to PVE1 completed successfully recently. Run the following commands on PVE1 to bring up VM 102.
Move VM config 102.conf to PVE1:
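Within the clustered `/etc/pve` filesystem, moving the config file reassigns the VM to the other node. Assuming the node directories are named `pve1` and `pve2`, something like:
```
mv /etc/pve/nodes/pve2/qemu-server/102.conf /etc/pve/nodes/pve1/qemu-server/102.conf
```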
Start the VM manually:
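Once the config is registered on PVE1, start it as usual:
```
qm start 102
```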
Now the VM will boot from the replicated ZFS dataset on the healthy node.
You can then wipe the ZFS pool and disks on PVE2, create a new ZFS pool using the same name as before, and the existing replication configuration will automatically resume normal operation.
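For example, assuming the failed pool was called `tank` and the replacement disks are `/dev/sdb` and `/dev/sdc` (both placeholders), recreating it on PVE2 might look like this:
```
# On PVE2, after replacing the failed disks
zpool create -f tank mirror /dev/sdb /dev/sdc
```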
How to Fix It — and Build True HA for ZFS in Proxmox VE
Here are your options to ensure automatic recovery or faster failover.
1. Use Shared Storage for True HA
The most reliable solution is to store all VM disks on shared storage accessible by all cluster nodes.
Recommended options include:
Ceph RBD (native to Proxmox VE)
NFS or iSCSI SAN
ZFS over iSCSI (TrueNAS, StarWind VSAN, etc.)
With shared storage, all nodes can access the same disk image. If one node fails, HA instantly restarts the VM on another host — no replication or promotion required.
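As one illustration, an NFS export can be attached cluster-wide with `pvesm`; the storage ID, mount path, server address, and export are placeholders:
```
pvesm add nfs shared-nfs --path /mnt/pve/shared-nfs \
    --server 192.168.1.50 --export /export/vmdata --content images
```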
2. Automate Failover with ZFS Replication
If you prefer using local ZFS storage per node, you can still achieve partial HA with automation.
Set up Proxmox replication jobs (every 5–15 minutes).
Use a failover script (like `ha-replication-manager`, sketched below) that:
Detects ZFS pool failure.
Promotes the replicated dataset on the standby node.
Registers the VM config.
Starts the VM automatically.
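Proxmox VE does not ship such a script, so treat the following as a minimal sketch only. It assumes VM 102 normally runs on pve2 with its disks replicated to pve1, the pool is named `tank`, passwordless root SSH between the nodes is in place, and the script runs periodically on pve1 (for example from cron):
```
#!/bin/bash
# Minimal storage-aware failover sketch (illustrative, not production-ready).
VMID=102
FAILED_NODE=pve2
LOCAL_NODE=pve1
POOL=tank

# Ask the primary node for its pool health. Anything other than ONLINE
# (including an unreachable node) is treated here as a storage failure.
STATE=$(ssh -o ConnectTimeout=5 root@$FAILED_NODE zpool list -H -o health $POOL 2>/dev/null)

if [ "$STATE" != "ONLINE" ]; then
    echo "Pool $POOL on $FAILED_NODE reports '$STATE' - failing VM $VMID over to $LOCAL_NODE"
    # Reassign the VM by moving its config inside the clustered /etc/pve filesystem...
    mv "/etc/pve/nodes/$FAILED_NODE/qemu-server/$VMID.conf" \
       "/etc/pve/nodes/$LOCAL_NODE/qemu-server/$VMID.conf"
    # ...then boot it from the locally replicated dataset.
    qm start $VMID
fi
```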
This approach provides storage-aware failover — a practical compromise between full Ceph and manual recovery.
3. Enable ZFS Health Monitoring & Alerts
Proxmox integrates ZFS monitoring tools that can detect pool degradation early.
Useful commands:
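For example (the device path in the SMART check is a placeholder):
```
zpool status -x        # report only pools that have problems
zpool list             # capacity and health overview per pool
zfs list               # datasets and their space usage
smartctl -a /dev/sda   # SMART details for a single disk (smartmontools package)
```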
To automate notifications, enable ZFS ZED (ZFS Event Daemon):
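On Proxmox VE this is the `zfs-zed` package and service; the email address below is a placeholder you set in `/etc/zfs/zed.d/zed.rc`:
```
apt install zfs-zed
systemctl enable --now zfs-zed

# In /etc/zfs/zed.d/zed.rc, set the recipient for event notifications:
#   ZED_EMAIL_ADDR="admin@example.com"
```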
You’ll receive email alerts for:
Disk failures
Pool degradation
Resilvering or corruption events
This gives you time to replace disks before the pool collapses.
Summary: Why Proxmox Didn’t Migrate and How to Prevent It
Cause | Why Migration Didn’t Happen | Solution |
---|---|---|
Local ZFS pool failed | HA only detects node failure | Use shared storage or automation |
ZFS replication used | Replicated data not live | Promote snapshot and start manually |
Node still reachable | HA assumes node healthy | Add storage-level monitoring |
No alerts configured | Missed early warnings | Enable ZED, SMART, and email alerts |
Recommended High Availability Setup for ZFS in Proxmox VE 9
Component | Recommended Setup |
---|---|
Cluster | 3 nodes, or 2 nodes with a QDevice |
Storage | ZFS per node + replication |
Backup | Proxmox Backup Server (PBS) |
Failover | Custom replication promotion script |
Alerts | ZFS ZED + SMART monitoring |
Final Thoughts
ZFS provides rock-solid storage reliability in Proxmox VE 9, but automatic HA migration requires shared storage or additional automation.
If your cluster uses local ZFS pools, HA won’t detect storage failure by default — it only reacts to node-level outages.
To achieve true resilience:
Use Ceph or shared ZFS over iSCSI for seamless migration.
Or enhance ZFS replication with automatic failover scripts.
And always keep Proxmox Backup Server running for last-resort recovery.
With these adjustments, your Proxmox ZFS cluster can survive hardware failures, storage faults, and even full node loss — without manual intervention.