Proxmox High Availability (HA) automatically restarts virtual machines (VMs) and containers (CTs) on another node in a cluster when their original node fails. It relies on Corosync for cluster communication and quorum, and requires at least 3 nodes for reliable operation. HA is enabled per VM or container and depends on shared or replicated storage. It does not offer live failover with memory-state transfer; instead, a failed guest is restarted from disk on a healthy node, giving automated recovery with minimal downtime for critical workloads in a Proxmox cluster.
Why 3 Nodes Are Required for HA in Proxmox
1. Quorum
- Proxmox VE uses Corosync for cluster communication and quorum-based decision making.
- Quorum requires a majority of nodes to agree on the current state of the cluster.
- In a 2-node setup, losing just one node costs you quorum: the survivor cannot form a majority, so HA services stop (the majority rule exists precisely to prevent split-brain); a quick check is shown below.
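To check quorum on a running cluster, run `pvecm status` on any node. The excerpt below is illustrative only; your cluster name and vote counts will differ:

```
# Check cluster membership and quorum state (run on any node)
pvecm status

# Illustrative excerpt of the votequorum section on a healthy 3-node cluster:
#   Expected votes:   3
#   Total votes:      3
#   Quorum:           2
#   Flags:            Quorate
```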
2. Failover Logic
- Proxmox HA relies on the pve-ha-crm (Cluster Resource Manager) and pve-ha-lrm (Local Resource Manager) to detect failure and move services.
- With 3 nodes, if one fails:
- The other two still maintain quorum
- HA logic can decide to restart the failed VMs/CTs on healthy nodes, as the status sketch below shows
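You can watch this machinery with `ha-manager status`, which shows the current CRM master, the LRM state on each node, and every managed service. The output below is a rough sketch with example node names, not verbatim output:

```
# Show HA manager state: quorum, CRM master, per-node LRMs, managed services
ha-manager status

# Rough sketch of the output on a healthy 3-node cluster:
#   quorum OK
#   master pve1 (active, ...)
#   lrm pve1 (active, ...)
#   lrm pve2 (active, ...)
#   lrm pve3 (idle, ...)
#   service vm:100 (pve1, started)
```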
2-Node HA is NOT Recommended
- While a 2-node cluster is technically possible in Proxmox VE, HA on it is neither reliable nor recommended.
- You can add a QDevice (an external quorum vote) to supply the third vote, but that adds complexity and still has limitations; a setup sketch follows below.
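If you do take the QDevice route, the setup is roughly the following; the external host's address is an example, and it can be any small Linux machine outside the cluster:

```
# On the external quorum host (any small Debian box outside the cluster):
apt install corosync-qnetd

# On every cluster node:
apt install corosync-qdevice

# From one cluster node, register the QDevice (10.0.10.50 is an example IP):
pvecm qdevice setup 10.0.10.50
```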
Recommended Minimum for HA
Nodes | HA Capable | Notes |
---|---|---|
1 | No | Single node = no cluster or HA |
2 | No* | Possible with QDevice, but fragile |
3+ | Yes | Full support for HA and stable quorum logic |
Best practice: Start with 3 nodes and scale in odd numbers (e.g., 3, 5, 7) for quorum stability.
Cluster Hardware Overview
Component | Node 1 | Node 2 | Node 3 |
---|---|---|---|
CPU | Xeon / Ryzen (8+ cores) | Same | Same |
RAM | 64–128 GB ECC | Same | Same |
Boot Drive | 256–512 GB SSD (ZFS mirror recommended) | Same | Same |
VM Storage | Ceph OSD SSDs or NFS-backed drives | Same | Same |
Network NICs | 2x 1G + 2x 10G NICs | Same | Same |
Network Design
Network Role | Description | NIC Type | VLAN or Physical NIC |
---|---|---|---|
Management | Web GUI, SSH, API | 1G NIC | VLAN 10 or Physical |
Corosync | Cluster heartbeat traffic | 1G or 10G NIC | VLAN 20 or Physical |
VM/Storage LAN | Ceph, NFS, iSCSI, VM traffic | 10G NIC | VLAN 30 or Physical |
Backup LAN | Proxmox Backup Server, replication | Optional | VLAN 40 |
Best Practice: Use dedicated or VLAN-isolated networks for Corosync and Ceph/Storage traffic to avoid congestion and latency.
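As a concrete sketch, a node's `/etc/network/interfaces` might separate management and Corosync traffic like this; interface names, VLAN IDs, and addresses are assumptions to adapt to your hardware:

```
# /etc/network/interfaces (excerpt) -- all values are examples
auto vmbr0
iface vmbr0 inet static
    address 10.0.10.11/24        # management network (VLAN 10)
    gateway 10.0.10.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

auto eno2.20
iface eno2.20 inet static
    address 10.0.20.11/24        # dedicated Corosync VLAN 20
```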
Storage Design
Option 1: Ceph (Recommended for full HA)
- 3-node Ceph storage with 3 OSDs per node
- Use enterprise SSDs or NVMe (min. 2 TB per node)
- Replication: 3x (pool size = 3, min_size = 2)
- WAL/DB: with BlueStore, place the WAL/DB on separate fast SSD/NVMe partitions (dedicated journal SSDs apply only to the legacy FileStore backend); a CLI sketch follows below
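Bootstrapping such a pool from the CLI might look like this (the same steps are available in the GUI); the storage network and device paths are placeholders:

```
# On each node: install Ceph packages; initialize the cluster network once
pveceph install
pveceph init --network 10.0.30.0/24     # example storage VLAN

# Create a monitor on each of the three nodes
pveceph mon create

# Create OSDs on each node (device paths are examples)
pveceph osd create /dev/nvme0n1
pveceph osd create /dev/nvme1n1
pveceph osd create /dev/nvme2n1

# Create a 3x-replicated pool for VM disks
pveceph pool create vm-pool --size 3 --min_size 2
```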
Option 2: Shared NFS/iSCSI
- NFS or iSCSI from TrueNAS or similar high-availability NAS
- Accessible to all 3 nodes
- VMs stored on shared volume
- No local-only storage for HA VMs (see the pvesm example below)
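Attaching such a share takes a single `pvesm` call; because storage configuration is cluster-wide, running it once makes the share visible on every node. Storage ID, server address, and export path below are examples:

```
# Add an NFS share as cluster-wide storage (all values are examples)
pvesm add nfs truenas-vmstore \
    --server 10.0.30.5 \
    --export /mnt/tank/proxmox \
    --content images,rootdir
```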
Option 3: ZFS with Replication (Low-cost HA)
- Each node has local ZFS mirror
- Use ZFS replication, either manual or scheduled via Proxmox's built-in storage replication (pvesr)
- Enables semi-HA: failover works, but the restarted VM reverts to the last replicated snapshot, so writes since the last sync are lost (see the sketch below)
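Proxmox's built-in storage replication (`pvesr`) does the scheduling. The sketch below replicates VM 100 to a node named pve2 every 15 minutes; the VM ID, target node, and schedule are examples:

```
# Replicate VM 100's disks to node pve2 every 15 minutes (example values)
pvesr create-local-job 100-0 pve2 --schedule "*/15"

# List configured replication jobs and their last run status
pvesr status
```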
Fencing and Quorum
Feature | Description |
---|---|
Quorum | Needs 2 of 3 nodes online |
Corosync Rings | 2 (ring0 and ring1 for redundancy) |
Fencing | Watchdog-based self-fencing via the Proxmox HA stack (hardware watchdog or softdog) |
No STONITH | No external STONITH devices needed; a node that loses quorum fences itself |
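In `/etc/pve/corosync.conf`, the two rings appear as two addresses per node. The excerpt below is a sketch with example node names and IPs:

```
# /etc/pve/corosync.conf (excerpt) -- example node with redundant links
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.20.11   # primary Corosync link (VLAN 20)
    ring1_addr: 10.0.21.11   # second, independent link
  }
  # ...entries for pve2 and pve3 follow the same pattern
}
```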
VM HA Configuration
- Enable HA per VM in Datacenter > HA
- Group critical VMs into HA Groups with node preferences
- Avoid overcommitting all nodes with HA VMs
- Enable the nofailback option on the HA group if you don’t want VMs to migrate back once the failed node recovers (see the CLI sketch below)
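The same setup from the CLI might look like this; the group name, node priorities, and VM ID are examples:

```
# Create an HA group that prefers pve1, with pve2 as fallback (example values)
ha-manager groupadd critical --nodes "pve1:2,pve2:1" --nofailback 1

# Put VM 100 under HA management in that group
ha-manager add vm:100 --group critical --state started
```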
Backup Strategy
- Deploy Proxmox Backup Server on a separate node (physical, or a VM with external storage)
- Run daily incremental backups of HA-enabled VMs
- Backups stored on a ZFS dataset or external NAS (see the storage sketch below)
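Once the Backup Server is reachable, it attaches like any other storage and can be targeted with `vzdump`; the server address, datastore name, and storage ID below are placeholders (grab the certificate fingerprint from the PBS dashboard):

```
# Attach a Proxmox Backup Server datastore cluster-wide (example values;
# a password or API token must also be supplied, e.g. via --password)
pvesm add pbs pbs-backup \
    --server 10.0.40.5 \
    --datastore main \
    --username backup@pbs \
    --fingerprint <PBS-CERT-FINGERPRINT>

# Back up VM 100 manually (scheduled jobs live under Datacenter > Backup)
vzdump 100 --storage pbs-backup --mode snapshot
```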
Maintenance & Monitoring
Task | Frequency | Tools/Notes |
---|---|---|
Corosync link test | Monthly | `corosync-cfgtool -s`, `ping`, `traceroute` |
Disk health check | Weekly | `smartctl`, Ceph dashboard |
Backup restore test | Monthly | Restore to non-production node |
Resource usage monitor | Daily | Proxmox GUI, Nagios |
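The recurring checks in the table map to a handful of commands; device paths are examples:

```
# Corosync link health (run on each node)
corosync-cfgtool -s

# SMART health for a boot or OSD disk (device path is an example)
smartctl -a /dev/nvme0n1

# Overall Ceph health, if you run Ceph
ceph -s
```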
Configuration Checklist
- Proxmox VE installed and up to date on all 3 nodes
- Cluster created using `pvecm create` and `pvecm add` (see the sketch below)
- Corosync dual-ring configured
- Shared or replicated storage accessible on all nodes
- VMs created on shared storage (not local)
- HA groups defined for critical workloads
- Proxmox Backup Server connected and tested
- Monitoring and alerts configured
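For reference, creating the cluster with redundant Corosync links might look like this; the cluster name, node addresses, and link IPs are examples:

```
# On the first node: create the cluster with two Corosync links
pvecm create prod-cluster --link0 10.0.20.11 --link1 10.0.21.11

# On each additional node: join via the first node's address
pvecm add 10.0.20.11 --link0 10.0.20.12 --link1 10.0.21.12
```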
Get in touch with Saturn ME today for a free Proxmox consulting session—no strings attached.