Once you accumulate enough self-hosted services and reduce your reliance on someone else's cloud, a murky question starts to hang in the air. Sure, you save untold riches every month by not paying for subscriptions. Still, part of that subscription fee is towards server maintenance and resilience so that the cloud provider has near-perfect uptime, and you can access your data at any time you need.
The service provider handles this by shuffling resources around, load balancing, and having multiple failover modes so that running services can be replaced or shunted to another server in moments while data stays accessible. And do you know what? You can be that service provider, if you want, because Proxmox clusters can be set up in High Availability (HA) mode, so that three (or more!) nodes can work together as one server, and shuffle your virtual machines between them in case of any one node having hardware or software failures.
Just like those cloud providers, you'll need at least three Proxmox nodes to do this, plus one more network location to store the data they share. And, while you can play with the modes and get a sense of how things work in a virtual environment, to set the real thing up you need three physical servers, plus a data store. The easiest way is with three mini PCs, preferably with Intel NICs and CPUs, because that makes your life easier with virtualization in Proxmox.
5 reasons I virtualize OPNsense in Proxmox
It may seem strange, but virtualizing your router and firewall can be better than running it on bare metal.
Proxmox is as powerful as you want it to be
Have the resilience and reliability of a giant cloud provider, inside the comfort of your own home
If you've got your home lab set up, with high quality components designed for uninterrupted use, running on an uninterrupted power supply, and with as many error-checking and self-healing features like ECC RAM, RAID levels with disk redundancy and file systems like ZFS, you might be wondering why you'd go to all the trouble of setting a High Availability Proxmox cluster up in the first place. After all, your home lab is mostly always available and you're there to fix things if not.
But what if you needed to shut down part of your home lab to repair the hardware? Because you've been building out services, that might drop your DNS cache, firewall, router, or other self-hosted services. The same goes if you have a software issue, and need to swap out a VM, or fix a Docker configuration, and pull a new container down. Either way, your network will be partially down until you fix it, and that's not acceptable if you live with others.
That's where HA clusters come in. By installing three Proxmox nodes and clustering them together, they can act as a single unit, querying each other to reach a quorum on which nodes are available and shuffle VMs around if needed. You then set up shared storage on another device for VMs and containers, and point the services running on the cluster at those.
It's slightly painful to get running, and my next stage is to get it going on bare metal, but I can see how rewarding it will be for my home lab. I'll be able to swap out hardware without being asked when the internet will be back online, or fix configuration errors, then swap them into the running services so it looks like they never stopped at all. It's not perfect because the HA manager has a rough failover period of two minutes, so things could be out for that long, but that's less time than it takes to reboot the average router.
4 reasons you should run your own DNS server with Unbound
Upgrading your network with a self-hosted DNS server is one of the best improvements you can make
It's best for services you can't live without
Security, adblocking, DNS resolvers, and even your router all benefit
As you can imagine, this opens a world of uptime for your most necessary services. Common VMs for high-availability clusters include webservers, adblockers like Pi-hole, self-hosted DNS servers, uptime monitoring tools, search engines, and the routers or firewalls that run your network. None of these can afford to be down, not without a backup failover mode, and with HA Proxmox clusters, you have a failover mode in place.
If one of the nodes goes down, Proxmox knows what VMs or containers are running on it, and will wait long enough to see if it comes back online before transferring resources to one of the other nodes. It does this automatically, and for a small cluster you don't have to worry too much about the rules it uses for moving VMs. And it's not exactly like you have multiple copies of the same server, because you might have those VMs spread out, so there's no single point of failure, for however short a period before the HA manager rights the ship.
5 reasons you should put some of your self-hosted services on different machines
Don't put all your eggs in one basket
Proxmox High Availability is cool, but it's an expensive upgrade for your home lab
Home labs are one thing. You can take network resources or VMs down for maintenance whenever you feel like it because you're the only user. However, setting up high-availability clusters is a way to get used to a corporate IT environment. Even in a home setup, you should still strive for zero unplanned downtime, because those are never fun times.
Getting used to clusters gives you insight into how quorum works between nodes, how storage should be planned for, and how in-transit data is handled when the server resource it's written to goes down. It's good for your home network, it's good for learning opportunities, and it gives you another reason to add more network equipment to your home lab. All of that sounds like a win to me.
