I recently went through the arduous process of transplanting my home server into a new case, and with it, I decided to install some more storage. Not just any storage, but used enterprise SAS drives, complete with an LSI card I found on eBay for $40. I have very little hands-on experience with these drives, and I knew I was getting myself into some potentially painful troubleshooting by choosing to go this route, but what I didn't anticipate was the fix. After multiple hours of throwing software solutions at a hardware problem, I threw a Hail Mary in the form of some electrical tape over a couple of specific PCIe pins, and that was the miraculous fix I needed.
Power was the first problem
Not a good first symptom, but also not incredibly telling either
The first problem I ran into wasn't even really a problem, but it wasn't encouraging. On every attempted cold boot with the HBA card installed, the system would trip overcurrent protection and power cycle multiple times before eventually coming alive. That shouldn't happen, since my 650W PSU should be able to handle two SAS drives and an HBA card with no auxiliary power. This was the first sign of things to come, and it would undermine the rest of my troubleshooting efforts. I had no idea whether my PSU really was to blame; it was pretty old and could certainly be due for a replacement, but I wanted to get this system up and running sooner rather than later, and I didn't have a spare kicking around.
The HBA option ROM was the next roadblock
Or so I thought
Once the system actually booted, it would skip past the BIOS splash screen and go straight into the LSI card's option ROM. The ROM would show the drives enumerating, report the card as functional, and let me enter the configuration menu. If I let it try to boot into Proxmox, though, it would freeze on the following BIOS splash screen. Not a hard crash, no error, just a complete lock-up.
This made me suspect that I needed to wipe the option ROM off the card so that the system wouldn't attempt to boot from it. I first tried disabling CSM and blocking any avenue the system might take to boot from the ROM, but nothing worked. BIOS settings weren't holding between boots either, which is likely due to a dead CMOS battery.
Attempting to wipe the ROM took me down a multi-hour path of different methods to flash the card, with only one of them eventually working. The flashing tools fought me across four different environments before I finally cleared the ROM through a modified EFI shell and sas2flash. I reflashed the card with its SAS address intact, rebooted, watched the splash screen appear, and it hung in exactly the same spot. Back to square one.
What was actually going on with my HBA card
This could happen to you
Here's what was really going on. Tucked into the PCIe slot, alongside the high-speed data lanes everyone thinks about, sits a tiny, slow side-channel called the SMBus. It's a management bus, and it exists for housekeeping like reading a sensor or identifying a card, not for moving your data.
My HBA and my motherboard were both trying to use that bus during the earliest moments of POST, and they were stepping on each other badly enough to stall the whole boot. This doesn't happen to every card and motherboard combo, but it's common with specific LSI cards in consumer motherboards. My workstation PC would boot just fine with the card installed, which really confused me until I found a very informative video from Art of Server, showing a tape mod for my specific card and describing the failure mode I was experiencing.
What did I have to lose? Nothing else had worked, so I got out the electrical tape, cut a very small strip, and covered the pins described in the video. With that, I booted into Proxmox with my LSI card and SAS drives detected.
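If you want to confirm the same result from a Linux shell rather than eyeballing the Proxmox UI, here is a minimal Python sketch. It assumes `lspci` and `lsblk` are installed, that the card reports under the "Serial Attached SCSI controller" device class, and that drives behind it show `sas` in lsblk's transport column. The function names are my own and purely illustrative.

```python
import shutil
import subprocess


def find_sas_controllers(lspci_text):
    """Return lspci lines whose device class is a SAS controller."""
    return [
        line.strip()
        for line in lspci_text.splitlines()
        if "Serial Attached SCSI controller" in line
    ]


def find_sas_disks(lsblk_text):
    """Return disk names whose transport reads 'sas'.

    Expects the output of `lsblk -d -n -o NAME,TRAN`
    (one disk per line, name then transport).
    """
    disks = []
    for line in lsblk_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "sas":
            disks.append(fields[0])
    return disks


def run(cmd):
    """Run a command and return its stdout, or '' if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return ""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    controllers = find_sas_controllers(run(["lspci"]))
    disks = find_sas_disks(run(["lsblk", "-d", "-n", "-o", "NAME,TRAN"]))
    print("SAS controllers:", controllers or "none found")
    print("SAS disks:", disks or "none found")
```

An empty controller list after the tape mod would suggest the card still isn't being enumerated, which points back at seating or slot issues rather than the drives.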
The reason this works is very simple. The two SMBus pins on the connector, positions B5 and B6, are defined by PCI-SIG as optional, with no required behavior for whatever an add-in card hangs off them. Covering these two pins on the card's edge connector disconnects the card from the bus entirely, so there is nothing left to conflict with the motherboard at boot time, and the data lanes are untouched. Kapton tape is the preferred material for this, but I used normal electrical tape to good effect.
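To make the pin positions concrete, here is a small Python sketch of the first eleven side-B contacts as they appear in commonly published PCIe pinouts. Treat the table as my transcription rather than an authoritative reference, and check it against a pinout diagram for your own card before putting tape anywhere. The helper names are hypothetical.

```python
# First 11 contacts on side B of a PCIe edge connector, as listed in
# commonly published pinouts. Verify against your own card's documentation.
PCIE_SIDE_B = {
    "B1": "+12V",
    "B2": "+12V",
    "B3": "+12V",
    "B4": "GND",
    "B5": "SMCLK",      # SMBus clock
    "B6": "SMDAT",      # SMBus data
    "B7": "GND",
    "B8": "+3.3V",
    "B9": "JTAG TRST#",
    "B10": "+3.3V AUX",
    "B11": "WAKE#",
}

SMBUS_SIGNALS = {"SMCLK", "SMDAT"}


def pins_to_mask(pinout):
    """Return the contacts that carry SMBus signals, ordered by pin number."""
    return sorted(
        (pin for pin, signal in pinout.items() if signal in SMBUS_SIGNALS),
        key=lambda pin: int(pin[1:]),
    )


def neighbours_to_leave_exposed(pinout, masked):
    """Return the contacts directly beside the masked ones.

    These should stay uncovered, so the tape strip has to be narrow.
    """
    masked_numbers = {int(pin[1:]) for pin in masked}
    exposed = {}
    for number in sorted(masked_numbers):
        for candidate in (number - 1, number + 1):
            name = f"B{candidate}"
            if candidate not in masked_numbers and name in pinout:
                exposed[name] = pinout[name]
    return exposed


if __name__ == "__main__":
    masked = pins_to_mask(PCIE_SIDE_B)
    print("Cover:", masked)
    print("Leave exposed:", neighbours_to_leave_exposed(PCIE_SIDE_B, masked))
```

The second helper is there to underline why the strip has to be very small: the contacts on either side of the SMBus pair are grounds, and the tape shouldn't creep over them.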
Take a moment to read about the quirks of your hardware before diving in
This entire saga could've been avoided if I had done a little bit more research about the LSI card I bought. I had researched the drives quite heavily and knew what to expect, but neglected to do enough preliminary reading about potential quirks like this. This issue is quite common, and I could've saved myself many painful hours of troubleshooting. Nevertheless, I still don't regret going with SAS drives for my NAS, as the HBA gives me plenty of space for cheap enterprise storage in the future.
