I think the board has reached the end of the road. 😅
Iiinteresting. I’m on the larger AB350-Gaming 3 and it’s got REV: 1.0 printed on it. No problems with the 5950X so far. 🤐 Either sheer luck or there could have been updated units before they officially changed the rev marking.
On paper it should support it. I’m assuming it’s the ASRock AB350M. With a certain BIOS version of course. What’s wrong with it?
B350 isn’t a very fast chipset to begin with
For sure.
I’m willing to bet the CPU in such a motherboard isn’t exactly current-gen either.
Reasonable bet, but it’s a Ryzen 9 5950X with 64GB of RAM. I’m pretty proud of how far I’ve managed to stretch this board. 😆 At this point I’m waiting for blown caps, but the case temp is pretty low so it may end up trucking along for a surprisingly long time.
Are you sure you’re even running at PCIe 3.0 speeds too?
So given the CPU, it should be PCIe 3.0, but that doesn’t remove any of the queues/scheduling suspicions for the chipset.
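For anyone who wants to check the negotiated link speed rather than infer it from the CPU, here is a minimal sketch. It assumes a Linux host where the kernel exposes `current_link_speed`, `max_link_speed`, `current_link_width` and `max_link_width` under `/sys/bus/pci/devices`; the function names are made up for illustration. It lists devices whose link is currently below what they advertise.

```python
from pathlib import Path

LINK_ATTRS = ("current_link_speed", "max_link_speed",
              "current_link_width", "max_link_width")


def read_link(dev_dir):
    """Read the four link attributes of one PCI device, or None if any is missing."""
    values = {}
    for attr in LINK_ATTRS:
        try:
            values[attr] = (Path(dev_dir) / attr).read_text().strip()
        except OSError:
            return None  # not a PCIe device, or the attribute is not exposed
    return values


def leading_number(text):
    """Parse the leading number of strings like '8.0 GT/s PCIe' or '4'."""
    try:
        return float(text.split()[0])
    except (IndexError, ValueError):
        return None  # e.g. 'Unknown'


def downgraded_links(root="/sys/bus/pci/devices"):
    """List (address, attrs) for devices linked below their advertised maximum."""
    root = Path(root)
    if not root.is_dir():
        return []  # assumption: non-Linux host or sysfs not mounted
    found = []
    for dev in sorted(root.iterdir()):
        link = read_link(dev)
        if link is None:
            continue
        nums = {key: leading_number(value) for key, value in link.items()}
        if None in nums.values():
            continue
        if (nums["current_link_speed"] < nums["max_link_speed"]
                or nums["current_link_width"] < nums["max_link_width"]):
            found.append((dev.name, link))
    return found


if __name__ == "__main__":
    for address, link in downgraded_links():
        print(address, link)
```

A device showing up here isn't automatically a problem: some devices drop their link speed when idle to save power, so it's worth checking under load.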
I’m now replicating data out of this pool and the read load looks perfectly balanced. Bandwidth’s fine too. I think I have no choice but to benchmark the disks individually outside of ZFS once I’m done with this operation in order to figure out whether any show problems. If not, they’ll go in the spares bin.
You might be right about the link problem.
Looking at the B350 diagram, the whole chipset is hooked to the CPU via a PCIe 3.0 x4 link. The other pool (the source) is hooked via a USB controller on the chipset. The SATA controller is also on the chipset, so it shares the chipset-CPU link too. I’m pretty sure I’m also using all the PCIe links the chipset provides for SSDs. So that’s roughly 4GB/s total for the whole chipset. Now I’m probably not saturating the whole link in this particular workload, but perhaps there might be another related bottleneck.
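As a rough sanity check on that figure, here is a back-of-the-envelope sketch. It assumes PCIe 3.0's 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so the real ceiling is somewhat lower. The function names are illustrative, and the 293 MiB/s is simply the pool write bandwidth from the iostat output in this thread, treated as if all of it crossed the chipset link.

```python
def pcie_usable_gbytes_per_s(gigatransfers_per_s, lanes,
                             payload_bits=128, encoded_bits=130):
    """Line-rate ceiling in GB/s after encoding overhead (no protocol overhead)."""
    raw_gbits = gigatransfers_per_s * lanes
    return raw_gbits * payload_bits / encoded_bits / 8


def link_share(traffic_mib_per_s, link_gbytes_per_s):
    """Fraction of the link used by a given traffic rate in MiB/s."""
    traffic_gbytes = traffic_mib_per_s * 1024 * 1024 / 1e9
    return traffic_gbytes / link_gbytes_per_s


chipset_link = pcie_usable_gbytes_per_s(8.0, 4)  # PCIe 3.0 x4, about 3.94 GB/s
share = link_share(293, chipset_link)            # assumption: all pool writes cross the link
print(f"{chipset_link:.2f} GB/s ceiling, {share:.1%} of it used by 293 MiB/s")
```

Even with the source pool's reads added on top, this workload sits well below the link's raw ceiling, which is consistent with suspecting something other than plain bandwidth.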
Turns out the on-CPU SATA controller isn’t available when the NVMe slot is used. 🫢 Swapped SATA ports, no diff. Put the low IOPS disk in a good USB 3 enclosure, hooked to an on-CPU USB controller. Now things are flipped:
                                        capacity     operations     bandwidth
pool                                  alloc   free   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
storage-volume-backup                 12.6T  3.74T      0    563      0   293M
  mirror-0                            12.6T  3.74T      0    563      0   293M
    wwn-0x5000c500e8736faf                -      -      0    406      0   146M
    wwn-0x5000c500e8737337                -      -      0    156      0   146M
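One way to read those numbers: both disks take the same write bandwidth, so the disk with fewer operations is simply doing larger writes on average. A small sketch of that arithmetic follows. It assumes the iostat suffixes are binary units (K = 1024 bytes and so on), and the helper names are made up for illustration.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert an iostat-style value such as '146M' to bytes (binary units assumed)."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def avg_write_kib(write_ops, write_bandwidth):
    """Average size of one write in KiB, given ops/s and bandwidth per second."""
    if write_ops == 0:
        return 0.0
    return parse_size(write_bandwidth) / write_ops / 1024


# Per-disk write ops and bandwidth, copied from the output above.
disks = {
    "wwn-0x5000c500e8736faf": (406, "146M"),
    "wwn-0x5000c500e8737337": (156, "146M"),
}
for name, (ops, bandwidth) in disks.items():
    print(f"{name}: {avg_write_kib(ops, bandwidth):.0f} KiB per write")
```

The figures are rounded samples, so treat the result as a ballpark: roughly 370 KiB per write on one disk versus roughly 960 KiB on the other. That points at a difference in how writes get merged or queued on each path, not at one disk moving less data.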
Interesting. SMART looks pristine on both drives. Brand new drives - Exos X22. Doesn’t mean there isn’t an impending problem of course. I might try shuffling the links, as the other comment suggested, to see if that changes the behaviour. Both are currently hooked to the AMD B350 chipset SATA controller. There are two ports that should be hooked to the on-CPU SATA controller. I imagine the two SATA controllers don’t share bandwidth. I’ll try putting one disk on the on-CPU controller.
Don’t let yourself be called a hypocrite - give $5. 😆
Here’s a visual inspiration:
Yes, yes I would use ZFS if I had only one file on my disk.
OK, I think it may have to do with the odd number of data drives. If I create a raidz2 with 4 of the 5 disks, even with ashift=12, recordsize=128K, the performance in sequential single-threaded read is stellar. What’s not clear is why this doesn’t affect the 4x 8TB-drive raidz1, or at least not as much.
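To make the odd-data-drive idea concrete, here is a tiny sketch of how one record divides across the data disks. It assumes full, uncompressed 128K records and 4K sectors (ashift=12), and it ignores raidz padding and parity placement. The function name is just for illustration.

```python
def split_record(disks, parity, recordsize=128 * 1024, ashift=12):
    """Return (whole sectors per data disk, leftover sectors) for one record."""
    sector = 1 << ashift              # ashift=12 means 4096-byte sectors
    sectors = recordsize // sector    # 128K / 4K = 32 sectors
    data_disks = disks - parity
    return divmod(sectors, data_disks)


for label, disks, parity in (("5-disk raidz2", 5, 2),
                             ("4-disk raidz2", 4, 2),
                             ("4-disk raidz1", 4, 1)):
    per_disk, leftover = split_record(disks, parity)
    print(f"{label}: {per_disk} sectors per data disk, {leftover} left over")
```

With 3 data disks the 32 sectors don't divide evenly, with 2 they do. The 4-disk raidz1 also has 3 data disks, so by this arithmetic alone it should show the same unevenness, which is exactly the part that remains unexplained.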
Found the bit counter
Here’s the box test thread if you’re curious. 😊
I think I’ve seen this hypothesis too and it makes sense to me.
If I’m building a new AMD system today, I’d look for a board that exposes more of the chipset-provided USB ports. Otherwise I’d budget for a high quality 4-port PCIe USB controller, if I’m planning to rely a lot on USB on that system.
This article provides some context. Now I do have the latest firmware, which should include these fixes, but they don’t seem to be foolproof. I’ve seen reports around the web that the firmware improves things but doesn’t completely eliminate the problem.
If you’ve seen devices disconnecting and reconnecting on occasion, it could be it.
I’ve been on the USB train since 2019.
You’re exactly right, you gotta get devices with good USB-to-SATA chipsets, and you gotta keep them cool.
I’ve been using a mix of WD Elements, WD MyBook and StarTech/Vantec enclosures (ASM1351). I’ve had to cool the USB-to-SATA chipsets on all the WD units because WD bolts the PCB straight to the drive, so the chipset picks up the drive’s heat.
From all my testing I’ve discovered that:
I like this box in particular because it uses a very straightforward design. It’s got 4x ASM235CM with cooling connected to a VIA hub. It’s got a built-in power supply, fan, it even comes with good cables. It fixes a lot of the system variables to known good values. You’re left with connecting it to a good USB host controller.
I thought about it, but it typically requires extra PCIe cards that I can’t rely on as there’s no space in one of the machines and no PCIe slots in the other. That’s why I did a careful search till I stumbled upon this particular enclosure and then I tested one with ZFS for over a week before buying the rest.
You want ASMedia ASM1351 (heatsinked) or ASM235CM on the device side 🥹
This box has 4x ASM235CM and from the testing I’ve conducted over the last week it seems rock solid, so long as it’s not connected to the Ryzen’s built-in USB controller. It’s been flawless on the B350 chipset’s USB controller.
Thanks for the warning ⚠️🙏
This isn’t my first rodeo with ZFS on USB. I’ve been running USB for a few years now. Recently I ran this particular box through a battery of tests and I’m reasonably confident that with my particular set of hardware it’ll be fine. It passed everything I threw at it, once connected to a good port on my machine. But you’re generally right and as you can see I discussed that in the testing thread, and I encountered some issues that I managed to solve. If you think I’ve missed something specific - let me know! 😊
This is one of those situations when you just nod and take the endorsement.