xyzzy123 14 hours ago
Not a pro data guy, but I've been running something like what you're describing for many years. These days 200TiB is "normal storage server" territory, not anything exotic. You can just do the most boring thing and it will be fine. I'm only running one box, though.
The hard part is making it efficient, quiet and cheap, which always feels like an impossible triangle. Yes, resilvers will take around 24h if your pool is getting full, but with RAIDZ2 that's not too scary.
I'm running TrueNAS SCALE. I used to just use Ubuntu (more flexible!) but over the years I had some bad upgrades where the kernel and ZFS stopped being friends.
My rack is pretty nearby, so a big 4U case with 120mm front fans was a high priority. It has a good noise profile if you swap the stock fans for Noctuas: you get a constant "whoosh" rather than a whine.
I'm running 8+2 with 24TB drives. I used to run with 20 slots full of old ex-cloud SAS drives, but that's more heat, noise and power. You also lose flexibility if you don't have free slots. So I eventually ponied up for 24TB disks. It hurt my wallet but greatly reduced noise and power.
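For anyone who wants to sanity-check the numbers above, here is a rough back-of-envelope sketch in Python. It assumes "8+2" means a single RAIDZ2 vdev with 8 data and 2 parity disks, and it ignores metadata, padding and reserved space, so real usable capacity will be lower. The 90% fill level and the 250 MB/s sustained resilver rate are illustrative assumptions, not measurements from this box.

```python
TB = 10**12   # drive vendors quote decimal terabytes
TiB = 2**40   # ZFS tools report binary tebibytes

def raidz_data_capacity_tib(data_disks: int, drive_tb: float) -> float:
    """Raw data capacity of one RAIDZ vdev, ignoring metadata, padding and slop space."""
    return data_disks * drive_tb * TB / TiB

def resilver_hours(drive_tb: float, fill_fraction: float, mb_per_s: float) -> float:
    """Hours to rewrite the allocated part of one drive at a sustained rate (assumed constant)."""
    allocated_bytes = drive_tb * TB * fill_fraction
    return allocated_bytes / (mb_per_s * 10**6) / 3600

if __name__ == "__main__":
    # 8 data disks of 24TB each (the 2 parity disks add no capacity)
    print(f"data capacity: {raidz_data_capacity_tib(8, 24):.1f} TiB")
    # Assumed: drive 90% allocated, 250 MB/s sustained
    print(f"resilver time: {resilver_hours(24, 0.9, 250):.1f} h")
```

With those assumed inputs the resilver estimate lands at about a day, which is in line with the 24h figure for a nearly full pool. A real resilver on a busy or fragmented pool can be a lot slower.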
It's a super old box but it does fine: it will max out 10GbE for sequential and do 10k write IOPS / 1k random read IOPS without problems. Not great, not terrible. You don't really need the SLOG unless you plan to run VMs or databases off it.
I personally try to keep no more than 10 of the 20 slots in use. This leaves some flexibility for expansion, auxiliary pools, etc. You often find you need twice as much storage as you plan to use directly: for upgrades, snapshots, transfers, ad-hoc stuff, etc.
Re: dedup, I would personally look to dedup at the application layer rather than in the filesystem if I possibly could. If you are running custom archiving software then it's something you'd want to handle within that. It depends on the data obviously, but it's going to be more predictable, and you understand your data best. I don't have ZFS dedup turned on, but for a 200TiB pool with 128k blocks, the ZFS DDT will want something like 500GiB of RAM. Which is NOT cheap in 2026.
I also run a 7-node Ceph cluster "for funsies". I love the flexibility of it... but I don't think Ceph truly makes sense until you have multiple racks or hard 24/7 requirements.
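The 500GiB DDT figure can be reproduced with a quick worst-case estimate. This sketch assumes every block in a full pool is unique and that each dedup table entry costs about 320 bytes in RAM. That per-entry size is a commonly quoted rule of thumb, not something measured here, and the real cost depends on the OpenZFS version and on how much of the table actually has to stay in memory.

```python
TiB = 2**40
GiB = 2**30
KiB = 2**10

# Assumption: rule-of-thumb in-memory size of one DDT entry, not a measured value.
BYTES_PER_DDT_ENTRY = 320

def ddt_ram_gib(pool_tib: int, block_kib: int,
                bytes_per_entry: int = BYTES_PER_DDT_ENTRY) -> float:
    """Worst-case DDT size in GiB: a full pool where every block is unique."""
    blocks = pool_tib * TiB // (block_kib * KiB)
    return blocks * bytes_per_entry / GiB

if __name__ == "__main__":
    print(f"{ddt_ram_gib(200, 128):.0f} GiB")  # prints "500 GiB"
```

Smaller blocks make it worse in direct proportion: the same pool at 64k blocks would need twice as many entries.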