| ▲ | userbinator a day ago |
| One key point about retention which is not often mentioned, and indeed neither does this article, is that retention decreases as program/erase cycles accumulate and decreases exponentially with increasing temperature. Hence retention specs are usually X amount of time after Y cycles at Z temperature. Even a QLC SSD that has only been written to once, and kept in a freezer at -40, may hold data for several decades. Manufacturers have been playing this game with DWPD/TBW numbers too --- by reducing the retention spec, they can advertise a drive as having a higher endurance with the exact same flash. But if you compare the numbers over the years, it's clear that NAND flash has gotten significantly worse; the only thing that has gone up, multiplicatively, is capacity, while endurance and retention have both gone down by a few orders of magnitude. For a long time, 10 years after 100K cycles was the gold standard of SLC flash. Now we are down to several months after less than 1K cycles for QLC. |
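(To put rough numbers on the temperature point above: charge loss in NAND is usually modeled with an Arrhenius acceleration factor. A minimal sketch follows, assuming an activation energy of 1.1 eV, a commonly quoted figure for NAND retention; real parts vary.)

    # Rough Arrhenius model for how fast NAND charge loss proceeds at one
    # temperature versus another. Ea = 1.1 eV is an assumption, not a spec
    # for any particular part.
    import math

    K_B = 8.617e-5  # Boltzmann constant, eV/K

    def acceleration_factor(t_ref_c, t_c, ea_ev=1.1):
        """Relative speed of charge loss at t_c versus a reference t_ref_c (deg C)."""
        t_ref, t = t_ref_c + 273.15, t_c + 273.15
        return math.exp(ea_ev / K_B * (1.0 / t_ref - 1.0 / t))

    print(acceleration_factor(25, 55))    # ~50: charge loss ~50x faster at 55 C than at 25 C
    print(acceleration_factor(25, -40))   # ~7e-6: roughly five orders of magnitude slower in a freezer

By this model, a drive stored at 55 C ages roughly 50x faster than at 25 C, while at -40 C the process is several orders of magnitude slower, which is why retention specs always pin a temperature.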
|
| ▲ | londons_explore 18 hours ago | parent | next [-] |
| I'm sad that drives don't have a 'shutdown' command which writes a few extra bytes of ECC data per page into otherwise empty flash cells. It turns out that a few extra bytes can turn one year of retention into a hundred years. |
| |
| ▲ | adrian_b 16 hours ago | parent | next [-] | | There are programs with which you can add any desired amount of redundancy to your backup archives, so that they can survive any corruption that affects no more data than the added redundancy. For instance, on Linux there is par2cmdline. For all my backups, I create pax archives, which are then compressed, then encrypted, then expanded with par2create, then aggregated again into a single pax file (the legacy tar file formats are not good for faithfully storing all the metadata of modern file systems, and each tar program may have its own proprietary, non-portable extensions to handle this, therefore I use only the pax file format). Besides that, important data should be replicated and stored on 2 or even 3 SSDs/HDDs/tapes, which should preferably themselves be stored in different locations. | | |
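A minimal sketch of a pipeline like the one described above, driving the same tools from Python via subprocess. The paths are placeholders, the final re-aggregation into a single pax file is omitted, and the exact invocations (pax -w -x pax, xz, gpg --symmetric, par2create -r) should be checked against your local man pages before relying on them:

    import subprocess

    def make_redundant_archive(src_dir, out_base, redundancy_pct=10):
        """Archive -> compress -> encrypt -> add PAR2 recovery data (sketch)."""
        archive = out_base + ".pax.xz.gpg"
        with open(archive, "wb") as fh:
            pax = subprocess.Popen(["pax", "-w", "-x", "pax", src_dir],
                                   stdout=subprocess.PIPE)
            xz = subprocess.Popen(["xz", "-9"], stdin=pax.stdout,
                                  stdout=subprocess.PIPE)
            # gpg prompts for a passphrase here; use a key/agent in real use
            gpg = subprocess.Popen(["gpg", "--symmetric", "--output", "-"],
                                   stdin=xz.stdout, stdout=fh)
            pax.stdout.close()
            xz.stdout.close()
            pax.wait()
            xz.wait()
            if gpg.wait() != 0:
                raise RuntimeError("encryption step failed")
        # Add ~redundancy_pct% of Reed-Solomon recovery data next to the archive.
        subprocess.run(["par2create", "-r%d" % redundancy_pct, archive], check=True)

    # usage (hypothetical paths):
    make_redundant_archive("/home/me/documents", "/mnt/backup/documents-2025")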
| ▲ | antonkochubey 15 hours ago | parent | next [-] | | Unfortunately, some SSD controllers plainly refuse to return data they consider corrupted; even if you have extra parity that could potentially restore the corrupted data, your entire drive might refuse to read. | | |
| ▲ | lazide 14 hours ago | parent [-] | | Huh? The issue being discussed is random blocks, yes? If your entire drive is bricked, that is an entirely different issue. | | |
| ▲ | jeremyvisser 14 hours ago | parent [-] | | Here’s the thing. That SSD controller is the interface between you and those blocks. If it decides, by some arbitrary measurement, as defined by some logic within its black box firmware, that it should stop returning all blocks, then it will do so, and you have almost no recourse. This is a very common failure mode of SSDs. As a consequence of some failed blocks (likely exceeding a number of failed blocks, or perhaps the controller’s own storage failed), drives will commonly brick themselves. Perhaps you haven’t seen it happen, or your SSD doesn’t do this, or perhaps certain models or firmwares don’t, but some certainly do, both from my own experience, and countless accounts I’ve read elsewhere, so this is more common than you might realise. | | |
| ▲ | cogman10 5 hours ago | parent | next [-] | | I really wish this responsibility was something hoisted up into the FS and not a responsibility of the drive itself. It's ridiculous (IMO) that SSD firmware is doing so much transparent work just to keep the illusion that the drive is actually spinning metal with similar sector write performance. | | |
| ▲ | immibis 2 hours ago | parent [-] | | Linux supports raw flash, called an MTD device (memory technology device). It's often used in embedded systems. And it has MTD-native filesystems such as ubifs. But it's only really used in embedded systems because... PC SSDs don't expose that kind of interface. (Nor would you necessarily want them to. A faulty driver would quietly brick your hardware in a matter of minutes to hours) |
| |
| ▲ | londons_explore 8 hours ago | parent | prev | next [-] | | The mechanism is usually that the SSD controller requires that some work be done before your read - for example rewriting some access tables to record 'hot' data. That work can't be done because there are no free blocks. However, no space can be freed up because every spare writable block is bad or is in some other unusable state. The drive is therefore dead - it will enumerate, but neither read nor write anything. | |
| ▲ | reactordev 13 hours ago | parent | prev | next [-] | | This is correct, you still have to go through firmware to gain access to the block/page on “disk”, and if the firmware decides the block is invalid then it fails. You can sidestep this by bypassing the controller on a test bench though, pinning wires to the chips. At that point it’s no longer an SSD. | |
| ▲ | lazide 10 hours ago | parent | prev [-] | | Yes, and? HDD controllers dying and head crashes are a thing too. At least in the ‘bricked’ case it’s a trivial RMA - corrupt blocks tend to be a harder fight. And since ‘bricked’ is such a trivial RMA, manufacturers have more of an incentive to fix it or go broke, or avoid it in the first place. This is why backups are important now; and always have been. | | |
| ▲ | mort96 8 hours ago | parent [-] | | We're not talking about the SSD controller dying. The SSD controller in the hypothetical situation that's being described is working as intended. |
|
|
|
| |
| ▲ | mywittyname 5 hours ago | parent | prev | next [-] | | This is fine, but I'd prefer an option to transparently add parity bits to the drive, even if it means losing access to capacity. Personally, I keep backups of critical data on a platter disk NAS, so I'm not concerned about losing critical data off of an SSD. However, I did recently have to reinstall Windows on a computer because of a randomly corrupted system file. Which is something this feature would have prevented. | |
| ▲ | casenmgreen 4 hours ago | parent | prev [-] | | Thank you for this. I had no knowledge of pax, or that par was an open standard, and I care about what they help with. Going to switch over to using both in my backups. |
| |
| ▲ | consp 18 hours ago | parent | prev | next [-] | | Blind question with no attempt to look it up: why don't filesystems do this? It won't work for most boot code but that is relatively easy to fix by plugging it in somewhere else. | | |
| ▲ | lxgr 16 hours ago | parent | next [-] | | Wrong layer. SSDs know which blocks have been written to a lot, have been giving a lot of read errors before etc., and often even have heterogeneous storage (such as a bit of SLC for burst writing next to a bunch of MLC for density). They can spend ECC bits much more efficiently with that information than a file system ever could, which usually sees the storage as a flat, linear array of blocks. | |
| ▲ | adrian_b 15 hours ago | parent | next [-] | | This is true, but nevertheless you cannot place your trust only in the manufacturer of the SSD/HDD, as I have seen enough cases when the SSD/HDD reports no errors, but nonetheless it returns corrupted data. For any important data you should have your own file hashes, for corruption detection, and you should add some form of redundancy for file repair, either with a specialized tool or simply by duplicating the file on separate storage media. A database with file hashes can also serve other purposes than corruption detection, e.g. it can be used to find duplicate data without physically accessing the archival storage media. | | |
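A minimal sketch of such a hash database using only the Python standard library; the manifest layout (relative path mapped to SHA-256) is just one possible choice, not a reference to any particular tool:

    import hashlib, json, os

    def hash_tree(root):
        """Record a SHA-256 digest for every file under root.
        Re-running this later and diffing the output detects silent corruption;
        grouping paths by digest finds duplicates without touching the media again."""
        manifest = {}
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                h = hashlib.sha256()
                with open(path, "rb") as fh:
                    for chunk in iter(lambda: fh.read(1 << 20), b""):
                        h.update(chunk)
                manifest[os.path.relpath(path, root)] = h.hexdigest()
        return manifest

    if __name__ == "__main__":
        print(json.dumps(hash_tree("."), indent=2, sort_keys=True))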
| ▲ | lxgr 15 hours ago | parent [-] | | Verifying at higher layers can be ok (it's still not ideal!), but trying to actively fix things below that are broken usually quickly becomes a nightmare. |
| |
| ▲ | DeepSeaTortoise 14 hours ago | parent | prev [-] | | IMO it's exactly the right layer, just like for ECC memory. There's a lot of potential for errors when the storage controller processes the data and turns it into analog magic to transmit it. In practice, this is a solved problem, but only until someone makes a mistake; then there will be a lot of trouble debugging it, between the manufacturer certainly denying their mistake and people getting caught up on the usual suspects. Doing all the ECC stuff right on the CPU gives you all the benefits against bitrot and resilience against all errors in transmission for free. And if all things go just right we might even be getting better instruction support for ECC stuff. That'd be a nice bonus. | |
| ▲ | lxgr 14 hours ago | parent | next [-] | | > There's a lot of potential for errors when the storage controller processes and turns the data into analog magic to transmit it. That's a physical layer, and as such should obviously have end-to-end ECC appropriate to the task. But the error distribution shape is probably very different from that of bytes in NAND data at rest, which is different from that of DRAM and PCI again. For the same reason, IP does not do error correction, but rather relies on lower layers to present error-free datagram semantics to it: Ethernet, Wi-Fi, and (managed-spectrum) 5G all have dramatically different properties that higher layers have no business worrying about. And sticking with that example, once it becomes TCP's job to handle packet loss due to transmission errors (instead of just congestion), things go south pretty quickly. | | |
| ▲ | johncolanduoni 11 hours ago | parent [-] | | > And sticking with that example, once it becomes TCP's job to handle packet loss due to transmission errors (instead of just congestion), things go south pretty quickly. Outside of wireless links (where FEC of some degree is necessary regardless) this is mostly because TCP’s checksum is so weak. QUIC for example handles this much better, since the packet’s authenticated encryption doubles as a robust error detecting code. And unlike TLS over TCP, the connection is resilient to these failures: a TCP packet that is corrupted but passes the TCP checksum will kill the TLS connection on top of it instead of retransmitting. | | |
| ▲ | lxgr 11 hours ago | parent [-] | | Ah, I meant go south in terms of performance, not correctness. Most TCP congestion control algorithms interpret loss exclusively as a congestion signal, since that's what most lower layers have historically presented to it. This is why newer TCP variants that use different congestion signals can deal with networks that violate that assumption better, such as e.g. Starlink: https://blog.apnic.net/2024/05/17/a-transport-protocols-view... Other than that, I didn't realize that TLS has no way of just retransmitting broken data without breaking the entire connection (and a potentially expensive request or response with it)! Makes sense at that layer, but I never thought about it in detail. Good to know, thank you. |
|
| |
| ▲ | johncolanduoni 12 hours ago | parent | prev [-] | | ECC memory modules don’t do their own very complicated remapping from linear addresses to physical blocks like SSDs do. ECC memory is also oriented toward fixing transient errors, not persistently bad physical blocks. |
|
| |
| ▲ | londons_explore 17 hours ago | parent | prev | next [-] | | The filesystem doesn't have access to the right existing ECC data to be able to add a few bytes to do the job. It would need to store a whole extra copy. There are potentially ways a filesystem could use hierarchical ECC to just store a small percentage extra, but it would be far from theoretically optimal and rely on the fact that just a few logical blocks of the drive become unreadable, and that those logical blocks aren't correlated in write time (which I imagine isn't true for most SSD firmware). | |
| ▲ | mrspuratic 13 hours ago | parent | next [-] | | CD storage has an interesting take: the available sector size varies by use, i.e. audio or MPEG1 video (VideoCD) at 2352 data octets per sector (with two media-level ECCs), versus actual data at 2048 octets per sector, where the extra EDC/ECC can be exposed by reading "raw". I learned this the hard way with VideoPack's malformed VCD images; I wrote a tool to post-process the images to recreate the correct EDC/ECC per sector. Fun fact: ISO9660 stores file metadata simultaneously in big-endian and little-endian form (AFAIR VP used to fluff that up too). | |
| ▲ | xhkkffbf 9 hours ago | parent [-] | | Octets? Don't you mean "bytes"? Or is that word problematic now? | | |
| ▲ | theragra 7 hours ago | parent | next [-] | | I wonder if OP used "octets" because the physical pattern in the CD used to represent a byte is a sequence of 17 pits and lands. BTW, byte size has historically varied from 4 to 24 bits! Even now, depending on interpretation, you can say 16-bit bytes exist: the char type can be 16 bits on some DSP systems. I was curious, so I checked; before this comment, I only knew about 7-bit bytes. | |
| ▲ | asveikau 6 hours ago | parent | prev | next [-] | | The term octets is pretty common in network protocol RFCs, maybe their vocabulary is biased in the direction of that writing. | |
| ▲ | ralferoo 7 hours ago | parent | prev [-] | | Personally, I prefer the word "bytes", but "octets" is technically more accurate as there are systems that use differently sized bytes. A lot of these are obsolete, but there are also current examples: in most FPGAs that provide SRAM blocks, the memory is actually arranged 9, 18 or 36 bits wide, with the expectation that you'll use the extra bits for parity or flags of some kind. |
|
| |
| ▲ | lazide 14 hours ago | parent | prev [-] | | Reed-Solomon codes, or forward error correction more generally, are what you’re discussing. All modern drives do it at low levels anyway. It would not be hard for a COW file system to use them, but it can easily get out of control paranoia-wise. Ideally you’d need them for every bit of data, including metadata. That said, I did have a computer that randomly flipped bits when writing to storage sometimes (eventually traced it to an iffy power supply), and PAR (a Reed-Solomon-based forward error correction tool) worked great for getting a working backup off the machine. Every other thing I tried would end up with at least a couple of bit flip errors per GB, which made it impossible. |
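For anyone who wants to see what that looks like, here is a toy Reed-Solomon round trip using the third-party Python package reedsolo (pip install reedsolo); the parameters are purely illustrative and have nothing to do with what SSD firmware or PAR2 actually uses:

    from reedsolo import RSCodec   # pip install reedsolo

    rsc = RSCodec(16)              # 16 parity bytes -> corrects up to 8 corrupted bytes per block
    block = bytes(range(64))       # stand-in for a data block
    protected = rsc.encode(block)  # original data with parity appended

    damaged = bytearray(protected) # simulate bit rot by flipping a few bytes
    damaged[3] ^= 0xFF
    damaged[40] ^= 0x55

    # recent reedsolo versions return (data, data+ecc, errata positions)
    repaired = rsc.decode(bytes(damaged))[0]
    assert bytes(repaired) == block

With 16 parity bytes the code corrects up to 8 corrupted bytes at unknown positions; PAR2 applies the same idea at much larger block sizes across whole files.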
| |
| ▲ | DeepSeaTortoise 17 hours ago | parent | prev | next [-] | | You can still do this for boot code if the error isn't significant enough to make all of the boot fail. The "fixing it by plugging it in somewhere else" could then also be simple enough to the point of being fully automated. ZFS has "copies=2", but iirc there are no filesystems with support for single disk erasure codes, which is a huge shame because these can be several orders of magnitude more robust compared to a simple copy for the same space. | | | |
| ▲ | immibis an hour ago | parent | prev [-] | | You can, but only if your CPU is directly connected to a flash chip with no controller in the way. Linux calls it the mtd subsystem (memory technology device). |
| |
| ▲ | victorbjorklund 17 hours ago | parent | prev | next [-] | | That does sound like a good idea (even if I’m sure some very smart people know why it would be a bad idea) | |
| ▲ | alex_duf 14 hours ago | parent | prev [-] | | I guess the only way to do this today is with a raid array? |
|
|
| ▲ | ACCount37 18 hours ago | parent | prev | next [-] |
| Because no one is willing to pay for SLC. Those QLC NAND chips? Pretty much all of them have an "SLC mode", which treats each cell as 1 bit, and increases both write speeds and reliability massively. But who wants to have 4 times less capacity for the same price? |
| |
| ▲ | userbinator 18 hours ago | parent | next [-] | | 4 times less capacity but 100x or more endurance or retention at the same price looks like a great deal to me. Alternatively: do you want to have 4x more capacity at 1/100th the reliability? Plenty of people would be willing to pay for SLC mode. There is an unofficial firmware hack that enables it: https://news.ycombinator.com/item?id=40405578 1TB QLC SSDs are <$100 now. If the industry was sane, we would have 1TB SLC SSDs for less than $400, or 256GB ones for <$100, and in fact SLC requires less ECC and can function with simpler (cheaper, less buggy, faster) firmware and controllers. But why won't the manufacturers let you choose? The real answer is clearly planned obsolescence. I have an old SLC USB drive which is only 512MB, but it's nearly 20 years old and some of the very first files I wrote to it are still intact (I last checked several months ago, and don't expect it's changed since then.) It has probably had a few hundred full-drive-writes over the years --- well worn-out by modern QLC/TLC standards, but barely-broken-in for SLC. | | |
| ▲ | ACCount37 17 hours ago | parent | next [-] | | The real answer is: no one actually cares. Very few people have the technical understanding required to make such a choice. And of those, fewer people still would actually pick SLC over QLC. At the same time: a lot of people would, if facing a choice between a $50 1TB SSD and a $40 1TB SSD, pick the latter. So there's a big incentive to optimize on cost, and not a lot of incentive to optimize on anything else. This "SLC only" mode exists in the firmware for the sake of a few very specific customers with very specific needs - the few B2B customers that are actually willing to pay that fee. And they don't get the $50 1TB SSD with a settings bit flipped - they pay a lot more, and with that, they get better QC, a better grade of NAND flash chips, extended thermal envelopes, performance guarantees, etc. Most drives out there just use this "SLC" mode for caches, "hot spot" data and internal needs. | | |
| ▲ | volemo 17 hours ago | parent | next [-] | | Agreed. I have some technical understanding of SLC’s advantages, but why would I choose it over QLC? My file system has checksums on data and metadata, my backup strategy is solid, my SSD is powered most days, and before it dies I’ll probably upgrade my computer for other reasons. | |
| ▲ | Aurornis 11 hours ago | parent | prev [-] | | > Very few people have the technical understanding required to make such a choice. And of those, fewer people still would actually pick SLC over QLC. There was a period of time when you could still buy consumer SLC drives and pay a premium for them. I still have one. Anyone assuming the manufacturers are missing out on a golden market opportunity of hidden SLC drive demand is missing the fact that they already offered these. They know how well (or rather, how poorly) they sell. Even if consumers had full technical knowledge to make decisions, most would pick the TLC and QLC anyway. Some of these comments are talking about optimizing 20 year old drives for being used again two decades later, but ignoring the fact that a 20 year old drive is nearly useless and could be replaced by a superior option for $20 on eBay. The only thing that would change, practically speaking, is that people looking for really old files on drives they haven’t powered up for 20 years wouldn’t be surprised that they were missing. The rest of us will do just fine with our TLC drives and actual backups to backup services or backup mediums. I’ll happily upgrade my SSD every 4-5 years and enjoy the extra capacity over SLC while still coming out money ahead and not losing data. |
| |
| ▲ | Sohcahtoa82 8 hours ago | parent | prev | next [-] | | > But why won't the manufacturers let you choose? The real answer is clearly planned obsolescence. No, it's not. The real answer is that customers (Even B2B) are extremely price sensitive. Look, I know the prevailing view is that lower quality is some evil corporate plan to get you to purchase replacements on a more frequent basis, but the real truth is that consumers are price sensitive, short sighted, and often purchasing without full knowledge. There's a race to the bottom on price, which means quality suffers. You put your typical customer in front of two blenders at the appliance store, one is $20 and the other is $50, most customers will pick the $20 one, even when armed with the knowledge that the $50 version will last longer. When it comes to QLC vs SLC, buyers don't care. They just want the maximum storage for the smallest price. | | |
| ▲ | unethical_ban 7 hours ago | parent [-] | | For your specific example, I would buy the $20 because I would assume the $50 is just as bad. Having built computers casually for some time, I never recall being told by the marketing department or retailer that one kind of SSD was more reliable than another. The only thing that is ever advertised blatantly is speed and capacity. I saw the kind of SSD sometimes, but it was never explained what that meant to a consumer (the same way SMR hard drives were never advertised as having slow reads) If I saw "this kind of SSD is reliable for 10 years and the other one is reliable for 2" then I may have made a decision based on that. |
| |
| ▲ | mort96 8 hours ago | parent | prev | next [-] | | > do you want to have 4x more capacity at 1/100th the reliability? Yes. QLC SSDs are reliable enough for my day-to-day use, but even QLC storage is quite expensive and I wouldn't want to pay 4x (or realistically, way more than 4x) to get 2TB SLC M.2 drives instead of 2TB QLC M.2 drives. | |
| ▲ | big-and-small 18 hours ago | parent | prev | next [-] | | Funny enough, I just managed to find this exact post and comment on Google 5 minutes ago when I started wondering whether it's actually possible to use 1/4 of the capacity in SLC mode. Though what makes me wonder is that some reviews of modern SSDs mention that the pSLC cache is somewhat less than 25% of capacity, like a 400GB pSLC cache for a 2TB SSD: https://www.tomshardware.com/pc-components/ssds/crucial-p310... So you get more like 20% of SLC capacity, at least on some SSDs. |
| ▲ | kvemkon 12 hours ago | parent | prev | next [-] | | The NVMe protocol introduced namespaces. Isn't that the perfect feature for users to decide for themselves how to create 2 virtual SSDs, one TLC and one pseudo-SLC, choosing how much space to sacrifice for pSLC? | |
| ▲ | wmf 7 hours ago | parent [-] | | Most people want to use pSLC as cache or as the whole drive, not as a separate namespace. |
| |
| ▲ | Aurornis 11 hours ago | parent | prev | next [-] | | > Alternatively: do you want to have 4x more capacity at 1/100th the reliability? If the original drive has sufficient reliability, then yes I do want that. And the majority of consumers do, too. Chasing absolute extreme highest powered off durability is not a priority for 99% of people when the drives work properly for typical use cases. I have 5 year old SSDs where the wear data is still in the single digit percentages despite what I consider moderately heavy use. > I have an old SLC USB drive which is only 512MB, but it's nearly 20 years old and some of the very first files I wrote to it are still intact (I last checked several months ago, and don't expect it's changed since then.) It has probably had a few hundred full-drive-writes over the years --- well worn-out by modern QLC/TLC standards, but barely-broken-in for SLC. Barely broken in, but also only 512MB, very slow, and virtually useless by modern standards. The only positive is that the files are still intact on that old drive you dusted off. This is why the market doesn’t care and why manufacturers are shipping TLC and QLC: They aren’t doing a planned obsolescence conspiracy. They know that 20 years from now or even 10 years from now that drive is going to be so outdated that you can get a faster, bigger new one for pocket change. | |
| ▲ | throwaway290 14 hours ago | parent | prev | next [-] | | > I have an old SLC USB drive which is only 512MB, but it's nearly 20 years old and some of the very first files I wrote to it are still intact (I last checked several months ago It's not about age of drive. It's how much time it spent without power. | |
| ▲ | justsomehnguy 15 hours ago | parent | prev [-] | | > If the industry was sane Industry is sane in both the common and capitalist sense. The year 2025 and people still buy 256Tb USB thumbdrives for $30, because nobody cares except for the price. |
| |
| ▲ | big-and-small 18 hours ago | parent | prev | next [-] | | To be honest you can buy a 4TB SSD for $200 now, so I guess the market would be larger if people were aware of how easy it would be to make such SSDs work in SLC mode exclusively. |
| ▲ | anthk 15 hours ago | parent | prev | next [-] | | Myself wants. I remember when the UBIFS module (or some kernel settings) for the Debian kernel was MLC against SLC. You could store 4X more data now, but at the cost of really bad reliability: a SINGLE bad shutdown and your partitions would be corrupted to the point of not being able to properly boot any more, having to reflash the NAND. | |
| ▲ | moffkalast 11 hours ago | parent | next [-] | | Well then buy an industrial SSD, they're something like 80-240 GB and you get power loss protection capacitors too. Just not the datacenter ones, those melt immediately without rack airflow. | |
| ▲ | 11 hours ago | parent | prev [-] | | [deleted] |
| |
| ▲ | 17 hours ago | parent | prev [-] | | [deleted] |
|
|
| ▲ | RachelF a day ago | parent | prev | next [-] |
| Endurance going down is hardly a surprise given that the feature size has gone down too. The same goes for logic and DRAM memory. I suspect that come 2035, hardware from 2010 will still work, while that from 2020 will be less reliable. |
| |
| ▲ | lotrjohn a day ago | parent | next [-] | | Completely anecdotal, and mostly unrelated, but my NES from 1990 is still going strong. Two PS3’s that I have owned simply broke. CRTs from 1994 and 2002 still going strong. LCD tvs from 2012 and 2022 just went kaput for no reason. Old hardware rocks. | | |
| ▲ | userbinator 19 hours ago | parent | next [-] | | LCD tvs from 2012 and 2022 just went kaput for no reason. Most likely bad capacitors. The https://en.wikipedia.org/wiki/Capacitor_plague may have passed, but electrolytic capacitors are still the major life-limiting component in electronics. | | |
| ▲ | londons_explore 18 hours ago | parent | next [-] | | MLCC's look ready to take over nearly all uses of electrolytics. They still degrade with time, but in a very predictable way. That makes it possible to build a version of your design with all capacitors '50 year aged' and check it still works. Sadly no engineering firm I know does this, despite it being very cheap and easy to do. | |
| ▲ | DoesntMatter22 5 hours ago | parent | prev [-] | | Looks like that plague stopped in 2007? I have an 8 year old LCD that died out of nowhere as well, so I'm guessing it wouldn't be affected by this. Could still be a capacitor issue though. |
| |
| ▲ | theragra 7 hours ago | parent | prev | next [-] | | I had an LCD that worked from around 2005 to 2022. It became very yellow closer to 2022 for some reason. It was Samsung PVA, I think it was model 910T. | | | |
| ▲ | Dylan16807 20 hours ago | parent | prev | next [-] | | For what it's worth my LCD monitor from 2010 is doing well. I think the power supply died at one point but I already had a laptop supply to replace it with. |
| ▲ | dfex 19 hours ago | parent | prev [-] | | Specifically old Japanese hardware from the 80s and 90s - this stuff is bulletproof | | |
| ▲ | jacquesm 19 hours ago | parent [-] | | I still have a Marantz amp from the 80's that works like new, it hasn't even been recapped. |
|
| |
| ▲ | bullen 18 hours ago | parent | prev | next [-] | | I concur; in my experience ALL my 24/7 drives from 2009-2013 still work today and ALL my 2014+ ones are dead, started dying after 5 years, last one died 9 years later. Around 10 drives in each group. All older drives are below 100GB (SLC); all newer are above 200GB (MLC). I reverted back to older drives for all my machines in 2021 after scoring 30x unused X25-E on eBay. The only MLC I use today are Samsung's best industrial drives and they work sort of... but no promises. And SanDisk SD cards, which if you buy the cheapest ones last a surprising amount of time. 32GB lasted 11-12 years for me. Now I mostly install 500GB-1TB ones (recently = only been running for 2-3 years) after installing some 200-400GB ones that still work after 7 years. | |
| ▲ | Aurornis 11 hours ago | parent [-] | | > in my experience ALL my 24/7 drives from 2009-2013 still work today and ALL my 2014+ are dead, As a counter anecdote, I have a lot of SSDs from the late 2010s that are still going strong, but I lost some early SSD drives to mysterious and unexpected failures (not near the wear-out level). | | |
| |
| ▲ | Dylan16807 a day ago | parent | prev | next [-] | | As far as I'm aware flash got a bit of a size boost when it went 3D and hasn't shrunk much since then. If you use the same number of bits per cell, I don't know if I would expect 2010 and 2020 or 2025 flash to vary much in endurance. For logic and DRAM the biggest factors are how far they're being pushed with voltage and heat, which is a thing that trends back and forth over the years. So I could see that go either way. | |
| ▲ | robotnikman 8 hours ago | parent | prev | next [-] | | I recently found a 1GB USB drive from around 2006 I used to use. I plugged it in and most of the files were still readable! There were some that were corrupted and unreadable, unfortunately. | |
| ▲ | tensility 9 hours ago | parent | prev [-] | | Oh, it would be nice if it were just feature size. Over the past 15 years, the NAND industry has doubled its logical density three times over with the trick of encoding more than one bit per physical voltage well, making the error bounds on leaking wells tighter and tighter and amplifying the bit rot impact, in number of ECC corrections consumed, per leaked voltage well. |
|
|
| ▲ | hxorr a day ago | parent | prev | next [-] |
| I also seem to remember reading retention is proportional to temperature at time of write. Ie, best case scenario = write data when drive is hot, and store in freezer. Would be happy if someone can confirm or deny this. |
| |
| ▲ | pbmonster 19 hours ago | parent | next [-] | | I know we're talking theoretical optimums here, but: don't put your SSDs in the freezer. Water ingress because of condensation will kill your data much quicker than NAND bit rot at room temperature. | | |
| ▲ | cesaref 13 hours ago | parent | next [-] | | I'm interested in why SSDs would struggle with condensation. What aspect of the design is prone to issues? I routinely repair old computer boards, replace leaky capacitors, that sort of thing, and have cleaned boards with IPA and rinsed in tap water without any issues to anything for many years. | | | |
| ▲ | dachris 19 hours ago | parent | prev | next [-] | | Would an airtight container and liberal addition of desiccants help? | | |
| ▲ | pbmonster 18 hours ago | parent | next [-] | | Sure. Just make sure the drive is warm before you take it out of the container - because this is when the critical condensation happens: you take out a cold drive and expose it to humid room temperature air. Then water condenses on (and in) the cold drive. Re-freezing is also critical: the container should contain no humid air when it goes into the freezer, because the water will condense and freeze as the container cools. A tightly wrapped bag, desiccant and/or purging the container with dry gas would prevent that. | |
| ▲ | mkesper 16 hours ago | parent [-] | | A vacuum sealer would probably help to avoid the humid air, too. | | |
| ▲ | mort96 8 hours ago | parent [-] | | Only if you wait for the drive to heat up before you remove the vacuum seal. |
|
| |
| ▲ | Aurornis 10 hours ago | parent | prev [-] | | Be careful, airtight doesn’t mean it’s not moisture permeable over time. Color changing desiccant is a good idea to monitor it. |
| |
| ▲ | Onavo 19 hours ago | parent | prev [-] | | What about magnetic tape? | | |
| ▲ | pbmonster 19 hours ago | parent [-] | | For long term storage? Sure, everybody does it. In the freezer? Better don't, for the same reason. There are ways to keep water out of frozen/re-frozen items, of course, but if you mess up you have water everywhere. |
|
| |
| ▲ | userbinator a day ago | parent | prev | next [-] | | That's probably this: https://www.sciencedirect.com/science/article/abs/pii/S00262... | |
| ▲ | CarVac a day ago | parent | prev [-] | | I definitely remember seeing exactly this. |
|
|
| ▲ | aidenn0 4 hours ago | parent | prev | next [-] |
| On the other hand, when capacity goes up, the cycle count goes down for the same workload. A 4TB drive after 1K cycles has written the same amount of data as a 100GB drive after 40K cycles. |
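A quick check of that equivalence (treating 1 TB as 1000 GB):

    # GB written in each case: both come to 4,000,000 GB (~4 PB)
    assert 4 * 1000 * 1_000 == 100 * 40_000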
|
| ▲ | karczex 19 hours ago | parent | prev | next [-] |
| That's how it has to work. To increase capacity you have to make smaller cells, where charge may more easily diffuse from one cell to another. Also, to make the drive faster, the stored charge has to be smaller, which also decreases endurance. The SLC vs. QLC comparison is even worse, as QLC is basically a clever hack to store 4 times more data in the same number of physical cells - it's a tradeoff. |
| |
| ▲ | bullen 19 hours ago | parent [-] | | Yes, but that tradeoff comes with a hidden cost: complexity! I'd much rather have 64GB of SLC at 100K WpB than 4TB of MLC at less than 10K WpB. The spread (wear-leveling) functions that move bits around to even out the writes, or the caches, will also fail. The best compromise is of course to use both kinds for different purposes: SLC for a small main OS (which will inevitably have logs and other writes) and MLC for slowly changing large data like a user database or files. The problem is that now you cannot choose, because the factories/machines that make SLC are all gone. | |
| ▲ | userbinator 18 hours ago | parent | next [-] | | The problem is now you cannot choose because the factories/machines that make SLC are all gone. You can still get pure SLC flash in smaller sizes, or use TLC/QLC in SLC mode. I much rather have 64GB of SLC at 100K WpB than 4TB of MLC at less than 10K WpB. It's more like 1TB of SLC vs. 3TB of TLC or 4TB of QLC. All three take the same die area, but the SLC will last a few orders of magnitude longer. | | |
| ▲ | karczex 16 hours ago | parent [-] | | SLC is still produced, but the issue is that there are no SLC products (that I'm aware of) for the consumer market. |
| |
| ▲ | mort96 8 hours ago | parent | prev [-] | | My problem is: I have more than 64GB of data |
|
|
|
| ▲ | awei 9 hours ago | parent | prev | next [-] |
| So AWS S3 Glacier might actually be cold |
|
| ▲ | nutjob2 a day ago | parent | prev [-] |
| > Even a QLC SSD that has only been written to once, and kept in a freezer at -40, may hold data for several decades. So literally put your data in cold storage. |
| |