nayuki 20 hours ago

This is one of those cases where marketing is correct.

Oh tell me, if your CPU processes 1 byte in 1 cycle, and it runs at 800 MHz, how many bytes does it process in 1 second?

The answer is 800 million bytes, or 800 (real) megabytes. It cannot be 800 mebibytes. (Equal to 763 mebibytes.)

Similarly, let's say we have a 1-bit Boolean attribute for each person in the world, and the world population is 8 062 000 000 people. How many bits do we need in our database? It's 8.062 gigabits, not 8.062 gibibits. (Equal to 7.508 gibibits.)

The telecom industry has always used power-of-1000 prefixes on bits and bits per second. You have a gigabit Ethernet LAN, and assume no protocol overhead. How long does it take to transmit a 4.7 GB (real gigabytes) DVD image? Multiply by 8 to convert from bytes to bits, so that's 37.6 Gb, so that will take 37.6 seconds to transmit. But how long does it take to transmit a "700 MB" (actually MiB) CD image? Well, it's 734 MB (real megabytes), so 5872 Mb, which is 5.872 seconds.
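The figures above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and it assumes, as above, 1 byte per cycle and a link with no protocol overhead.

```python
# Prefix multipliers (names are illustrative assumptions, not from the thread).
MEGA, GIGA = 1000**2, 1000**3   # SI prefixes: powers of 1000
MEBI, GIBI = 1024**2, 1024**3   # IEC prefixes: powers of 1024

# An 800 MHz CPU processing 1 byte per cycle.
cpu_bytes_per_second = 800_000_000
cpu_mb = cpu_bytes_per_second / MEGA    # 800.0 megabytes
cpu_mib = cpu_bytes_per_second / MEBI   # 762.939453125 mebibytes

# One bit per person in the world.
population_bits = 8_062_000_000
population_gigabits = population_bits / GIGA
population_gibibits = population_bits / GIBI


def transfer_seconds(size_bytes, link_bits_per_second=GIGA):
    """Time to send size_bytes over a link, assuming zero protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


dvd_seconds = transfer_seconds(4_700_000_000)   # 4.7 GB DVD image
cd_bytes = 700 * MEBI                           # "700 MB" CD image, actually MiB
cd_seconds = transfer_seconds(cd_bytes)

print(cpu_mb, cpu_mib)
print(population_gigabits, round(population_gibibits, 3))
print(dvd_seconds, cd_bytes, cd_seconds)
```

With power-of-1000 prefixes every conversion is a shift of the decimal point; the factor of 1.048576 only shows up where a MiB figure has to be turned back into bytes.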

The problem with the abusively overloaded definition that 1 kilobyte = 1024 bytes, 1 megabyte = 1048576 bytes, etc. is that it fails to align with the rest of the metric system, or even how we group decimal numbers into thousands and millions. The computer industry is wrong here.

And now you have the problem that you can't fit a memory dump of "16 GB" of RAM onto a "16 GB" flash memory card, because the former is actually GiB but the latter is real GB.
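A quick sketch of that mismatch in Python. It assumes the card exposes exactly 16 * 1000^3 usable bytes and ignores filesystem overhead; the variable names are illustrative.

```python
GIGA = 1000**3   # SI gigabyte, as used on storage labels
GIBI = 1024**3   # gibibyte, as used for RAM sizes

ram_dump_bytes = 16 * GIBI   # "16 GB" of RAM
card_bytes = 16 * GIGA       # "16 GB" flash card (assumed fully usable)

fits = ram_dump_bytes <= card_bytes
shortfall_bytes = ram_dump_bytes - card_bytes

print(fits)              # False
print(shortfall_bytes)   # 1179869184
```

So the dump overshoots the card by roughly 1.18 GB.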

varjag 19 hours ago | parent [-]

> Oh tell me, if your CPU processes 1 byte in 1 cycle, and it runs at 800 MHz, how many bytes does it process in 1 second?

An interesting metric, used by no one in the Universe except you for the sake of this discussion. But let's entertain this: if the actual CPU speed is 838,860,800Hz, how many bytes does it process in 1 second?

> Similarly, let's say we have a 1-bit Boolean attribute for each person in the world

I have no problem with definition of bits.

> And now you have the problem that you can't fit a memory dump of "16 GB" of RAM onto a "16 GB" flash memory card, because the former is actually GiB but the latter is real GB.

Remarkable circular reasoning, since the Marketing Kilobyte was defined in the 1990s precisely to inflate actual storage sizes without getting class action suits.

> it fails to align with the rest of the metric system

Look, the byte is not derived from fundamental units. It is thus not a part of the metric system, so SI has zero business regulating information storage. On the other hand, you can't buy a computer that addresses memory in anything other than powers of two. Nor could you ever buy a 1000-million-byte RAM chip, because they don't exist, for the basic reason that binary computers use a 2^n addressable space.

nayuki 18 hours ago | parent [-]

> An interesting metric, used by no one in the Universe except you for the sake of this discussion.

The speed of cryptographic functions such as ciphers and hashes is quoted in cycles per byte. This is because in pure numeric code, without worrying about memory transfer speed, the speed of the crypto algorithm is directly proportional to the CPU clock speed. https://en.wikipedia.org/wiki/Encryption_software#Performanc... . Random example: https://bench.cr.yp.to/results-hash/amd64-hertz.html

> if the actual CPU speed is 838,860,800Hz, how many bytes does it process in 1 second?

If the CPU is 800 MiHz (never heard of that term, lol), then it processes 800 MiB in 1 second. Stated differently, 839 MHz --> 839 MB.
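Spelled out as a small Python sketch (still assuming 1 byte per cycle; names are illustrative):

```python
MEGA = 1000**2
MEBI = 1024**2

clock_hz = 838_860_800           # the clock speed from the quoted question
bytes_per_second = clock_hz * 1  # assumption: 1 byte processed per cycle

in_mebibytes = bytes_per_second / MEBI   # 800.0
in_megabytes = bytes_per_second / MEGA   # about 838.86, i.e. roughly 839

print(clock_hz == 800 * MEBI)   # True
print(in_mebibytes, in_megabytes)
```

Whichever prefix family the clock speed is quoted in, the byte count comes out in the same family with the same number in front.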

> Remarkable circular reasoning

No, I'm pointing out that the industry has already splintered into two. Your "16 GB" of RAM is a different measure than "16 GB" of HDD or SSD.

> It is thus not a part of the metric system, so SI has zero business regulating information storage.

If a byte is not derived from fundamental SI units, then it should not take on SI prefixes.

Otherwise, if it takes on prefixes, it should respect the SI definition and not abusively have its own contradictory definition.

> On the other hand, you can't buy a computer that addresses memory in anything other than powers of two.

So what? I can use that same logic to argue that all RAM sizes should be quoted in base-2, so I'm buying 1_0000_0000_0000_0000 (base-2) bytes of RAM, right? Clearly base-10 notation is a poor fit, so why not go all the way to base-2?
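For what it's worth, that base-2 string is easy to produce and parse in Python. A small sketch using only built-in formatting (the variable names are illustrative):

```python
ram_bytes = int("1_0000_0000_0000_0000", 2)   # parse the base-2 size above
as_binary = format(ram_bytes, "_b")           # binary digits, grouped in fours
as_decimal = format(ram_bytes, ",")           # the same number in base 10

print(ram_bytes)    # 65536
print(as_binary)    # 1_0000_0000_0000_0000
print(as_decimal)   # 65,536
```

The number is perfectly round in base 2 and not round at all in base 10, which is the mismatch being pointed at.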

> Nor could you ever buy a 1000-million-byte RAM chip

It is certainly feasible. You can just cut a bunch of rows at the end. I know how binary decoder gates work.

Also, if you have a computer and put in a 4 GiB stick of RAM and a 2 GiB stick, then you have 6 GiB of addressable memory, which is clearly not a power of 2.
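A minimal Python check of that claim, using the standard bit trick for testing powers of two (the helper name is an assumption for illustration):

```python
GIBI = 1024**3


def is_power_of_two(n: int) -> bool:
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


sticks = [4 * GIBI, 2 * GIBI]   # one 4 GiB stick plus one 2 GiB stick
total_bytes = sum(sticks)

print([is_power_of_two(s) for s in sticks])   # [True, True]
print(total_bytes // GIBI)                    # 6
print(is_power_of_two(total_bytes))           # False
```

Each stick is a power of two on its own, but the installed total is not.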

varjag 17 hours ago | parent [-]

> The speed of cryptographic functions such as ciphers and hashes is quoted in cycles per byte.

That's called throughput, and the denomination for it absolutely doesn't matter. You can measure it in megabytes as well as in MarketingMegabytes.

> If the CPU is 800 MiHz (never heard of that term, lol), then it processes 800 MiB in 1 second.

Plot twist, your 800MHz CPU oscillator would never run at 800,000,000Hz sharp for any substantial stretch of time. And clock specs are typically rounded numbers. That's why this whole example is ridiculous.

> No, I'm pointing out that the industry has already splintered into two.

No shit it did. My point is that it did it for no advantage at all. You could measure storage megabytes in the same normal, sane megabytes as before; you just couldn't lie about it to the customers.

> If a byte is not derived from fundamental SI units, then it should not take on SI prefixes.

Kilo is a Greek prefix, not an SI prefix. You can split hairs that it should mean a sharp thousand, but that does not exist in terms of computer architecture. Kibi, however, is completely made-up shit used by no one else, and it sounds like a wannabe cartoon character.

> It is certainly feasible.

It is not feasible, that's why they aren't ever gonna be made.

> Also, if you have a computer and put in a 4 GiB stick of RAM and a 2 GiB stick, then you have 6 GiB of addressable memory, which is clearly not a power of 2.

It is not a power of 10 either, you should really think this through.