waffletower 2 days ago

The author decidedly has expert syndrome -- they deny both the history and rational behind memory units nomenclature. Memory measurements evolved utilizing binary organizational patterns used in computing architectures. While a proud French pedant might agree with the decimal normalization of memory units discussed, it aligns more closely to the metric system, and it may have benefits for laypeople, it fails to account for how memory is partitioned in historic and modern computing.

PeterStuer a day ago | parent | next [-]

I do agree up to a point as I still need to double take when I see MiB, but that said, I also do agree keeping SI unit prefixes standardized has great advantages.

So the "sane" options would be either not using SI for digital, or, what was chosen, change the colloquial prefixes in the digital world. The former would have been easier in the short term.

philipswood 2 days ago | parent | prev | next [-]

Yes, tomatoes ARE actually a fruit.

But really!?

I'll keep calling it in nice round powers of two, thank you very much.

simondotau 2 days ago | parent | next [-]

Even more weirdly, pumpkins are berries. But that’s a botanical definition. In the kitchen they (and tomatoes) are classified as vegetables.

KellyCriterion a day ago | parent [-]

Same with cucumbers and a lot more "plants" :-)

assimpleaspossi a day ago | parent | prev [-]

Yes. Tomatoes are a fruit because the science says so. That non-scientific people call it something else does not change facts.

TonyStr a day ago | parent | next [-]

Depends if you're using the botanical definition or the (more common) culinary definition[0].

I would argue fruit and fruit are two words, one created semasiologically and the other created onomasiologically. Had we chosen a different pronunciation for one of those words, there would be no confusion about what fruits are.

[0] - https://en.wikipedia.org/wiki/Fruit#Botanical_vs._culinary

D-Machine 12 hours ago | parent [-]

Yup. Though rather than say "fruit and fruit" are two words, or focusing on "definitions" (which tend to morph over time anyway), I think the more straightforward and typical approach is to just recognize that the same word can have different meanings in different contexts.

This is such a basic and universal part of language, it is a mystery to me why something so transparently clueless as "actually, tomato is a fruit" persists.

account42 a day ago | parent | prev | next [-]

Definitions that don't reflect people's usage are not very useful definitions.

worthless-trash 3 hours ago | parent [-]

Just because someone is wrong doesn't mean we need to reinforce their error.

whobre a day ago | parent | prev | next [-]

Context matters…

deadwanderer a day ago | parent | prev [-]

Knowledge is understanding that tomatoes are a fruit. Wisdom is understanding that they don't belong in a fruit salad.

Or...

Knowledge is understanding that ketchup is tomato jelly. Wisdom is refraining from putting it on your peanut butter and jelly sandwich.

happymellon a day ago | parent [-]

> Knowledge is understanding that ketchup is tomato jelly

How is it a jelly? It lacks any defining feature of jelly.

D-Machine 12 hours ago | parent [-]

I mean, a jelly is just broadly any thickened sweet goop (doesn't even have to be fruit, and is often allowed to have some savoury/umami, e.g. mint jelly or red pepper jelly). Usually a jelly also is relatively clear and translucent, as it is made with puree / concentrate strained to remove large fibers, but this isn't really a strict requirement, and the amount of straining / translucency is generally just a matter of degree. There are opaque jellies out there, and jellies with bits and pieces.

Ketchup has essentially all the key defining features of a jelly, technically, just is more fibrous / opaque and savoury than most typical jellies.

But, of course, calling a ketchup "jelly", due to such technical arguments, is exactly as dumb as saying "ayktually, tomato is a fruit": both are utterly clueless to how these words are actually used in culinary contexts.

ozozozd 2 days ago | parent | prev | next [-]

It’s not them denying it, it’s the LLM that generated this slop.

All they had to say was that the KiB et al. were introduced in 1998, and the adoption has been slow.

And not “but a kilobyte can be 1000,” as if it’s an effort issue.

kevin_thibedeau 2 days ago | parent | next [-]

They are managed by different standards organizations. One doesn't like the other encroaching on its turf. "kilo" has only one official meaning as a base-10 scalar.

dietr1ch 2 days ago | parent | next [-]

I don't think of base 10 being meaningful in binary computers. Indexing 1k needs 10 bits regardless if you wanted 1000 or 1024, and the base 10 leaves some awkward holes.

In my mind base 10 only became relevant when disk drive manufacturers came up with disks with "weird" disk sizes (maybe they needed to reserve some space for internals, or it's just that the disk platters didn't like powers of two) and realised that a base 10 system gave them better looking marketing numbers. Who wants a 2.9TB drive when you can get a 3TB* drive for the same price?
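The point about indexing can be checked with a minimal Python sketch. The helper name `bits_needed` is made up for illustration; it only counts the address bits required for a given number of slots and shows the "holes" left when the count is not a power of two.

```python
def bits_needed(n_items: int) -> int:
    """Smallest number of address bits that can index n_items distinct slots."""
    if n_items < 1:
        raise ValueError("n_items must be positive")
    # The largest index is n_items - 1; its bit length is the width required.
    return (n_items - 1).bit_length()


for n in (1000, 1024):
    bits = bits_needed(n)
    unused = 2 ** bits - n  # addresses that exist but point at nothing
    print(f"{n} items -> {bits} bits, {unused} unused addresses")
```

Both sizes need 10 bits; the decimal size simply leaves 24 of the 1024 possible addresses unused.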

userbinator 2 days ago | parent | next [-]

At the TB level, the difference is closer to 10%.

Three binary terabytes, i.e. 3 * 2^40, is 3298534883328 bytes, or 298534883328 more bytes than 3 decimal terabytes. That difference is 298.5 decimal gigabytes, or 278 binary gigabytes.

Indeed, early hard drives had slightly more than even the binary size --- the famous 10MB IBM disk, for example, had 10653696 bytes, which was 167936 bytes more than 10MB --- more than an entire 160KB floppy's worth of data.
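The figures in this comment can be reproduced with plain integer arithmetic. This is just a sketch; the constant names are chosen here for readability and do not come from any library.

```python
TB, TIB = 10 ** 12, 2 ** 40   # decimal vs binary terabyte
GB, GIB = 10 ** 9, 2 ** 30    # decimal vs binary gigabyte
MIB, KIB = 2 ** 20, 2 ** 10

# Gap between 3 binary and 3 decimal terabytes.
diff = 3 * TIB - 3 * TB
print(3 * TIB)                          # 3298534883328
print(diff)                             # 298534883328
print(round(diff / GB, 1))              # 298.5
print(diff // GIB)                      # 278
print(round((TIB / TB - 1) * 100, 2))   # 9.95 (percent)

# The 10MB disk from the comment: surplus over 10 binary megabytes,
# compared with a 160KB floppy.
disk_bytes = 10653696
surplus = disk_bytes - 10 * MIB
print(surplus, 160 * KIB)               # 167936 163840
```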

mananaysiempre 2 days ago | parent | prev | next [-]

Buy an SSD, and you can get both at the same time!

That is to say, all the (high-end/“gamer”) consumer SSDs that I’ve checked use 10% overprovisioning and achieve that by exposing a given number of binary TB of physical flash (e.g. a “2TB” SSD will have 2×1024⁴ bytes’ worth of flash chips) as the same number of decimal TB of logical addresses (e.g. that same SSD will appear to the OS as 2×1000⁴ bytes of storage space). And this makes sense: you want a round number on your sticker to make the marketing people happy, you aren’t going to make non-binary-sized chips, and 10% overprovisioning is OK-ish (in reality, probably too low, but consumers don’t shop based on the endurance metrics even if they should).
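As a rough illustration of where the ~10% comes from, here is a small Python sketch. It assumes overprovisioning is measured as spare flash relative to the logical (user-visible) capacity; vendors may define it differently, and the function name is hypothetical.

```python
def overprovisioning(physical_bytes: int, logical_bytes: int) -> float:
    """Spare capacity as a fraction of the logical (user-visible) capacity."""
    return (physical_bytes - logical_bytes) / logical_bytes


# The "2TB" example from the comment: binary TB of flash, decimal TB exposed.
physical = 2 * 1024 ** 4
logical = 2 * 1000 ** 4

print(physical - logical)                                  # 199023255552 spare bytes
print(f"{overprovisioning(physical, logical) * 100:.2f}%") # 9.95%
```

The ratio is the same for any capacity expressed in terabytes, because it only depends on 1024⁴ / 1000⁴.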

jdsully a day ago | parent | next [-]

"consumers don’t shop based on the endurance metrics even if they should"

It's been well over a decade now and neither I nor anyone I know has ever had an SSD endurance issue. So it seems like the type of problem where you should just go enterprise if you have it.

userbinator 2 days ago | parent | prev [-]

> you aren’t going to make non-binary-sized chips

TLC flash actually has a total number of bits that's a multiple of 3, but it and QLC are so unreliable that there's a significant amount of extra bits used for error correction and such.

SSDs haven't been real binary sizes since the early days of SLC flash which didn't need more than basic ECC. (I have an old 16MB USB drive, which actually has a user-accessible capacity of 16,777,216 bytes. The NAND flash itself actually stores 17,301,504 bytes.)

fc417fc802 2 days ago | parent | prev | next [-]

> I don't think of base 10 being meaningful in binary computers.

They communicate via the network, right? And telephony has always been in base 10 bits as opposed to base two eight bit bytes IIUC. So these two schemes have always been in tension.

So at some point the Ki, Mi, etc prefixes were introduced along with b vs B suffixes and that solved the issue 3+ decades ago so why is this on the HN front page?!

A better question might be, why do we privilege the 8 bit byte? Shouldn't KiB officially have a subscript 8 on the end?

purplehat_ 2 days ago | parent [-]

To be fair, the octet as the byte has been dominant for decades. POSIX even has the definition “A byte is composed of a contiguous sequence of 8 bits.” I would wager many software engineers don't even know that non-octet bytes were a thing, given that college CS curricula typically just teach a byte is 8 bits.

I found some search results about Texas Instruments' digital signal processors using 16-bit bytes, and came across this blogpost from 2017 talking about implementing 16-bit bytes in LLVM: https://embecosm.com/2017/04/18/non-8-bit-char-support-in-cl.... Not sure if they actually implemented it, but that was surprising to me that non octet bytes still exist, albeit in a very limited manner.

Do you know of any other uses for bytes that are not 8 bits?

zinekeller a day ago | parent | next [-]

> Do you know of any other uses for bytes that are not 8 bits?

For "bytes" as the term-of-art itself? Probably not. For "codes" or "words"? 5 bits are the standard in Baudot transmission (in teletype though). 6- and 7-bit words were the standards of the day for very old computers (ASCII is in itself a 7-bit code), especially on DEC-produced ones (https://rabbit.eng.miami.edu/info/decchars.html).

ahazred8ta a day ago | parent | prev | next [-]

Back in the days of Octal notation, there were computers with a 12 bit word size that used sixbit characters (early DEC PDP-8, PDP-5, early CDC machines). 'Byte' was sometimes used for 6- and 9-bit halfword values.

fc417fc802 a day ago | parent | prev [-]

I wanted to reply with a bunch of DSP examples but on further investigation the ones I checked just now seem to very deliberately use the term "data word". That said, the C char type in these cases is one "data word" as opposed to 8 bits; I feel like that ought to count as a non-8-bit byte regardless of the terminology in the docs.

NXP makes a number of audio DSPs with a native 24 bit width.

Microchip still ships chips in the PIC family with instructions of various widths, including 12 and 14 bit; however, I believe the data memory on those chips is either 8 or 16 bit. I have no idea how to classify a machine where the instruction and data memory widths don't match.

Unlike POSIX, C merely requires that char be at least 8 bits wide. Although I assume lots of real world code would break if challenged on that particular detail.

thfuran 2 days ago | parent | prev | next [-]

>I don't think of base 10 being meaningful in binary computers.

Okay, but what do you mean by “10”?

dietr1ch 2 days ago | parent | prev [-]

10, not to be confused with 10 or even the weird cousin, 10

jibal a day ago | parent | prev [-]

> I don't think of base 10 being meaningful in binary computers.

First, you implicitly assumed a decimal number base in your comment.

Second: Of course it's meaningful. It's also relevant since humans use binary computers and numeric input and output in text is almost always in decimal.

NetMageSCW a day ago | parent | prev [-]

Who was appointed as arbiter of meaning for kilo? And by what right?

yencabulator 16 hours ago | parent [-]

The International Bureau of Weights and Measures, by an agreement between 64 countries that has been in effect for 6 and a half decades by now, and currently officially used by countries representing approximately 95% of the world's population. The work itself started in 1875 with an agreement between 17 countries.

A little late to lawyer that...

jibal a day ago | parent | prev [-]

There's no evidence of an LLM being involved.

Syzygies 2 days ago | parent | prev | next [-]

He probably uses Phillips head screws.

highhedgehog a day ago | parent [-]

wait, what is wrong with that?

tim333 a day ago | parent | next [-]

Dunno but there are two similar but slightly different cross head screw designs https://www.pbswisstools.com/en/news/detail/phillips-and-poz...

fuzzfactor a day ago | parent [-]

Patents at work.

Before the patent on Phillips screws & tools expired, Pozidriv was launched which was different enough to be capable of a bit more torque.

Phillips was for mass-production, Pozidriv for mass-production with a little more torque.

Lots of people who wanted that still waited until the Pozidriv patent expired before considering it.

The screws themselves are marked on the head with little ticks so you can tell the difference, but not necessarily the screwdrivers :\

It's good to have the right tool for the job; HP instruments used Pozidriv in a number of places.

duncangh a day ago | parent [-]

the type of screw head really doesn't make a difference when I'm hammering them into the wall, I've found xD

foobarbecue 2 days ago | parent | prev | next [-]

*rationale

waffletower a day ago | parent [-]

Thanks, noticed after edit disappeared

crazygringo 2 days ago | parent | prev | next [-]

What are you talking about? The article literally fully explains the rationale, as well as the history. It's not "denying" anything. Seems entirely reasonable and balanced to me.

waffletower 2 days ago | parent | next [-]

They are definitely denying the importance of 2-fold partitioning in computing architectures. VM_PAGE_SIZE is not defined with the value of '10000' for good reason (in many operating systems it is set to '16384').
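To make the "2-fold partitioning" argument concrete, here is a minimal Python sketch of why a power-of-two page size is convenient: an address splits into page number and offset with a shift and a mask. The value 16384 is the one quoted above; `PAGE_SHIFT`, `PAGE_MASK`, and the helper names are illustrative, not taken from any particular kernel.

```python
PAGE_SIZE = 16384                        # 2**14, the value quoted in the comment
PAGE_SHIFT = PAGE_SIZE.bit_length() - 1  # 14
PAGE_MASK = PAGE_SIZE - 1                # low 14 bits set


def is_power_of_two(n: int) -> bool:
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def split_address(addr: int) -> tuple[int, int]:
    """Return (page number, offset within page) using only shift and mask."""
    return addr >> PAGE_SHIFT, addr & PAGE_MASK


print(is_power_of_two(PAGE_SIZE), is_power_of_two(10000))  # True False
print(split_address(100000))                               # (6, 1696)
print(divmod(100000, PAGE_SIZE))                           # (6, 1696), same result
```

With a page size of 10000 the shift-and-mask shortcut is unavailable, and the split would need a real division.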

senfiaj 2 days ago | parent | next [-]

That's why I said "usually acceptable depending on the context". In spoken language I also don't like the awkward and unusual pronunciation of "kibi". But I'll still prefer to write in KiB, especially if I document something.

Also, if you open major Linux distro task managers, you'll be surprised to see that they often show decimal units when "i" is missing from the prefix. Many utilities avoid the confusing prefixes "KB", "MB"... and use "KiB", "MiB"...
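For anyone documenting sizes, a small formatter that keeps the two conventions apart might look like the sketch below. `format_bytes` is a hypothetical helper, not the API of any existing tool; it uses "kB/MB/..." strictly for powers of 1000 and "KiB/MiB/..." strictly for powers of 1024.

```python
def format_bytes(n: int, binary: bool = True) -> str:
    """Format a byte count with IEC (binary) or SI (decimal) prefixes."""
    base = 1024 if binary else 1000
    prefixes = ["", "Ki", "Mi", "Gi", "Ti"] if binary else ["", "k", "M", "G", "T"]
    value = float(n)
    idx = 0
    # Step up one prefix at a time until the value fits or prefixes run out.
    while value >= base and idx < len(prefixes) - 1:
        value /= base
        idx += 1
    return f"{value:.2f} {prefixes[idx]}B"


print(format_bytes(16384))                  # 16.00 KiB
print(format_bytes(16384, binary=False))    # 16.38 kB
print(format_bytes(10 ** 9))                # 953.67 MiB
print(format_bytes(10 ** 9, binary=False))  # 1.00 GB
```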

crazygringo 2 days ago | parent | prev [-]

No they're not? They very specifically address it.

Why do you keep insisting the author is denying something when the author clearly acknowledges every single thing you're complaining about?

waffletower a day ago | parent [-]

Denying the importance of...

crazygringo a day ago | parent [-]

Which they're not...

waffletower a day ago | parent [-]

by coming to the conclusion they did, they are

crazygringo 4 hours ago | parent [-]

So not denying. You just disagree is all.

So please don't mischaracterize articles in the future simply because you disagree with their conclusions. That's misrepresentation, and essentially straight-up lying.

nixpulvis 2 days ago | parent | prev [-]

Yea I don't understand the issue here. SI is pretty clear, and this post explains the other standard a little bit.

It's really not all that crazy of a situation. What bothers me is when some applications call KiB KB, because they are old or lazy.

reaperducer a day ago | parent | next [-]

> because they are old

I keep using "K" for kilobyte because it makes the children angry since they lack the ability to judge meaning from context.

nixpulvis a day ago | parent [-]

You sly dog.

ZoomZoomZoom 2 days ago | parent | prev [-]

...old lazy and wrong! Capital K is for Kelvin.

schiffern 2 days ago | parent [-]

>Capital K is for Kelvin.

It should be "kelvin" here. ;)

Unit names are always lower-case[1] (watt, joule, newton, pascal, hertz), except at the start of a sentence. When referring to the scientists the names are capitalized of course, and the unit symbols are also capitalized (W, J, N, Pa, Hz).

[1] SI Brochure, Section 5.3 "Unit Names" https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-...

fc417fc802 2 days ago | parent | next [-]

Thus there's no ambiguity. kB is power of 10 and KB is clearly not kelvin bytes therefore it's power of two. Doesn't quite fit the SI worldview but I don't see that as a problem.

schiffern a day ago | parent | next [-]

I often see it with "kB" too, so the proposed (ugly) hack doesn't really solve the problem.

I think the author had it just right. There's a lot of inertia, but the traditional way can cause confusion.

xigoi a day ago | parent | prev [-]

This only works with kilobytes, not megabytes and gigabytes.

ZoomZoomZoom a day ago | parent | prev [-]

I was pretty sure I'd be corrected in some manner, being two of the aforementioned three. Thanks.

jibal a day ago | parent | prev [-]

None of your criticisms--which start with an absurd and meaningless ad hominem--apply to the actual content of the article.

Elsewhere you write

> They are definitely denying the importance of 2-fold partitioning in computing architectures.

No, they definitely aren't. There are no words in the article that deny anything at all.