schoen 7 days ago

For anyone here who's never pondered it ("today's lucky 10,000"?), there's a lot of intentional structure in the organization of ASCII that comes through readily in binary or hex.

https://altcodeunicode.com/ascii-american-standard-code-for-...

The first nibble (hex digit) shows your position within the chart, approximately like 2 = punctuation, 3 = digits, 4 = uppercase letters, 6 = lowercase letters. (Yes, there's more structure than that considering it in binary.)

For digits (first nibble 3), the value of the digit is equal to the value of the second nibble.
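
A minimal sketch of the digit rule in Python, for anyone who wants to poke at it. The function name `digit_value` is made up for illustration:

```python
def digit_value(ch: str) -> int:
    """Return the numeric value of an ASCII digit by keeping only the low nibble."""
    code = ord(ch)
    # The first nibble must be 3 and the second nibble must be 0..9.
    if code >> 4 != 0x3 or (code & 0x0F) > 9:
        raise ValueError(f"{ch!r} is not an ASCII digit")
    return code & 0x0F


print([digit_value(c) for c in "0479"])  # [0, 4, 7, 9]
```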

For punctuation (first nibble 2), the punctuation is the character you'd get on a traditional U.S. keyboard layout pressing shift and the digit of the second nibble.
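
As a rough sketch of that shift relationship: on the traditional layout, shifting a digit key amounts to flipping the 0x10 bit, which moves the code from row 3 to row 2. The helper name `shifted_digit` is hypothetical, and this only covers digits 1 through 9:

```python
def shifted_digit(digit: int) -> str:
    """Return the character that shift+digit produces on the traditional layout."""
    if not 1 <= digit <= 9:
        raise ValueError("only digits 1..9 are covered by this sketch")
    # '1' is 0x31; flipping bit 0x10 gives 0x21, which is '!'.
    return chr(ord(str(digit)) ^ 0x10)


print("".join(shifted_digit(d) for d in range(1, 10)))  # !"#$%&'()
```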

For uppercase letters (first nibble 4, then overflowing into first nibble 5), the second nibble is the ordinal position of the letter within the alphabet. So 41 = A (letter #1), 42 = B (letter #2), 43 = C (letter #3).

Lowercase letters do the same thing starting at 6, so 61 = a (letter #1), 62 = b (letter #2), 63 = c (letter #3), etc.

The tricky ones are the overflow/wraparound into first nibble 5 (the letters from letter #16, P) and into first nibble 7 (from letter #16, p). There you have to add 16 to the second nibble to get the letter position (so 5A = Z is letter 10 + 16 = 26), or think of it as "letter #0x10, letter #0x11, letter #0x12...", which may be less intuitive for some people.
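
One way to sidestep the overflow bookkeeping is to keep the low five bits rather than just the second nibble, which gives the letter position directly for both cases. A small sketch (the name `letter_position` is invented for this example):

```python
def letter_position(ch: str) -> int:
    """Return 1..26 for an ASCII letter, using the low five bits of its code."""
    code = ord(ch)
    position = code & 0x1F  # low five bits: 1 for A/a up to 26 for Z/z
    # Letters live in rows 4 through 7; position 0 and 27..31 are punctuation.
    if not 0x40 <= code < 0x80 or not 1 <= position <= 26:
        raise ValueError(f"{ch!r} is not an ASCII letter")
    return position


print(letter_position("A"), letter_position("P"), letter_position("z"))  # 1 16 26
```

The "add 16" step for rows 5 and 7 is exactly the fifth bit (0x10) being set.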

Again, there's even more structure and pattern than that in ASCII, and it's all fully intentional, largely to facilitate meaningful bit manipulations. E.g. converting uppercase to lowercase is just a matter of adding 32, or bitwise OR with 0b00100000 (0x20). Converting lowercase to uppercase is just a matter of subtracting 32, or bitwise AND with 0b11011111 (0xDF).
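
A hedged sketch of the case trick in Python. The only difference between an uppercase letter and its lowercase twin is the bit with value 32 (0x20); the names `CASE_BIT`, `to_lower` and `to_upper` are made up here, and the range checks are there because the trick only makes sense for A-Z and a-z:

```python
CASE_BIT = 0b0010_0000  # decimal 32, hex 0x20


def to_lower(ch: str) -> str:
    """Set the case bit on an ASCII uppercase letter; leave anything else alone."""
    return chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch


def to_upper(ch: str) -> str:
    """Clear the case bit on an ASCII lowercase letter; leave anything else alone."""
    return chr(ord(ch) & 0b1101_1111) if "a" <= ch <= "z" else ch


print("".join(to_lower(c) for c in "Hello, ASCII!"))  # hello, ascii!
print("".join(to_upper(c) for c in "Hello, ASCII!"))  # HELLO, ASCII!
```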

For reading hex dumps of ASCII, it's also helpful to know that the very first printable character (0x20) is, ironically, blank -- it's the space character.

schoen 7 days ago | parent | next [-]

I should just have put the printable character chart right here in the post for people to compare:

     0 1 2 3 4 5 6 7 8 9 A B C D E F
  ..
  2    ! " # $ % & ' ( ) * + , - . / 
  3  0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
  4  @ A B C D E F G H I J K L M N O 
  5  P Q R S T U V W X Y Z [ \ ] ^ _ 
  6  ` a b c d e f g h i j k l m n o 
  7  p q r s t u v w x y z { | } ~
I don't have a mnemonic for punctuation characters with second nibble >9, or for the backtick. The @ can be remembered via Ctrl+@ which is a way of typing the NUL character, ASCII 00 (also not coincidental; compare to Ctrl+A, Ctrl+B, Ctrl+C... for inputting ASCII 01, 02, 03...).
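
The Ctrl relationship can be sketched the same way: take a character from row 4 or 5 and clear the 0x40 bit. This is only an illustration of the ASCII arithmetic (the function name `ctrl` is hypothetical), not a description of how any particular terminal implements it:

```python
def ctrl(ch: str) -> int:
    """Return the control code conventionally produced by Ctrl+<ch>."""
    code = ord(ch.upper())  # Ctrl+a and Ctrl+A give the same code
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no control code for {ch!r} in this sketch")
    return code ^ 0x40  # clear the 0x40 bit: '@' -> 0x00, 'A' -> 0x01, ...


print([ctrl(c) for c in "@ABC"])  # [0, 1, 2, 3]
```

By the same arithmetic, Ctrl+[ comes out as 0x1B, the escape character.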
dhosek 7 days ago | parent [-]

Hex 21 through 29 were the shift characters on the numbers on the old Apple ][ keyboard.

anitil 7 days ago | parent | prev [-]

> the character you'd get on a traditional U.S. keyboard layout

I use a different layout so I'd never realised there was method to the madness! I get the following

  $ echo -n ' !@#$%^&*(' | xxd -p
  2021402324255e262a28
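
For comparison, a small Python sketch that takes the same string (a space, then shift-1 through shift-9 on a modern US layout) and reports which positions do not land on the 0x2N code the traditional layout would give. The function name `mismatches` is made up for this example:

```python
def mismatches(keys: str) -> list[int]:
    """Return the digit positions whose shifted character is not 0x20 | digit."""
    return [digit for digit, ch in enumerate(keys) if ord(ch) != (0x20 | digit)]


modern = " !@#$%^&*("
print(modern.encode("ascii").hex())  # 2021402324255e262a28
print(mismatches(modern))            # [2, 6, 7, 8, 9]
```

So only shift-2 and shift-6 through shift-9 break the pattern on the modern layout.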

dhosek 7 days ago | parent | next [-]

It’s more the old TTY layout, which differs somewhat from the modified typewriter layout that’s become standard for computer keyboards. On the old Apple ][ keyboard, shift-1 through shift-9 corresponded to the adjacent row in ASCII and shift-0 was @; I think other characters were ±16 based on shift. Early ASCII implementations were often slightly inconsistent, but codings were often based on keyboard layouts.

userbinator 7 days ago | parent [-]

The order of the punctuation descends from the very first typewriters, in the late 19th century:

https://en.wikipedia.org/wiki/File:Remington_2_typewriter_ke...

schoen 7 days ago | parent | prev [-]

The @ for shift-2 replaced the earlier " which you would see on many 1980s-era PCs.

I forget the story about what changed for shift-6 through shift-9.

When I say "traditional U.S. keyboard layout" I mean to contrast this with the modern one, which is the same as what you and I have.