rmunn 18 hours ago
The other tricky part is emojis made up of multiple codepoints: sequences held together with zero-width joiner characters and variation selectors, or built out of other symbols. E.g. the US flag emoji is made up of U+1F1FA REGIONAL INDICATOR SYMBOL LETTER U followed by U+1F1F8 REGIONAL INDICATOR SYMBOL LETTER S. Or take the heart-on-fire emoji (which should render as a single symbol, a burning heart): it is made up of the four-codepoint sequence U+2764 HEAVY BLACK HEART, U+FE0F VARIATION SELECTOR-16, U+200D ZERO WIDTH JOINER, and U+1F525 FIRE, but should only render in one double-width block. Then there are even more complicated sequences like the couple-with-heart emoji, which again should render in a single block but is made up of six(!) codepoints: U+1F469 WOMAN, U+200D ZERO WIDTH JOINER, U+2764 HEAVY BLACK HEART, U+FE0F VARIATION SELECTOR-16, U+200D ZERO WIDTH JOINER, and U+1F468 MAN.

The number of codepoints never did correspond exactly to the number of fixed-width blocks a character should take up. U+00E9 é is the same as U+0065 e plus U+0301 COMBINING ACUTE ACCENT, so it should be rendered in a single block, but it might be one or two codepoints depending on whether the text was composed or decomposed before reaching the rendering engine. But with emojis in play, the number of possibilities jumps dramatically, and it's no longer sufficient to just count base characters and ignore diacritics: you have to actually compute the renderings of all those valid emoji combinations (or pre-calculate them in a good lookup table, which IIRC is what Ghostty does).

P.S. Hacker News stripped the emojis out of this comment; fair enough. They were, in order:

- a US flag emoji (made up of two codepoints)
- a heart-on-fire symbol (two distinct symbols combined into a single image, made up of four codepoints total)
- a woman and a man with a heart between them (three distinct symbols combined into a single image, made up of six codepoints total)
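For anyone who wants to check the codepoint counts themselves, here is a minimal Python sketch using only the standard `unicodedata` module. The emojis are written as escapes so they survive the comment filter. The helper `naive_cell_count` is a hypothetical name for the "count base characters and ignore diacritics" approach described above; it is not how any particular terminal does it, just an illustration of why that approach breaks down.

```python
import unicodedata

# The three sequences from the comment, written as escapes.
US_FLAG = "\U0001F1FA\U0001F1F8"                            # regional indicators U + S
HEART_ON_FIRE = "\u2764\uFE0F\u200D\U0001F525"              # heart, VS-16, ZWJ, fire
COUPLE = "\U0001F469\u200D\u2764\uFE0F\u200D\U0001F468"     # woman, ZWJ, heart, VS-16, ZWJ, man


def codepoints(s):
    """Return the codepoints of s in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]


def naive_cell_count(s):
    """Hypothetical naive width: count every codepoint that is not a
    combining mark (canonical combining class 0). This handles diacritics
    but knows nothing about joiners or variation selectors."""
    return sum(1 for c in s if unicodedata.combining(c) == 0)


composed = "\u00E9"       # é as a single codepoint
decomposed = "e\u0301"    # e + COMBINING ACUTE ACCENT

if __name__ == "__main__":
    for label, s in [("flag", US_FLAG), ("heart on fire", HEART_ON_FIRE), ("couple", COUPLE)]:
        print(label, len(s), codepoints(s), "naive cells:", naive_cell_count(s))
    # Same character, different codepoint counts depending on normalization.
    print(len(composed), len(decomposed))
    print(unicodedata.normalize("NFC", decomposed) == composed)
```

The naive count gets é right in both forms (one cell), but it reports the heart-on-fire sequence as four cells and the couple as six, when each should occupy a single double-width block. Getting those right needs grapheme-cluster segmentation plus emoji width data, which the standard library does not provide.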