| ▲ | Symbiote 6 days ago |
| Using the HN public dataset in Google BigQuery [0], which I think fits easily within the free query allowance:
SELECT
EXTRACT(YEAR FROM timestamp) AS year,
SUM(CASE WHEN text LIKE '%—%' THEN 1 ELSE 0 END) AS withDash,
COUNT(*) AS total,
SUM(CASE WHEN text LIKE '%—%' THEN 1 ELSE 0 END) / COUNT(*) AS fraction
FROM `bigquery-public-data.hacker_news.full`
WHERE type = 'comment'
GROUP BY year
ORDER BY year;
year withDash total fraction
2006 0 12 0.000
2007 13 70858 0.000
2008 461 247922 0.001
2009 1497 491034 0.003
2010 3835 842438 0.005
2011 4719 1044913 0.005
2012 5648 1246782 0.005
2013 7881 1665185 0.005
2014 8400 1510814 0.006
2015 9967 1642912 0.006
2016 12081 2093612 0.006
2017 14530 2361709 0.006
2018 19246 2384086 0.008
2019 23662 2755063 0.009
2020 27316 3243173 0.008
2021 32863 3765921 0.009
2022 34657 4062159 0.009
2023 36611 4221940 0.009
2024 32543 3339861 0.010
2025 30608 2231919 0.014
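The three-decimal rounding in the table hides some of the movement. As a quick check, here is a small Python sketch (Python is my choice here, not part of the original query) that recomputes the fraction at full precision for a few rows copied from the table:

```python
# Selected rows copied from the table above: year -> (withDash, total).
rows = {
    2010: (3835, 842438),
    2017: (14530, 2361709),
    2023: (36611, 4221940),
    2024: (32543, 3339861),
    2025: (30608, 2231919),
}

# Same arithmetic as the query's `fraction` column, without rounding.
fractions = {year: with_dash / total for year, (with_dash, total) in rows.items()}

for year, fraction in fractions.items():
    print(f"{year}: {fraction:.3%}")

# How much larger the 2025 share is than the 2023 share.
growth_2023_to_2025 = fractions[2025] / fractions[2023]
print(f"2025 vs 2023: {growth_2023_to_2025:.2f}x")
```

The 2025 row presumably covers only part of the year, so the totals are not comparable across years, but the fraction is.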
So there's definitely been an increase.
Querying for the users who use "—" most as a proportion of all their comments:
SELECT
`by`,
SUM(CASE WHEN text LIKE '%—%' THEN 1 ELSE 0 END) / COUNT(*) AS fraction,
COUNT(*) AS total,
MIN(timestamp) AS minTime,
MAX(timestamp) AS maxTime
FROM `bigquery-public-data.hacker_news.full`
WHERE
type = 'comment' AND
timestamp < '2022-11-30'
GROUP BY `by`
HAVING COUNT(*) > 100
ORDER BY fraction DESC
LIMIT 250;
zmgsabst uses them the most [1]; westoncb [2] is an older account that uses them fourth-most.
[0] https://console.cloud.google.com/marketplace/product/y-combi...
[1] https://news.ycombinator.com/threads?id=zmgsabst
[2] https://news.ycombinator.com/threads?id=westoncb |
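For readers without BigQuery access, the per-user query can be approximated locally. This is only a sketch of the same grouping logic in Python: the function name, the `(author, text)` tuple layout, and the sample authors are all hypothetical, and missing text is assumed to arrive as `None`.

```python
from collections import defaultdict

EM_DASH = "\u2014"


def rank_by_dash_fraction(comments, min_comments=100):
    """Rank authors by the share of their comments that contain an em dash.

    comments: iterable of (author, text) pairs; text may be None.
    Returns a list of (author, fraction, total), highest fraction first.
    """
    totals = defaultdict(int)
    with_dash = defaultdict(int)
    for author, text in comments:
        totals[author] += 1  # COUNT(*)
        if text and EM_DASH in text:  # text LIKE '%<em dash>%'
            with_dash[author] += 1
    ranked = [
        (author, with_dash[author] / total, total)
        for author, total in totals.items()
        if total > min_comments  # HAVING COUNT(*) > 100
    ]
    ranked.sort(key=lambda row: row[1], reverse=True)  # ORDER BY fraction DESC
    return ranked


# Hypothetical sample data, with a low threshold so the tiny sample passes it.
sample = [
    ("alice", f"yes {EM_DASH} really"),
    ("alice", "plain comment"),
    ("alice", f"again{EM_DASH}here"),
    ("bob", "plain comment"),
    ("bob", None),
    ("bob", f"one {EM_DASH} dash"),
    ("carol", f"only{EM_DASH}one"),
]
ranking = rank_by_dash_fraction(sample, min_comments=2)
print(ranking)
```

The minimum-comment threshold matters: without it, an account with a single comment containing a dash would top the list at a fraction of 1.0.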
|
| ▲ | data-ottawa 5 hours ago | parent | next [-] |
| Worth noting that in 2025 we’ve started talking about the em dash as a sign of AI. You could probably remove any comment with the word “em” in it (we can assume comments on em height in CSS have the same em dash frequency) |
|
| ▲ | hithereagain 6 days ago | parent | prev | next [-] |
| Older people, say folks in their forties or older, grew up with the em dash. |
| |
| ▲ | JdeBP 6 days ago | parent | next [-] | | That's backwards. People in that age bracket grew up with computers where the em dash was not in the character set at all, and typewriters and terminals only had a minus key. The people who grew up with the em dash are the younger HTML generation of 30 years ago where &mdash; was at least a reasonably convenient character entity even if they were using computers with the various 8-bit character sets that did not contain it. | | |
| ▲ | jml78 6 days ago | parent | next [-] | | Correct, I am 46, grew up with BBS. Early internet. I will be honest, never knew the name of em dash until it became a GPT thing. | | |
| ▲ | JdeBP 6 days ago | parent | next [-] | | ... meaning that you have read some posts on this page a certain way. (-:
--- IM2000
* Origin: Some WWW site named Hacker News (2:257/609.3)
| |
| ▲ | YVoyiatzis 6 days ago | parent | prev [-] | | # Dash Usage Guide *Hyphen (-)* = word-joiner *En dash (–)* = “to/between” *Em dash (—)* = pause, punch, drama |
| |
| ▲ | reaperducer 6 days ago | parent | prev | next [-] | | > That's backwards. People in that age bracket grew up with computers where the em dash was not in the character set at all, and typewriters and terminals only had a minus key.
I guess you weren't there. We did em-dashes on typewriters. We just turned the platen knob down one click, typed _, and turned it back. | | |
| ▲ | npsomaratna 6 days ago | parent | next [-] | | Anecdotally, what I've seen is that folks who learned typing in the 80s and earlier use two dashes '--' instead of the em-dash (although modern word processors seem to replace this combination with the em-dash). Something else I've noticed is their tendency to use two blank spaces between sentences. I'm a self-taught typist, with all the quirks that come with it (can type programming stuff very accurately at 100+ WPM; can type normal stuff at a high WPM as well, but the error rate goes up). | | |
| ▲ | Breza 7 hours ago | parent [-] | | I learned to type in the nineties and I used two hyphens. I also learned to put two spaces between sentences but dropped that in the oughts. |
| |
| ▲ | ted_dunning 6 days ago | parent | prev [-] | | None of us at our house did that. | | |
| ▲ | reaperducer 5 days ago | parent [-] | | That doesn't mean it didn't happen. Your house is not the only house. Moreover, your home is not representative of the millions of typewriters in businesses around the world. |
|
| |
| ▲ | JKCalhoun 6 days ago | parent | prev [-] | | True, but when desktop publishing arrived on the Mac, I embraced it. | | |
| |
| ▲ | jnwatson 6 days ago | parent | prev [-] | | Older people that grew up with "desktop publishing" and "The Mac is not a Typewriter" grew up with the em dash. | | |
|
|
| ▲ | LeoPanthera 6 days ago | parent | prev [-] |
| I took a peek at zmgsabst's comments, but they use them with spaces around the dash — like this. ChatGPT always uses them without spaces—like this. |
| |
| ▲ | Symbiote 6 days ago | parent | next [-] | | Changing the filter to text LIKE '%—%' AND text NOT LIKE '% —%' AND text NOT LIKE '%— %'
puts westoncb in the lead, followed by mucholove, trebbble, _zzaw and lexcorvus. | | |
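A rough local equivalent of that three-part filter, as a Python sketch (the function name is hypothetical, and this mirrors only the substring logic, not BigQuery itself):

```python
EM_DASH = "\u2014"


def has_only_unspaced_em_dashes(text):
    """True if text has an em dash and none with a space directly before or after it."""
    return (
        EM_DASH in text  # text LIKE '%<em dash>%'
        and f" {EM_DASH}" not in text  # text NOT LIKE '% <em dash>%'
        and f"{EM_DASH} " not in text  # text NOT LIKE '%<em dash> %'
    )


spaced = f"with spaces around the dash {EM_DASH} like this"
unspaced = f"without spaces{EM_DASH}like this"
mixed = f"{spaced}. {unspaced}."

print(has_only_unspaced_em_dashes(spaced))
print(has_only_unspaced_em_dashes(unspaced))
print(has_only_unspaced_em_dashes(mixed))
```

Note that the filter works per comment, not per dash: a comment that contains even one spaced em dash is excluded, even if it also contains unspaced ones, so a comment demonstrating both styles would not be counted.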
| ▲ | westoncb 6 days ago | parent [-] | | I actually tweeted like a month ago that I was the reason LLMs use em dashes so much lol: https://x.com/Westoncb/status/1961802304698671407 | | |
| ▲ | JdeBP 6 days ago | parent [-] | | There are quite a few —es on my WWW site and on StackExchange thanks to me; and I vaguely recall that I might even have written one on Wikipedia once. But I am quite happy for you to take the blame for training the LLMs. (-: | | |
| ▲ | westoncb 5 days ago | parent [-] | | lol no problem. In reality though there's kind of a funny story behind it because I suspect the way I ended up using them so much is similar to how ChatGPT did. When I got into writing I studied grammar, then decided to read a bunch of classics and analyze their usage of punctuation in general until I had a good understanding of every bit of it. Then, in order to practice, I'd apply what I learned to anything I was writing at the time whether journal notes, conversations on AIM/IRC etc. That latter step meant I was translating a lot of casual/natural speech into a form that also had a high level of 'correctness'. And if you faithfully translate natural speech into 'correct'ly punctuated sentences, you end up using a lot of em dashes. Because ChatGPT/LLMs are tuned for natural/authentic style, as well as for a high degree of 'correctness,' you get today's state of affairs. Just a theory. |
|
|
| |
| ▲ | Rumudiez 6 days ago | parent | prev | next [-] | | The rule is spaces on both sides of an en dash – like so – or an em dash without any spaces—like this. Important to note that the US keyboard layout does not have either of these or the minus glyph, just the hyphen, and it’s inadvisable to mix multiple styles. | |
| ▲ | eMPee584 6 days ago | parent | prev | next [-] | | & it looks awful without spaces — imho | | |
| ▲ | JKCalhoun 6 days ago | parent | next [-] | | Which is what I do (add a space before and after). I didn't know you weren't supposed to put the spaces until someone pointed it out to me — suggested I was not an LLM because I added the spaces. Makes me wonder whether, if kerning were done correctly, the em-dash would look like it had spaces before and after even when it did not. | | | |
| ▲ | perilunar 4 days ago | parent | prev | next [-] | | You can also use an em-dash with thin spaces (U+2009) or hair spaces (U+200A), but it doesn't work on HN—they just display as regular spaces. | |
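The characters mentioned in this subthread are easy to confuse by eye. As a small reference sketch, Python's standard `unicodedata` module can print their code points and official Unicode names (the choice of characters to list is mine):

```python
import unicodedata

# Hyphen, en dash, em dash, minus sign, thin space, hair space.
chars = ["\u002d", "\u2013", "\u2014", "\u2212", "\u2009", "\u200a"]

# Map "U+XXXX" labels to the official Unicode character names.
names = {f"U+{ord(char):04X}": unicodedata.name(char) for char in chars}

for code_point, name in names.items():
    print(code_point, name)
```

Only the first of these, U+002D, is the character that the plain hyphen key on a US keyboard produces.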
| ▲ | colanderman 6 days ago | parent | prev [-] | | The common guidance I've seen is en dash with spaces, em dash without. |
| |
| ▲ | indigodaddy 6 days ago | parent | prev [-] | | I always thought the proper usage was no space before but one space after-- like this. | | |
| ▲ | wizzwizz4 6 days ago | parent [-] | | There's no "proper usage" for any feature of English: it's all by consensus. However, I have seen that in published books from the 1900s. |
|
|