lapcat 8 hours ago

You’ve got it backwards: LLMs were trained on human writing and appropriated our style.

have_faith 8 hours ago | parent [-]

Partially true. They've been trained and then aligned towards a preferred style. Their use of em-dashes isn't explained by over-representation in the training material (the majority of people don't use them).

lapcat 7 hours ago | parent [-]

It seems likely that with the written word, as with most things, a minority of people produce the majority of content. Most people publish relatively few words compared to professional writers.

Possibly the LLM vendors could bias the models more toward nonprofessional content, but then the quality and utility of the output would suffer. Skip the scientific articles and books, focus on rando internet comments, and you’ll end up with a lot more crap than you already get.