| ▲ | Someone1234 7 hours ago |
| They cannot. Unfortunately many believe they can, and it is impossible to disprove. So now real people need to write avoiding certain styles, because a lot of other people have decided those are "LLM clues." Bullets, em dashes, certain common English phrases or words (e.g. Delve, Vibrant, Additionally, etc.)[0]. Basically you need to sprinkle subtle mistakes, or lower the quality of your written communications, to avoid accusations that will side-track whatever you're writing into a "you're a witch" argument. Ironically, LLM accusations are now a sign of the high-quality written word. [0] https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing |
|
| ▲ | alex43578 6 hours ago | parent | next [-] |
| Someone with native fluency in American English can (and should) be able to tell the difference between human writing and unpolished AI copy-paste. Essentially 0 people use emoji to create a bulleted list. Nobody unintentionally cites fake legal precedents or non-existent events, articles, or papers. Even the “it’s not X, it’s Y” structure, in the presence of other suspicious style/tone cues, signals LLM text. |
| |
| ▲ | prmph 6 hours ago | parent | next [-] | | Also, one big tell that is hard to hide is making verbose lists with fluff but little actual informative content. Ask an LLM to read your project specs and add a section headed "Performance Optimizations" to see an example of this. Another is a certain punchy and sensationalist style that does not change throughout a longer piece of writing. | | |
| ▲ | alex43578 6 hours ago | parent | next [-] | | One of my subtle favorites is the “H2 Heading with: Colorful Description”. E.g., “The Strait of Hormuz: Chokepoint or Opportunity?” | | |
| ▲ | Filligree 6 hours ago | parent [-] | | I’ve used titles like that for thirty years. | | |
| ▲ | lelanthran 6 hours ago | parent | next [-] | | I'm going to ask the question I ask everyone who claims that they wrote like that for years: Can you show us a link from prior to 2022 where you wrote like that? | | |
| ▲ | Filligree 5 hours ago | parent | next [-] | | No, of course not. It’s all corporate internal documentation. I suppose my high school essays were not. Apologies, but those are lost. | |
| ▲ | joquarky 3 hours ago | parent | prev [-] | | Nobody owes you evidence for your witch hunts. | | |
| ▲ | lelanthran 2 hours ago | parent [-] | | Sure, but look, we have seen these claims so many times that if it were true, by now someone would have linked at least one archived blog post to show that this is, indeed, how humans used to write. The lack of a single example is very telling. |
|
| |
| ▲ | fwip 6 hours ago | parent | prev [-] | | Sure, and an LLM-written article will use that pattern eight times in two pages. |
|
| |
| ▲ | roncesvalles 6 hours ago | parent | prev [-] | | Exactly, it's the monotony of the style that gives it away. |
| |
| ▲ | jcims 6 hours ago | parent | prev | next [-] | | >Even the “it’s not X, it’s Y” structure I wonder where some of this comes from. Another one is 'real unlock', it's not a common phrasing that I really recall. https://trends.google.com/explore?q=real%2520unlock&date=all... | |
| ▲ | derwiki 6 hours ago | parent | prev | next [-] | | Emojis for lists: completely agree with you, but presumably this was learned in training? | | |
| ▲ | alex43578 6 hours ago | parent [-] | | I think that’s an RLHF issue - if you ask people “which looks better”, they too frequently pick the emoji list. Same with the overuse of bolding. I think it’s also why the more consumer-facing models are so fawning: people like to be praised. |
| |
| ▲ | EagnaIonat 6 hours ago | parent | prev | next [-] | | > 0 people use emoji to create a bulleted list. I haven't seen this yet, but I guess the only reason I haven't done it is because it never crossed my mind. What I have found to be an easy tell is non-breaking spaces. They tend to get littered through the passages of text without reason. | |
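[Editor's note: as a quick illustration of the non-breaking-space tell mentioned above (not from the thread, and the character set checked is my own assumption), a minimal check might look like:]

```python
# Count non-breaking spaces in a passage. Plain human-typed text normally
# contains none; rich-text/LLM pipelines sometimes litter them between words.
def nbsp_count(text: str) -> int:
    suspects = {"\u00a0", "\u202f"}  # NO-BREAK SPACE and NARROW NO-BREAK SPACE
    return sum(1 for ch in text if ch in suspects)

print(nbsp_count("plain ascii text"))            # 0
print(nbsp_count("odd\u00a0spacing\u00a0here"))  # 2
```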
| ▲ | peter-m80 an hour ago | parent | prev | next [-] | | I do use bullets and emojis | |
| ▲ | fleebee 6 hours ago | parent | prev | next [-] | | I think the trope in this comment[0] from another thread is the most obvious tell, perhaps even more than "not x, but y". > It’s the fake drama. Punchy sentences. Contrast. And then? A banal payoff. It's great because it's a double-decker of annoying marketing copy style and nonsensical content. [0]: https://news.ycombinator.com/item?id=47615075 | |
| ▲ | bjourne 4 hours ago | parent | prev [-] | | [dead] |
|
|
| ▲ | mulr00ney 6 hours ago | parent | prev | next [-] |
| > Unfortunately many believe they can, and it is impossible to disprove. So now real people need to write avoiding certain styles, because a lot of other people have decided those are "LLM clues." Bullets, EM Dash, certain common English phases or words (e.g. Delve, Vibrant, Additionally, etc)[0]. I think people will be able to detect the lowest-user-effort version of LLM text pretty reliably after a while (i.e. what you describe; many people have a good sense of LLM clues). But there's probably a *ton* of LLM text out there where some of the instructions given were "throw a few errors in", "don't use bullet points or em dashes", "don't do the `it's not this, it's that` thing" going undetected. And then those changes will get built into ChatGPT's main instructions, and in a few months people will start to pick up on other indicators, and then slightly smarter/more motivated users will give new instructions to hide their LLM usage... (or everyone stops caring, which is an outcome I find hard to wrap my head around) |
|
| ▲ | sheepscreek 6 hours ago | parent | prev | next [-] |
| This is the correct answer. We’re at a point where it will soon be safer to assume a human, or someone with agency and their approval, wrote the text than to dismiss it outright as “written by an LLM.” So judge the content on its merits irrespective of its source. |
|
| ▲ | loloquwowndueo 7 hours ago | parent | prev | next [-] |
| The key insight is to avoid – em dashes. You’re absolutely right. It’s not the content, it’s the style. |
| |
| ▲ | sanex 6 hours ago | parent | next [-] | | Ironically one of the big tells for me is the "It's not this. It's that."
Your comment uses a comma though so you're probably a real person :) | | |
| ▲ | rcxdude 6 hours ago | parent | next [-] | | I assume they were aping those terms ironically (especially given the 'you're absolutely right') | |
| ▲ | loloquwowndueo 6 hours ago | parent | prev [-] | | Busted!!!! Staccato (too many short sentences with periods) is also a telltale for me. Most humans prefer longer sentences with more varied punctuation; I, for example, am a sucker for run-on sentences. |
| |
| ▲ | LoganDark 6 hours ago | parent | prev [-] | | That's an en-dash. | | |
| ▲ | sumeno 6 hours ago | parent | next [-] | | You're absolutely right! I unintentionally used an en-dash instead of an em-dash. Here is the em-dash you requested: – | |
| ▲ | loloquwowndueo 6 hours ago | parent | prev [-] | | Sorry! Is this ok? — | | |
|
|
|
| ▲ | fortran77 6 hours ago | parent | prev | next [-] |
| And I'm sure we've all seen what happens if you run the Declaration of Independence or the Gettysburg Address or the book of Genesis through an AI "detector". They usually come back as AI. |
| |
| ▲ | spindump8930 6 hours ago | parent [-] | | Only for poor-quality systems. Unfortunately there are many systems that tried to cash in on easy hype but are the equivalent of an ML 101 classifier class project. If one measures perplexity (how likely text is under a certain language model), common text in the training set will score as very likely. But you can easily build better models. |
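[Editor's note: a toy sketch of the perplexity idea described above, using a hypothetical unigram model of my own invention rather than the neural language models real detectors use; the arithmetic is the same idea:]

```python
import math

# Perplexity: how "surprised" a model is by a text. Lower = more likely
# under the model. Here the "model" is just a token-probability dict.
def perplexity(tokens, probs, floor=1e-6):
    # Unseen tokens get a small floor probability instead of zero.
    log_sum = sum(math.log(probs.get(t, floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))

model = {"the": 0.5, "cat": 0.3, "sat": 0.2}
print(perplexity(["the", "cat", "sat"], model))  # low: text the model expects
print(perplexity(["zyx", "qqq"], model))         # high: out-of-model text
```

A naive detector thresholds this score, which is exactly why memorized training text like the Gettysburg Address comes back as "likely machine-generated."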
|
|
| ▲ | Joel_Mckay 6 hours ago | parent | prev | next [-] |
| Indeed, isomorphic plagiarism by its nature forms strong vector search paths, built from scraping global websites, real people's work, and LLM user-base input/markdown. However, reasoning models adding a random typo to seem less automated still do not hide the fairly repeatable quantized artifacts from the training process. For LLMs, it is rather trivial to find where people originally scraped the data from if they still have annotated training metadata. Finally, reading LLM output is usually clear once one abandons the trap of thinking "I think the author meant [this/that]" and recognizes a work's tone reads like a fake author had a stroke [0]. =3 [0] https://en.wikipedia.org/wiki/Stroke |
|
| ▲ | lelanthran 6 hours ago | parent | prev [-] |
| > Ironically LLM accusations are now a sign of the high quality written word. Citation needed. The LLM accusations come from the specific cadence they use. You can remove all em-dashes from a piece of text and it is still clear when something is LLM-written. Can they be prompted to be less obvious? Sure, but hardly anyone does that. It's more "The Core Insight", "The Key Takeaway", etc. than it is about em-dashes. Incidentally, the only people annoyed about "witch-hunts" tend to be those who are unable to recognise cadence in the written word. |
| |
| ▲ | order-matters 5 hours ago | parent [-] | | i think another part of the problem is that some people are using AI so much that they are starting to mimic its cadence in their own writing. they may have had a prior coincidental predisposition for writing somewhat similar to AI with worse grammar, and now are inching towards alignment as they either intentionally or accidentally use AI output as a model to improve their writing |
|