simonw 4 hours ago

I'd like to see some concrete examples that illustrate this - as it stands this feels like an opinion piece that doesn't attempt to back up its claims.

(Not necessarily disagreeing with those claims, but I'd like to see a more robust exploration of them.)

barrkel 4 hours ago | parent | next [-]

Have you not seen it any time you've put a substantial piece of your own writing through an LLM for advice?

I disagree pretty strongly with most of what an LLM suggests by way of rewriting. They're absolutely appalling writers. If you're looking for something beyond corporate safespeak or stylistic pastiche, they drain the blood out of everything.

The skin of their prose lacks the luminous translucency, the subsurface scattering, that separates the dead from the living.

simonw 2 hours ago | parent | next [-]

The prompt I use for proof-reading has worked great for me so far:

  You are a proofreader for posts
  about to be published.

  1. Identify spelling mistakes
  and typos
  2. Identify grammar mistakes
  3. Watch out for repeated terms like
  "It was interesting that X, and it
  was interesting that Y"
  4. Spot any logical errors or
  factual mistakes
  5. Highlight weak arguments that
  could be strengthened
  6. Make sure there are no empty or
  placeholder links

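
Item 3 in that checklist (repeated clause openers) is the kind of check you can also run deterministically before handing a draft to a model. A minimal sketch; the function name, the clause-splitting rule, and the threshold are illustrative assumptions, not part of the prompt above:

```python
import re
from collections import Counter

def repeated_openers(text, min_count=2):
    """Return opener phrases (first three words of a clause) that appear
    at least min_count times, e.g. 'it was interesting that X... it was
    interesting that Y'."""
    # Split crudely into clauses on sentence/clause punctuation.
    clauses = re.split(r"[.;,]\s+", text.lower())
    openers = Counter(
        " ".join(words[:3])
        for clause in clauses
        if len(words := clause.split()) >= 3
    )
    return {phrase: n for phrase, n in openers.items() if n >= min_count}

sample = ("It was interesting that X happened. "
          "It was interesting that Y happened too.")
print(repeated_openers(sample))  # {'it was interesting': 2}
```

This obviously misses paraphrased repetition, which is where the LLM pass earns its keep.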
matwood 3 hours ago | parent | prev | next [-]

> If you're looking for something beyond corporate safespeak

AI has been great for removing this stress. "Tell Joe no f'n way" in a professional tone and I can move on with my day.

dsf2d 2 hours ago | parent [-]

Yeah but does it make sense to have invested all this money for this?

Lol, no. It might be great for you as a consumer using these products for free, but expand the picture.

matwood an hour ago | parent [-]

> Yeah but does it make sense to have invested all this money for this?

No, but it's here. Why wouldn't I use it?

Terretta 2 hours ago | parent | prev [-]

> If you're looking for something beyond corporate safespeak or stylistic pastiche, they drain the blood out of everything.

Strong agree, which is why I disagree with this OP point:

“Stage 2: Lexical flattening. Domain-specific jargon and high-precision technical terms are sacrificed for "accessibility." The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym, effectively diluting the semantic density and specific gravity of the argument.”

I see enough jargon in everyday business email that, around the office, zero-shot LLM unspoolings can feel refreshing.

I have "avoid jargon and buzzwords" as one of the few tiny tuners in my LLM prefs. I've found LLMs can shed corporate safespeak, or even add a touch of sparkle back to a corporate memo.

Otherwise very bright writers have been "polished" by pre-LLM corporate homogenization until all interestingness is gone. Give them a prompt that yells at them for using 1-in-10 words instead of 1-in-10,000 "perplexity" and they can tune themselves back to conveying more with the same word count. Results… scintillate.
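
The "1-in-10 vs 1-in-10,000" framing can be made concrete with word surprisal: under a unigram model, a word with probability p carries -log2(p) bits of information, so substituting a common synonym for a rare term literally lowers the information content per word. A toy sketch; the probabilities below are invented for illustration only:

```python
import math

# Invented unigram probabilities, purely for illustration.
word_prob = {
    "avoid": 1 / 100,     # common synonym, ~1-in-100
    "shun": 1 / 10_000,   # rarer, more specific term, ~1-in-10,000
}

def surprisal_bits(word):
    """Information content of a word under a unigram model: -log2 p(word)."""
    return -math.log2(word_prob[word])

print(surprisal_bits("avoid"))  # ~6.6 bits
print(surprisal_bits("shun"))   # ~13.3 bits
```

Swapping "shun" for "avoid" roughly halves the per-word information under this toy model, which is one way to read the OP's "lexical flattening" claim.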

furyofantares 3 hours ago | parent | prev | next [-]

Look through my comment history at all the posts where I complain that the author might have had something interesting to say, but it's been erased by the LLM: you can no longer tell what the author cared about, because the entire post is in an oversold, monotone advertising voice.

https://news.ycombinator.com/item?id=46583410#46584336

https://news.ycombinator.com/item?id=46605716#46609480

https://news.ycombinator.com/item?id=46617456#46619136

https://news.ycombinator.com/item?id=46658345#46662218

https://news.ycombinator.com/item?id=46630869#46663276

https://news.ycombinator.com/item?id=46656759#46663322

https://news.ycombinator.com/item?id=46661936#46663362

https://news.ycombinator.com/item?id=46748077#46749699

internet_points 37 minutes ago | parent | prev | next [-]

I just sent TFA to a colleague of mine who was experimenting with LLMs for auto-correcting human-written text, since she noticed the same phenomenon: it would correct not only mistakes, but also slightly nudge words towards more common synonyms. It would often lose important nuances, so "shun" would be corrected to "avoid", "divulge" would become "disclose", etc.
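
That kind of nudge is easy to surface mechanically by diffing the original against the model's "corrected" version and listing word-level replacements. A sketch using Python's standard-library difflib; the function name is my own and the example strings just mirror the substitutions described above:

```python
import difflib

def word_substitutions(original, corrected):
    """Return (old, new) word-span pairs that the edit replaced,
    ignoring pure insertions and deletions."""
    a, b = original.split(), corrected.split()
    sm = difflib.SequenceMatcher(a=a, b=b)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag == "replace"
    ]

print(word_substitutions(
    "do not shun or divulge details",
    "do not avoid or disclose details",
))  # [('shun', 'avoid'), ('divulge', 'disclose')]
```

Feeding the flagged pairs back for human review keeps the typo fixes while catching the synonym drift.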

gdulli 3 hours ago | parent | prev | next [-]

Kaffee: Corporal, would you turn to the page in this book that says where the mess hall is, please?

Cpl. Barnes: Well, Lt. Kaffee, that's not in the book, sir.

Kaffee: You mean to say in all your time at Gitmo, you've never had a meal?

Cpl. Barnes: No, sir. Three squares a day, sir.

Kaffee: I don't understand. How did you know where the mess hall was if it's not in this book?

Cpl. Barnes: Well, I guess I just followed the crowd at chow time, sir.

Kaffee: No more questions.

NitpickLawyer 4 hours ago | parent | prev [-]

It is an opinion piece. By a dude working as a "Professor of Pharmaceutical Technology and Biomaterials at the University of Ferrara".

It has all the hallmarks of not understanding the underlying mechanisms while repeating the common tropes. Quite ironic, considering what the author's intended "message" is. JPEG -> JPEG -> JPEG is bad, so LLM -> LLM -> LLM must be bad, right?

It reminds me of the media reception of that paper on model collapse: "training on LLM-generated data leads to collapse." That was in '23 or '24? Yet we're not seeing any collapse, despite models being trained largely on synthetic data for the past two years. That's not how any of it works. Yet everyone has an opinion on how badly it works. Jesus.

It's insane how these kinds of opinion pieces get so upvoted here, while worthwhile research, cool positive examples, and so on linger in /new with one or two upvotes. This has ceased to be a technical subject and has moved to muh identity.

simonw 4 hours ago | parent | next [-]

Yeah, reading the other comments on this thread, this is a classic example of that Hacker News (and online forums in general) thing where people jump at the chance to talk about a topic driven purely by the headline, without engaging with the actual content.

(I'm frequently guilty of that too.)

ghywertelling 4 hours ago | parent [-]

Even if that isn't the case, isn't it a fact that the AI labs don't want their models to be edgy in any creative way, that they choose a middle way (Buddhism, so to speak)? Are there AI labs training their models to be maximally creative?

PurpleRamen 3 hours ago | parent | prev [-]

> Yet we're not seeing any collapse, despite models being trained mainly on synthetic data for the past 2 years.

Maybe because researchers learned from the paper to avoid the collapse? Just awareness alone often helps to sidestep a problem.

NitpickLawyer 3 hours ago | parent [-]

No one did what the paper actually proposed. It was a nothingburger in the industry, yet it was insanely popular on social media.

Same with the "LLMs don't reason" paper from "Apple" (two interns working at Apple, but anyway). The media went nuts over it, even though it was littered with implementation mistakes and not worth the paper it was(n't) printed on.

dsf2d 2 hours ago | parent [-]

Who cares? This is a place where you should be putting forth your own perspective based on your own experience, not parroting what someone else already wrote.