furyofantares a day ago

> It isn’t the actual thinking that drove the model’s actions in a session- but a summary of the thinking logic. This is like using saving a jpeg as a .bmp and then editing the .bmp and presenting it as a .jpeg. The conversion produces data loss.

You've got that backwards, .bmp is a lossless format and .jpeg is the lossy one.

0o_MrPatrick_o0 a day ago | parent [-]

My bad! 10 points for House Slytherin!

altmanaltman a day ago | parent [-]

also a typo in the last sentence: "you're" vs. "your"

glaslong a day ago | parent | next [-]

Weirdly pleasant, if minor, signal of human authorship

Tomte a day ago | parent | next [-]

In a parallel universe LLMs have learned that (a) the training material contains many different orthographic errors and (b) that humans follow a non-obvious pattern when "deciding" which error to make, so that their generated output contains such errors, as well.

In our universe LLMs seem to have learned that those errors do not follow patterns in the aggregate and that they should not be emulated.

tekne a day ago | parent [-]

The raw pretrained models make the errors, I believe -- we then reinforcement-learn them out.

Tomte a day ago | parent [-]

That's interesting! Do you have a paper or blog post at hand that shows examples of raw and RL'ed output?

Silagi a day ago | parent | prev | next [-]

I'm convinced this "signal" has already been hijacked. Maybe a Baader-Meinhof phenomenon, but I've noticed more and more egregious spelling errors that make little sense from a human perspective. Hop into whatever chatbot you'd like and ask it to "write a paragraph with subtle misspellings on long but common words", and you'll notice misspellings that just feel wrong, because they don't map to a clear misunderstanding that a person could have.

Or maybe I'm losing it after reading too much slop. Also distinctly possible.

glaslong a day ago | parent | next [-]

Nah I think you're probably right. I would guess that anyone actually paying attention to trying to make their slop sound human has easily instructed their skills to avoid some tells / inject others.

It's the general (lazy) usage of default model outputs that are still too clean.

It's pretty trivial to ask Haiku to "add cool kid no-caps and occasionally mix up 'their/there/they're' for authenticity"

FireBeyond a day ago | parent | prev [-]

About a month ago, I noticed that Claude decided I wanted my responses in UK English, not American. It couldn't explain why, but offered to note that in its directions. (Great, process tokens constantly to do what should be configurable from a dialog dropdown).

genxy a day ago | parent | prev | next [-]

Not for long!

altmanaltman a day ago | parent | prev [-]

Yeah, definitely it's a nice thing in today's context, weirdly. But also, you shouldn't really be making typos if you're writing an article and are using a basic spellcheck.

The text is clearly human-written just because it doesn't smell like AI (in this case, even if it was written by AI and produced this particular output, that's okay imo). I deal a lot with AI writing and writing in general, as I worked as an editor in another life, so it's natural to me to see writing and form an objective opinion on it.

0o_MrPatrick_o0 a day ago | parent | prev [-]

I missed my coffee! Ty! Five points to Slytherin.

altmanaltman a day ago | parent [-]

wait till my father hears about this!