capnrefsmmat · 10 hours ago
I work on research studying LLM writing styles, so I am going to have to steal this. I've seen plenty of lists of LLM style features, but this is the first one I've noticed that mentions "tapestry", which we found is GPT-4o's second-most-overused word (after "camaraderie", for some reason). [1]

We used a set of grammatical features in our initial style comparisons (like present participles, which GPT-4o loved so much that they were a pretty accurate classifier on their own), but it shouldn't be too hard to pattern-match some of these other features and quantify them.

If anyone who works on LLMs is reading, a question: when we've tried base models (no instruction tuning/RLHF, just text completion), they show far fewer stylistic anomalies like this. So it's not that the training data is weird; it's something in instruction tuning that's doing it. Do you ask the human raters to evaluate style? Is there a rubric? Why is instruction tuning pushing such a noticeable style shift?

[1] https://www.pnas.org/doi/10.1073/pnas.2422455122, preprint at https://arxiv.org/abs/2410.16107. Working on extending this to more recent models and other grammatical features now.
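(A crude illustration of the kind of grammatical feature the comment describes, not the authors' actual method: an "-ing" suffix check is only a rough proxy for present participles, since it also catches gerunds, and the example texts are made up.)

```python
import re

def ing_rate(text: str) -> float:
    """Fraction of words ending in '-ing', a noisy proxy for
    present-participle density (also catches gerunds, so this
    is a feature for a classifier, not a parse)."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    ing = [w for w in words if w.lower().endswith("ing") and len(w) > 4]
    return len(ing) / len(words)

llm_ish = "Navigating the shimmering tapestry, weaving threads of meaning."
plain = "We walked to the shop and bought bread."
print(ing_rate(llm_ish))  # 0.5 (4 of 8 words)
print(ing_rate(plain))    # 0.0
```

A real feature set would use a part-of-speech tagger rather than a suffix match, but even a rate this crude separates the two example sentences.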
djoldman · 9 hours ago
The RLHF is what creates these anomalies. See "delve", which has been attributed to RLHF annotators from Kenya and Nigeria. Interestingly, because pretraining optimizes for perplexity, the pretrained models should reflect the least surprising outputs of all.
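(For concreteness: perplexity is the exponentiated average negative log-likelihood of the tokens, so "least surprising" means lowest perplexity. A minimal sketch; the per-token probabilities here are invented for illustration, not from any model.)

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(-mean log p): lower means the
    sequence is less surprising under the model."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

bland      = [0.5, 0.4, 0.5, 0.6]  # hypothetical high-probability tokens
surprising = [0.05, 0.1, 0.02]     # hypothetical low-probability tokens
print(perplexity(bland) < perplexity(surprising))  # True
```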
| ||||||||
grey-area · 29 minutes ago
I wonder if the style shift has anything to do with training for conversation (i.e. tuning models to respond well in a chat setting)?
networked · 10 hours ago
You may be interested in my links on AI's writing style: https://dbohdan.com/ai-writing-style. I've just added your preprint and tropes.fyi. The page has "hydrogen jukeboxes: on the crammed poetics of 'creative writing' LLMs" by nostalgebraist (https://www.tumblr.com/nostalgebraist/778041178124926976/hyd...), which features an example with "tapestry".

> Why is the instruction tuning pushing such a noticeable style shift?

Gwern Branwen has been covering this: https://gwern.net/doc/reinforcement-learning/preference-lear....
red_hare · 3 hours ago
I wonder if it has to do with how meaning is tied to the tokens: c+amara+derie (using the official GPT-5 tokenizer).

There's also just that weird thing where they're obsessed with emoji, which I've always assumed is because emoji are the only logograms in English and therefore carry a lot of weight per byte.
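(The byte side of the "weight per byte" point is easy to see with the standard library; how many tokens each character becomes is a separate, tokenizer-specific question, e.g. under tiktoken's `o200k_base`, which isn't assumed installed here.)

```python
# ASCII letters are 1 byte in UTF-8, while emoji are typically 4,
# so a single emoji packs several times the bytes of a letter.
for ch in ["a", "é", "🙂"]:
    print(ch, len(ch.encode("utf-8")))
# a → 1 byte, é → 2 bytes, 🙂 → 4 bytes
```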
| ||||||||
albert_e · 7 hours ago
There is a company named Tapestry (the parent of Coach, Inc.). I wonder how they can avoid the trope while not censoring themselves out.