Cynddl | 5 days ago
Hi, author here. This is exactly what we tested in our article:

> Third, we show that fine-tuning for warmth specifically, rather than fine-tuning in general, is the key source of reliability drops. We fine-tuned a subset of two models (Qwen-32B and Llama-70B) on identical conversational data and hyperparameters, but with LLM responses transformed to have a cold style (direct, concise, emotionally neutral) rather than a warm one [36]. Figure 5 shows that cold models performed nearly as well as or better than their original counterparts (ranging from a 3 pp increase in errors to a 13 pp decrease), and had consistently lower error rates than warm models under all conditions (with statistically significant differences in around 90% of evaluation conditions after correcting for multiple comparisons, p<0.001). That cold fine-tuning produced no reliability drops suggests the drops stem specifically from the warmth transformation, ruling out training-process and data confounds.