ak681443 7 days ago

Isn't this just control vectors rediscovered?

https://www.lesswrong.com/posts/Bf3ryxiM6Gff2zamw/control-ve...

CephalopodMD 7 days ago | parent | next [-]

The added sauce here is they're using it to bias the model during training, not just using steering vectors at inference time (though they do mention that). This is apparently effective at making the intended change in behavior without the lobotomizing side effects that steering vectors can have.

benreesman 7 days ago | parent | prev | next [-]

I've apparently been referring to this as "whatever a control vector is called in 2025" since they started doing it to dilute tokens under load: https://news.ycombinator.com/item?id=44082733

supriyo-biswas 7 days ago | parent | prev [-]

Thank you for linking to that article; it makes clear what one would need to do to calculate control vectors.
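
For anyone who wants the gist in code, here is a minimal sketch of one common recipe: take the difference of mean activations between two contrasting prompt sets, then add a scaled copy of that vector to a hidden state. This is an assumption about the general technique, not necessarily what the linked article does (it may use something else, such as PCA over paired differences). The function names `control_vector` and `steer` are hypothetical, and the activations are toy NumPy arrays rather than real model states.

```python
import numpy as np

def control_vector(pos_acts, neg_acts):
    """Difference of mean activations between two contrasting prompt sets.

    pos_acts, neg_acts: arrays of shape (n_prompts, hidden_dim), assumed to be
    hidden states captured at the same layer (toy data in this sketch).
    """
    pos = np.asarray(pos_acts, dtype=float)
    neg = np.asarray(neg_acts, dtype=float)
    return pos.mean(axis=0) - neg.mean(axis=0)

def steer(hidden, vector, alpha=1.0):
    """Shift a hidden state along the control vector, scaled by alpha."""
    return np.asarray(hidden, dtype=float) + alpha * vector

# Toy activations: the two sets differ only along the first dimension.
pos = np.array([[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]])
neg = np.array([[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]])

v = control_vector(pos, neg)
steered = steer(np.zeros(3), v, alpha=0.5)
```

In a real model the same addition would presumably be applied to a layer's hidden states through a forward hook, with `alpha` controlling how hard the behavior is pushed.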