kingstnap · a day ago
That's strange. Nowadays you can just copy-paste weights and blocks into random places in a neural network and have it work (frankenmerging is a dark art). And you can do really aggressive model distillation using raw logits. But my guess is that this is more likely a case of them all sourcing a similar safety-tuning dataset or something. There are public datasets out there (of varying degrees of garbage) that can be used to fine-tune for safety, for example Anthropic's: https://huggingface.co/datasets/Anthropic/hh-rlhf
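
A minimal sketch of what "distillation using raw logits" typically means in practice, assuming a PyTorch setup (the function name, temperature default, and T^2 scaling are standard knowledge-distillation conventions, not something stated in the comment):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student distributions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # Multiplying by T^2 keeps gradient magnitudes comparable across temperatures
    # (the usual Hinton et al. convention).
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)
```

The student is then trained on this loss (often mixed with the ordinary cross-entropy on ground-truth labels), which transfers the teacher's full output distribution rather than just its hard predictions.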