ninja3925 3 hours ago

Out of curiosity, how was it discovered? You would have to look for it to find this linear combination.

jdiff 2 hours ago

Without the system prompt, asking its name results in it responding with the name of the model they're ripping from. That would certainly draw your eyes to the right places.

valleyer 2 hours ago

Why is this? Do labs reinforce the model name during training? I was under the impression that this sort of "self-knowledge" always came from the system prompt, but I guess not...

jdiff an hour ago

Yes. In this case, during fine-tuning. Other blurbs baked in during fine-tuning are also perfectly reproducible from the Nex model. The details in the linked issue are quite accessible.

Aurornis 3 hours ago

Check the linked GitHub issue. They explain their process.

Scroll past the first issue to find it. It’s further down.