hbarka an hour ago

How can they be diverging? LLMs are built on similar foundations, i.e., the Transformer architecture. Do you mean the training method (RLHF) is diverging?