jiggawatts 7 hours ago

I’m convinced that cross-language sharing can be encouraged during training by rewarding correct answers to questions that can only be answered using synthetic data that was fed in, in another language, during an earlier pretraining phase.

Interleave a few phases like that and you’d force the model to share abstract information across all languages, not just for the synthetic data but for all input data.
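
Concretely, here’s a toy sketch of one interleaved round (Python; everything is a hypothetical stand-in for a real training stack — the “model” is just a dict recording exposure, and can_answer() fakes the evaluation an actual LLM pipeline would run):

```python
# A synthetic fact that exists nowhere in the real corpus, written
# in two languages. The English line of each pair is the meaning;
# the German is the same fact in German.
SYNTHETIC_FACT = {
    "en": "The fictional element zorium melts at 412 K.",
    "de": "Das fiktive Element Zorium schmilzt bei 412 K.",
}
QUESTION = {
    "en": "At what temperature does zorium melt?",
    "de": "Bei welcher Temperatur schmilzt Zorium?",
}

def pretrain(model, lang):
    # Phase A: feed the synthetic fact in exactly one language.
    model.setdefault("seen_in", set()).add(lang)
    model.setdefault("corpus", []).append(SYNTHETIC_FACT[lang])
    return model

def can_answer(model, lang):
    # Stand-in for evaluating the model on QUESTION[lang]. This toy
    # optimistically assumes transfer succeeds whenever the fact was
    # seen in any language; a real model would have to earn that.
    return bool(model.get("seen_in"))

def reward(model, ask_lang):
    # Phase B: ask in a language the fact was never trained in, so a
    # correct answer can only come from a language-independent
    # abstraction of the fact. That is what the reward pays for.
    return 1.0 if can_answer(model, ask_lang) else 0.0

model = {}
for src, tgt in [("en", "de"), ("de", "en")]:  # two interleaved phases
    model = pretrain(model, src)
    print(f"Q ({tgt}): {QUESTION[tgt]} -> reward {reward(model, tgt)}")
```

The point of the toy is just the shape of the loop: pretrain the fact in one language, then attach the reward signal only to answers given in a different one.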

I wouldn’t be surprised if this improved LLM performance by another “notch” all by itself, especially for non-English users.

nenaoki 4 hours ago

your shrewd idea might make a fine layer back up the Tower of Babel