jiggawatts | 7 hours ago |
I’m convinced that cross-language sharing can be encouraged during training by rewarding correct answers to questions that can only be answered using synthetic data that was fed in, in another language, during a previous pretraining phase. Interleave a few phases like that and you’d force the model to share abstract information across all languages, not just for the synthetic data but for all input data. I wouldn’t be surprised if this improved LLM performance by another “notch” all by itself, especially for non-English users.
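To make that concrete, here is a minimal sketch of what one interleaved phase could look like. Everything here is hypothetical scaffolding I'm inventing to illustrate the idea (Fact, pretrain_on, reward_finetune, interleaved_training are placeholders, not any real training API): facts appear only in language A during pretraining, and reward is only given for answering about them in language B.

```python
# Sketch of interleaved cross-lingual training (hypothetical API, not a real library).
# Facts are injected as synthetic text in language A during a pretraining phase, then
# the model is rewarded for answering questions about those facts in language B, so
# credit can only flow through representations shared across languages.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Fact:
    statement_lang_a: str   # synthetic statement, e.g. in German
    question_lang_b: str    # probe question, e.g. in English
    answer: str             # expected answer

def pretrain_on(model, corpus: List[str]) -> None:
    """Placeholder: one pretraining phase of next-token prediction over raw text."""

def reward_finetune(model, qa_pairs: List[Tuple[str, str]]) -> None:
    """Placeholder: reward phase that scores correct answers, regardless of which
    language the supporting fact was originally seen in."""

def interleaved_training(model, phases: List[List[Fact]]) -> None:
    for facts in phases:
        # Phase k: the facts only ever appear in language A...
        pretrain_on(model, [f.statement_lang_a for f in facts])
        # ...but reward is only given for answers to questions posed in language B.
        reward_finetune(model, [(f.question_lang_b, f.answer) for f in facts])
```

The point of the alternation is that the model never sees the fact and the question in the same language, so the only way to collect the reward is to store the fact in a language-agnostic form.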
nenaoki | 4 hours ago | parent |
Your shrewd idea might make a fine layer back up the Tower of Babel.