| ▲ | driese an hour ago | |
For the same reason that a human who is fluent in five languages can probably express themselves better in any one of them than a human who speaks only one, while also having a more nuanced understanding of grammar in general. From what I know, training on a more diverse set makes a model better overall. | ||
| ▲ | amelius 8 minutes ago | |
This might be an interesting research question: can you train a model on many languages, and then extract a much smaller model that knows only one language without much loss of quality? | ||