| ▲ | simedw 14 hours ago | |
Thank you. I had a quick look at Farsi datasets, and there seem to be a few options. That said, written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules? | ||
| ▲ | kranner 14 hours ago | |
> written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules?

You can't, but Farsi dictionaries list the missing short vowels (the diacritics, or "eraab") for every word. For instance, see this entry: https://vajehyab.com/dehkhoda/%D8%AD%D8%B3%D8%A7%D8%A8?q=%D8...

With the short vowel marked on the first letter, the word would be written حِساب (it is normally written as just حساب). The linked dictionary entry shows that there is a ِ on the first letter, ح.

You would still have to disambiguate between homographs that differ only in the eraab. | ||
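To make the dictionary idea concrete, here is a rough Python sketch, not anyone's actual pipeline. It assumes you already have a list of fully vocalized headwords; the three-entry lexicon below is made up for illustration, and `strip_short_vowels`, `build_index` and `candidates` are hypothetical names. It indexes the vocalized forms by their bare spelling, so a lookup returns every vocalization that matches. More than one result means you have hit a homograph and need context to pick one.

```python
import unicodedata

def strip_short_vowels(word: str) -> str:
    """Drop Unicode combining marks, which include the short-vowel diacritics."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))

# Illustrative toy lexicon of vocalized headwords (an assumption, not real
# dictionary data). Escapes are used so the diacritics are unambiguous.
VOCALIZED_ENTRIES = [
    "\u062D\u0650\u0633\u0627\u0628",  # حِساب  (kasra on the first letter)
    "\u06A9\u0650\u0631\u0645",        # کِرم   (one reading of کرم)
    "\u06A9\u064E\u0631\u064E\u0645",  # کَرَم   (another reading of کرم)
]

def build_index(entries):
    """Map each bare (unvocalized) spelling to all of its vocalized forms."""
    index = {}
    for entry in entries:
        index.setdefault(strip_short_vowels(entry), []).append(entry)
    return index

INDEX = build_index(VOCALIZED_ENTRIES)

def candidates(word: str, index=INDEX):
    """Return every vocalized form whose bare spelling matches `word`."""
    return index.get(strip_short_vowels(word), [])

if __name__ == "__main__":
    for bare in ["\u062D\u0633\u0627\u0628", "\u06A9\u0631\u0645"]:
        found = candidates(bare)
        status = "unique" if len(found) == 1 else "ambiguous, needs context"
        print(bare, found, status)
```

The lookup only narrows things down to a candidate list. Choosing among the candidates for a homograph is the hard part and is not attempted here.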