Remix.run Logo
jahsome 3 days ago

How fascinating: in a paragraph entirely _about_ language, written entirely in my first language, I can barely recognize a fair chunk of the terms.

mkw5053 3 days ago | parent [-]

Ha, fair point! Let me try again :)

People have tried using modern AI to crack lost languages. In some cases it works a bit. For example, a model learned to match Ugaritic (an ancient Semitic language) to Hebrew with no “dictionary” at all. In another case, a system called Pythia can guess missing letters in damaged Greek inscriptions with higher accuracy than human experts.

But with truly lost scripts like Linear A or Proto-Elamite, we run into two problems: there are only a few hundred very short texts, and we don’t have any bilingual “Rosetta Stone” to anchor them. AI can spot patterns, cluster symbols, and even suggest likely word boundaries, but it cannot yet produce actual translations. The current hope is to combine image recognition (to clean up messy symbols) with language models guided by rules of possible sound systems, then loop in human experts to check or reject the guesses.