thaumasiotes 4 days ago

> reST is more feature-complete and extension-friendly, but it is simply unusable for me because it wasn't designed for agglutinative languages like Korean.

How does whether you think of the language as agglutinative affect the usability of reST?

The biggest problem that occurs to me is that there isn't really a conceptual difference between an "agglutinative" language in which you have very long words expressing complex meanings, and an "isolating" language in which the same syllables occur in the same order with the same meaning but are thought of on a Platonic level as being all independent words.

This is because an "agglutinative" language is one in which syntax markers are more or less independent of any other syntax markers that may apply to the same word†, which means it's always possible by definition to consider those markers to be "words" themselves.

Would your problems be solved if you viewed what you had considered "long" Korean words as instead being several short words in a row? What difficulties does agglutination present?

† Compare: https://glossary.sil.org/term/agglutinative-language

> An agglutinative language is a language in which words are made up of a linear sequence of distinct morphemes and each component of meaning is represented by its own morpheme.

https://glossary.sil.org/term/isolating-language

> An isolating language is a language in which almost every word consists of a single morpheme.

lifthrasiir 4 days ago | parent | next [-]

> This is because an "agglutinative" language is one in which syntax markers are more or less independent of any other syntax markers that may apply to the same word†, which means it's always possible by definition to consider those markers to be "words" themselves.

I think SIL's definition is, while robust, not the usual definition because English can be regarded as agglutinative in this definition. This is particularly visible from the statement that most European languages are somewhat fusional [1], which is okay under their definitions but not the usual way we think of English.

In my understanding, analyticity is a spectrum, and highly analytic languages with most (but not necessarily all) words containing just one morpheme are said to be isolating. Words in agglutinative languages can be, but not necessarily have to be, analyzed as a main morpheme ("word") with dependent morphemes attached ("affixes"). Polysynthetic languages go further by allowing multiple main morphemes in one word. As languages become more synthetic (as opposed to analytic), the space-separated "word" is less useful [2] and segmentation gets harder and harder. reST's failure to support those languages is all about a bad assumption about segmentation.

[1] https://glossary.sil.org/term/fusional-language

[2] So much that several agglutinative languages---in which space-separated words can still be useful---don't even think about spacing, e.g. Japanese.

thaumasiotes 4 days ago | parent [-]

> I think SIL's definition is, while robust, not the usual definition because English can be regarded as agglutinative in this definition. This is particularly visible from the statement that most European languages are somewhat fusional, which is okay under their definitions but not the usual way we think of English.

Well, in the first place, I don't put much stock in the idea that "the usual way we think of" a language is a good way to determine the characteristics of that language. A good example here would be Finnish, which has a large number of particles that appear to be independent of the words they modify, but which are traditionally referred to as "case markers" by analogy to European languages that have case. Finnish is said to have an extraordinarily large number of cases, but that is because each Finnish preposition is called a "case".

In the second place, you can clearly see fusion in the English verb "be". You can see it less clearly in other places - Wikipedia's page on analytic languages calls out the third-person singular present verb ending for simultaneously encoding all three of those contrasts.

But I would say you're right in spirit that those are vestigial elements of the language. English verb structure looks very agglutinative to me; the biggest objection (which SIL's definition doesn't mention) would be that auxiliary verbs still inflect.

In particular, this:

> Words in agglutinative languages can be, but [do] not necessarily have to be, analyzed as a main morpheme ("word") with dependent morphemes attached ("affixes").

is actually the standard view of English verbs (except that the auxiliary verbs are not thought of as affixes), still taught in school, but contradicted by syntax classes that say that a dependent element shouldn't control the form of the element on which it depends. And then uncontradicted by practicing linguists who feel that we might as well follow the obvious semantic dependence.

Another objection, which I find more persuasive than "agglutinative particles shouldn't inflect", is that the meaning of a particular English word form isn't necessarily very tightly determined by the form. So in "he is painting a picture", the -ing element we see on "painting" is fundamentally there to agree with the continuous aspect marker "is", and it has other meanings in other contexts. In "he likes painting pictures", the same element is there to derive a noun from the verb.

And another objection might be that the languages we call agglutinative commonly incorporate subject and object into the interior of the verb, surrounded by other affixes, which isn't done in English unless you want to count phrasal verbs. ;D

I am undisturbed by the ambiguity; you might note that I led with the observation that agglutinative languages aren't well-defined in the first place.

None of this helps to explain why there might be a conflict between Markdown and agglutination, though.

lifthrasiir 4 days ago | parent | next [-]

I'm not here to argue about linguistic concepts, so let me cover just one thing:

> None of this helps to explain why there might be a conflict between Markdown and agglutination, though.

reST, not Markdown. (Yeah, I totally get it though, because I made the same mistake in the OP!) Those languages often need to highlight individual morphemes inside space-separated "words", but reST assumes by default that inline markup sits on space-separated "word" boundaries, hence the annoyance.
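
To make the boundary assumption concrete, here is a rough Python sketch. It is a simplified approximation of reST's inline markup recognition rule for emphasis, not docutils' actual implementation: the opening `*` has to start the text or follow whitespace or opening punctuation, and the closing `*` has to end the text or precede whitespace or closing punctuation. The names `EMPHASIS` and `emphasized` are made up for this illustration, and the Korean word 먹었다 (stem 먹, past marker 었, ending 다) is just an example I picked.

```python
import re

# Assumption: a simplified stand-in for reST's rule, not docutils' own regex.
# The opening "*" must not be preceded by an ordinary word character.
BEFORE = r"(?<![^\s'\"(\[{<\-/:])"
# The closing "*" must not be followed by an ordinary word character.
# A backslash is allowed so that escaped whitespace ("\ ") can follow.
AFTER = r"(?![^\s'\")\]}>\-/:.,;!?\\])"
EMPHASIS = re.compile(BEFORE + r"\*(?=\S)(.+?)(?<=\S)\*" + AFTER)


def emphasized(text):
    """Return the spans this simplified rule would treat as emphasis."""
    return EMPHASIS.findall(text)


# Markup on space-separated word boundaries is recognized.
print(emphasized("an *important* word"))

# Markup around a morpheme inside a word is not recognized.
print(emphasized("먹*었*다"))

# The workaround: backslash-escaped spaces around every marked morpheme.
print(emphasized("먹\\ *었*\\ 다"))
```

The last call shows the workaround reST documents for marking up part of a word: a backslash-escaped space on each side of the markup, which is removed from the output. Having to write that around every highlighted morpheme is the annoyance.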

mattclarkdotnet 4 days ago | parent | prev [-]

It’s amazing anyone can read, speak or write such a language!

thaumasiotes 3 days ago | parent [-]

Huh?

chrismorgan 4 days ago | parent | prev | next [-]

The key here is whether there’s a word separator, not agglutinativity or isolation. The term I find for this on a brief search is scriptio continua <https://en.wikipedia.org/wiki/Scriptio_continua>.

lifthrasiir 4 days ago | parent [-]

Yeah that would be a better way to phrase my opinion. Chinese is highly isolating but doesn't use spacing due to its writing system and therefore is heavily affected by this issue.

mattclarkdotnet 4 days ago | parent | prev [-]

These are descriptive terms though? It’s not like the language actually works that way