zelphirkalt 6 hours ago

All this complication seems to stem from the simple fact that the fences don't have recognizably distinct start and end markers. It's all "`" or "~", instead of one symbol at the start and a different symbol at the end. And then there are the different numbers of backticks or tildes. Why add such ambiguity, which only makes it harder to parse things correctly? This immediately raises the question: "What if I start a block with 4 backticks and end it with 5?"

All these complications would have been avoidable with a more thought-through design and better choices of symbols. For example, one could have used brackets:

    [[[lang
    code here
    ]]]

And if one wanted to nest it, it should automatically work:

    [[[html
    html code
    [[[css
    css code
    ]]]
    [[[js
    js code
    ]]]
    html code
    ]]]

In case one wants to output literally "[[[" one could escape it using backslash, as usual in many languages.

In a parser that would be much simpler to parse. It is kind of like parsing S-expressions. There is no need for 4 backticks, 5, or any higher number. I don't want to sit there counting backticks in the document, to know what part of a nested code block some code belongs to. It's a silly design.
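
To make the S-expression comparison concrete, here is a minimal sketch of a line-based parser for the proposed bracket syntax. The `Block` class and `parse_blocks` function are hypothetical names, and two rules are assumptions of this sketch rather than part of the proposal: delimiters must sit on their own line, and a leading backslash escapes a literal delimiter line.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A bracket-delimited block: a language tag plus text lines and child blocks."""
    lang: str
    children: list = field(default_factory=list)  # str lines or nested Block objects


def parse_blocks(text: str) -> Block:
    """Parse the hypothetical [[[lang ... ]]] syntax into a tree (sketch only)."""
    root = Block(lang="")
    stack = [root]  # innermost open block is stack[-1]
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[[["):
            # Opening delimiter: everything after it is the language tag.
            block = Block(lang=stripped[3:].strip())
            stack[-1].children.append(block)
            stack.append(block)
        elif stripped == "]]]":
            if len(stack) == 1:
                raise ValueError("unmatched ]]]")
            stack.pop()
        else:
            # Assumed escape rule: a leading backslash yields a literal delimiter line.
            if stripped.startswith(("\\[[[", "\\]]]")):
                line = line.replace("\\", "", 1)
            stack[-1].children.append(line)
    if len(stack) != 1:
        raise ValueError("unclosed [[[ block")
    return root
```

The whole thing is a stack push on open and a pop on close, which is the sense in which it resembles S-expression parsing. Note that the escape branch is the only part that has to modify the pasted content.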

charles_f 4 hours ago | parent | next [-]

Your solution for the problem described here is to escape with a different character. MD's is to add more special characters. Both are valid and exist in other languages; I wouldn't qualify one as better thought out than the other. Though since we're talking about text that I don't want modified, I prefer adding ticks rather than going into the text and escaping them one by one.

The complication doesn't stem from a lack of distinct start and end markers. What you are trying to solve here is the case where you have multiple languages in a single block and want pretty colors on each. Seeing that HTML doesn't support nesting of pre tags (or rather doesn't render one embedded in the next), that would probably not work without producing something that is not pure HTML.

> In a parser that would be much simpler to parse

Parsing a variable number of ` is not more complex than looking ahead for a closing boundary. In fact, once you introduce escaping characters, you need to handle escaping of the escaping character, which is slightly more complex.
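
For comparison, a rough sketch of the variable-length rule as CommonMark defines it: a closing fence must use the same character as the opening fence and be at least as long, so a block opened with 4 backticks is closed by 5. The `split_fenced` function is a hypothetical helper, and the sketch ignores parts of the spec (indentation stripping, container blocks, the ban on backticks in a backtick fence's info string).

```python
import re

# Up to 3 spaces of indent, then a run of 3+ backticks or 3+ tildes, then the rest.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def split_fenced(text: str):
    """Return (info_string, code) pairs for the fenced blocks in text (sketch only)."""
    blocks = []
    fence = None  # the opening fence run, e.g. "````", while inside a block
    info = ""
    body = []
    for line in text.splitlines():
        m = FENCE.match(line)
        if fence is None:
            if m:
                fence, info, body = m.group(1), m.group(2).strip(), []
        elif (m and m.group(1)[0] == fence[0]
              and len(m.group(1)) >= len(fence)
              and not m.group(2).strip()):
            # Same character, at least as long, nothing after it: this closes the block.
            blocks.append((info, "\n".join(body)))
            fence = None
        else:
            body.append(line)
    if fence is not None:
        # An unclosed fence runs to the end of the document.
        blocks.append((info, "\n".join(body)))
    return blocks
```

No stack and no escape handling are needed here, but the price is that blocks do not nest: shorter fences inside a longer one are just content.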

zelphirkalt 4 hours ago | parent [-]

The syntax highlighting of the code of each language itself is not the problem. This post is about markdown. A typical markdown parser doesn't do syntax highlighting for code blocks. That's usually done by some other library, like for example pygments. The issue is about markdown syntax. What happens on another language's level does not concern the markdown parser.

charles_f 4 hours ago | parent [-]

That's exactly my point, the solution you're discussing is about something else, and not relevant to what's discussed in this post.

zelphirkalt an hour ago | parent | next [-]

The solution I describe merely serves as an easier-to-parse way of nesting code blocks. I don't mean it to serve any syntax highlighting purpose, which I understand to be your impression. That would only be an outcome for tools that act upon the AST generated by the parser: tools that can take code of a programming language and color it. That is not the job of a markdown parser, for which my idea is meant.

_ache_ 4 hours ago | parent | prev [-]

So if syntax highlighting isn't a problem, the standard way of presenting a block of code in Markdown is to indent it.

Which is quick and easy to understand.

armchairhacker 5 hours ago | parent | prev | next [-]

> In case one wants to output literally "[[[" one could escape it using backslash, as usual in many languages.

Sometimes you want to paste a large region of code into a code block, and escaping the content is harder than fixing the start and end delimiters. This matters particularly in Markdown, where embedding large regions of code or text is common, whereas in other languages you’d put it in its own file.

So I still suggest the ability to change the number of open and close brackets. Then you’ll also need an implicit newline or other way to distinguish content that starts with an open bracket.
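
A small sketch of what "fix the delimiters instead of the content" looks like in practice, using Markdown's backtick fences for illustration. The `fence_for` helper is hypothetical; it picks a fence one longer than the longest delimiter run found in the pasted text, so the text itself is never touched.

```python
import re


def fence_for(code: str, lang: str = "", char: str = "`") -> str:
    """Wrap code in a fence longer than any run of `char` it contains (sketch only)."""
    runs = re.findall(re.escape(char) + "+", code)
    longest = max((len(r) for r in runs), default=0)
    # Never go below the 3-character minimum for a fence.
    fence = char * max(3, longest + 1)
    return f"{fence}{lang}\n{code}\n{fence}"
```

Looking at every run is stricter than necessary, since only runs that could be read as a closing fence matter, but it keeps the sketch short and is always safe.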

embedding-shape 5 hours ago | parent | prev | next [-]

Indeed! Last time I dealt with this exact problem in a toy application made for myself, I ended up making the markdown parser only read ```$LANG syntax, and making it assume just ``` is a closing tag, not accepting it as an opening tag. Made it easier for the pretty syntax formatter to do its job too, as it no longer has to figure out the language.
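
Roughly, that rule could look like the sketch below. This is one reading of it, not the actual code from that application: `label_lines` is a hypothetical name, and tracking open blocks on a stack (which allows nesting) is an assumption of this sketch.

```python
def label_lines(text: str):
    """Pair each content line with the language of its innermost open block.

    Sketch only: a fence followed by a language tag always opens a block,
    and a bare fence always closes the innermost one.
    """
    stack = []  # languages of the currently open blocks
    out = []
    for line in text.splitlines():
        s = line.strip()
        lang = s[3:].strip() if s.startswith("```") else ""
        if s == "```" and stack:
            stack.pop()  # bare fence: closing tag only
        elif lang and "`" not in lang:
            stack.append(lang)  # fence plus language: opening tag only
        else:
            # Ordinary line (or a stray bare fence outside any block).
            out.append((stack[-1] if stack else None, line))
    return out
```

Because opening and closing fences are told apart by the presence of a language tag, the start and end markers become distinct without changing the symbol used.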

_ache_ 4 hours ago | parent | prev [-]

Do you realize that your solution is basically to use a tag? Markdown was developed precisely to avoid using them.

The classic way in markdown to insert a block of code is to indent the code.

zahlman 38 minutes ago | parent | next [-]

The point of avoiding tags is to improve the ergonomics: you don't have to remember tag names, use a separate delimiting syntax anyway to indicate where the tag name is, and then repeat the tag name when you close the block. Especially given that this is for a block-level construct anyway, simply using a bracketing syntax isn't causing any of those problems.

Indenting inline code requires a text editor that makes indentation ergonomic or else extra effort per line; and it doesn't mesh well with lists or block quotes.

zelphirkalt an hour ago | parent | prev [-]

Well, if you want complex things like nested code blocks, then a kind of "tag" approach can be just the solution needed. Input-wise it doesn't really make a difference whether I have to type "[[[" and "]]]" or "```" and again "```". Whether or not my idea is more like a tag doesn't seem to have any repercussions. Outsourcing ever more complexity into the parser through bad design decisions, however, has a significant cost: it makes the development of parsers and grammars difficult.