eadmund 2 days ago

> I feel like not understanding why JSON won out is being intentionally obtuse.

I didn’t feel like my comment was the right place to shill for an alternative, but rather to complain about JSON. But since you raise it.

> JSON can easily be hand written, edited, and read for most data.

So can canonical S-expressions!

> Canonical S-expressions are not as easy to read and much harder to write by hand; having to prefix every atom with a length makes it very tedious to write by hand.

Which is why the advanced representation exists. I contend that this:

    (urn:ietf:params:acme:error:malformed
     (detail "Some of the identifiers requested were rejected")
     (subproblems ((urn:ietf:params:acme:error:malformed
                    (detail "Invalid underscore in DNS name \"_example.org\"")
                    (identifier (dns _example.org)))
                   (urn:ietf:params:acme:error:rejectedIdentifier
                    (detail "This CA will not issue for \"example.net\"")
                    (identifier (dns example.net))))))

is far easier to read than this (the first JSON in RFC 8555):

    {
        "type": "urn:ietf:params:acme:error:malformed",
        "detail": "Some of the identifiers requested were rejected",
        "subproblems": [
            {
                "type": "urn:ietf:params:acme:error:malformed",
                "detail": "Invalid underscore in DNS name \"_example.org\"",
                "identifier": {
                    "type": "dns",
                    "value": "_example.org"
                }
            },
            {
                "type": "urn:ietf:params:acme:error:rejectedIdentifier",
                "detail": "This CA will not issue for \"example.net\"",
                "identifier": {
                    "type": "dns",
                    "value": "example.net"
                }
            }
        ]
    }

> for a Canonical S-expression, you have to count how many characters you are typing/deleting, and then update the prefix.

As you can see, no you do not.

thayne 2 days ago | parent | next [-]

Your example uses s-expressions, not canonical s-expressions. Canonical s-expressions[1] are basically a binary format. Each atom/string is prefixed by the decimal length of the string and a colon. Their advantage over regular s-expressions is that there is no need to escape or quote strings with whitespace, and there is only a single possible representation for a given data structure. The disadvantage is that they are much harder for humans to read and write.
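
To make the length-prefix rule concrete, here is a minimal sketch of an encoder, assuming atoms are Python `str` or `bytes` values and lists are Python lists. The helper name `to_canonical` is hypothetical and not taken from any library or from the draft; display hints and the other representations are not handled.

```python
def to_canonical(expr) -> bytes:
    """Encode nested lists of str/bytes atoms in length-prefixed canonical form."""
    if isinstance(expr, str):
        # Assumption: text atoms are encoded as UTF-8 before measuring length.
        expr = expr.encode("utf-8")
    if isinstance(expr, bytes):
        # An atom is its byte length in decimal, a colon, then the raw bytes.
        return str(len(expr)).encode("ascii") + b":" + expr
    # A list is its encoded elements inside parentheses, with no separators.
    return b"(" + b"".join(to_canonical(item) for item in expr) + b")"


example = ["identifier", ["dns", "_example.org"]]
print(to_canonical(example))  # b'(10:identifier(3:dns12:_example.org))'
```

Because every atom carries its own length, a string containing spaces or quotes needs no escaping, and editing an atom by hand means recomputing its prefix.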

As for s-expressions vs json, there are pros and cons to each. S-expressions don't have any way to encode type information in the data itself, you need a schema to know if a certain value should be treated as a number or a string. And it's subjective which is more readable.

[1]: https://en.m.wikipedia.org/wiki/Canonical_S-expressions

eadmund a day ago | parent | next [-]

> Your example uses s-expressions, not canonical s-expressions.

I’ve always used ‘canonical S-expressions’ to refer to Rivest’s S-expressions proposal: https://www.ietf.org/archive/id/draft-rivest-sexp-13.html, a proposal which has canonical, basic transport & advanced transport representations which are all equivalent to one another (i.e., every advanced transport representation has a single canonical representation). I don’t know where I first saw it, but perhaps it was intended to distinguish from other S-expressions such as Lisp’s or Scheme’s?

Maybe I should refer to them as ‘Rivest S-expressions’ or ‘SPKI S-expressions’ instead.

> S-expressions don't have any way to encode type information in the data itself, you need a schema to know if a certain value should be treated as a number or a string.

Neither does JSON, as this whole thread indicates. This applies to other data types, too: while a Rivest expression could be

    (date [iso8601]2025-05-24T12:37:21Z)

JSON is stuck with:

    {
      "date": "2025-05-24T12:37:21Z"
    }

> And it's subjective which is more readable.

I really disagree. The whole reason YAML exists is to make JSON more readable. Within limits, the more data one can have in a screenful of text, the better. JSON is so terribly verbose if pretty-printed that it takes up screens and screens of text to represent a small amount of data — and when not pretty-printed, it is close to trying to read a memory trace.

Edit: updated link to the January 2025 proposal.

antonvs a day ago | parent [-]

That Rivest draft defines canonical S-expressions to be the format in which every token is preceded by its length, so it's confusing to use "canonical" to describe the whole proposal, or use it as a synonym for the "advanced" S-expressions that the draft describes.

But that perhaps hints at some reasons that formats like JSON tend to win popularity contests over formats like Rivest's. JSON is a single format for authoring and reading, which doesn't address transport at all. The name is short, pronounceable (vs. "spikky" perhaps?), and clearly refers to one thing - there's no ambiguity about whether you might be talking about a transport encoding instead.

I'm not saying these are good reasons to adopt JSON over SPKI, just that there's a level of ambition in Rivest's proposal which is a poor match for how adoption tends to work in the real world.

There are several mechanisms for JSON transport encoding - including plain old gzip, but also more specific formats like MessagePack. There isn't one single standard for it, but as it turns out that really isn't that important.

Arguably there's a kind of violation of separation of concerns happening in a proposal that tries to define all these things at once: "a canonical form ... two transport representations, and ... an advanced format".

wat10000 a day ago | parent | next [-]

JSON also had the major advantage of having an enormous ecosystem from day 1. It was ugly and kind of insecure, but the fact that every JavaScript implementation could already parse and emit JSON out of the box was a huge boost. It’s hard to beat that even if you have the best format in the world.

antonvs a day ago | parent [-]

Haha yes, that does probably dwarf any other factors.

But still, I think if the original JSON spec had been longer and more comprehensive, along the lines of Rivest's, that could have limited JSON's popularity, or resulted in people just ignoring parts of it and focusing on the parts they found useful.

The original JSON RFC-4627 was about 1/3rd the size of the original Rivest draft (a body of 260 lines vs. 750); it defines a single representation instead of four; and e.g. the section on "Encoding" is just 3 sentences. Here it is, for reference: https://www.ietf.org/rfc/rfc4627.txt

wat10000 a day ago | parent [-]

We already see that a little bit. JSON in theory allows arbitrary decimal numbers, but in practice it’s almost always limited to numbers that are representable as an IEEE-754 double. It used to allow UTF-16 and UTF-32, but in practice only UTF-8 was widely accepted, and that eventually got reflected in the spec.

I’m sure you’re right. If even this simple spec exceeded what people would actually use as a real standard, surely anything beyond that would also be left by the wayside.

kevin_thibedeau a day ago | parent | prev [-]

> clearly refers to one thing

Great, this looks like JSON. Is it JSON5? Does it expect bigint support? Can I use escape chars?

antonvs a day ago | parent [-]

You're providing an example of my point. People don't, in general, care about any of that, so "solving" those "problems" isn't likely to help adoption.

To your specific points:

1. JSON5 didn't exist when JSON adoption occurred, and in any case they're pretty easy to tell apart, because JSON requires keys to be quoted. This is a non-problem. Why do you think it might matter? Not to mention that the existence of some other format that resembles JSON is hardly a reflection on JSON itself, except perhaps as a compliment to its perceived usefulness.

2. Bigint support is not a requirement that most people have. It makes no difference to adoption.

3. Escape character handling is pretty well defined in ECMA 404. Your point is so obscure I don't even know specifically what you might be referring to.
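
Points 1 and 3 can be checked with a small sketch, assuming Python's standard `json` module stands in for a typical strict JSON parser. The sample strings are made up for illustration.

```python
import json

# Standard JSON: keys are quoted strings, so this parses.
quoted = json.loads('{"type": "dns"}')

# JSON5-style unquoted key: a strict JSON parser rejects it.
try:
    json.loads('{type: "dns"}')
    strict_accepts_unquoted = True
except json.JSONDecodeError:
    strict_accepts_unquoted = False

# Escapes such as \" and \uXXXX are decoded by the parser.
decoded = json.loads(r'"Invalid name \"_example.org\" \u00e9"')

print(quoted, strict_accepts_unquoted, decoded)
```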

thayne a day ago | parent [-]

I agree with most of what you said, but json's numbers are problematic. For one thing, many languages have 64-bit integers, which can't always be precisely represented as a double, so serializing such a value can lead to subtle bugs if it is deserialized by a parser that only supports doubles. And deserializing in languages that have multiple numeric types is complicated, since the parser often doesn't have enough context to know what the best numeric type to use is.
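
A minimal sketch of the precision loss, assuming Python's standard `json` module. Python's parser keeps integers exact by default, so the doubles-only parser is simulated here by passing `parse_int=float`. The key name `id` is made up for illustration.

```python
import json

big = 2**63 - 1  # largest signed 64-bit integer
text = json.dumps({"id": big})

# Python's own parser keeps the integer exact.
exact = json.loads(text)["id"]

# Simulate a doubles-only parser by routing integer literals through float.
lossy = json.loads(text, parse_int=float)["id"]

print(exact == big)       # True
print(int(lossy) == big)  # False: the nearest double is 2**63
```

The JSON text is identical in both cases. Only the receiving parser's choice of numeric type differs, which is why the bug is easy to miss.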

dietr1ch 2 days ago | parent | prev [-]

The length thing sounds like an editor problem, but we have wasted too much time in coming up with syntax that pleases personal preferences without admitting we would be better off moving away from text.

xkcd 927 can be avoided, but it's way harder than it seems, which is why we have the proliferation of standards that fail to become universal.

eximius 2 days ago | parent | prev | next [-]

For you, perhaps. For me, the former is denser, but crossing into a "too dense" region. The JSON has indentation which is easy on my poor brain. Also, it's nice to differentiate between lists and objects.

But, I mean, they're basically isomorphic with like 2 things exchanged ({} and [] instead of (); implicit vs explicit keys/types).

josephg a day ago | parent [-]

Yeah. I don’t even blame S-expressions. I think I’ve just been exposed to so much json at this point that my visual system has its own crappy json parser for pretty-printed json.

S expressions may well be better. But I don’t think S expressions are better enough to be able to overcome json’s inertia.

eddythompson80 2 days ago | parent | prev | next [-]

> is far easier to read than this (the first JSON in RFC 8555):

It's not for me. I'd literally take anything over csexps. Like there is nothing that I'd prefer it to. If it's the only format around, then I'll just roll my own.

justinclift a day ago | parent [-]

> Like there is nothing that I'd prefer it to.

May I suggest perl regex's? :)

remram a day ago | parent | prev | next [-]

This doesn't help with numbers at all, though. Any textual representation of numbers is going to have the same problem as JSON.

NooneAtAll3 a day ago | parent | prev | next [-]

> I contend that this is far easier to read than this

oh boi, that's some Lisp-like vs C-like level of holywar you just uncovered there

and wooow my opinion is opposite of yours

michaelcampbell a day ago | parent | prev [-]

> is far easier to read than this

Readability is a function of the reader, not the medium.