kragen 17 hours ago

You can see my workflow with Forth in https://asciinema.org/a/621404, which should help reinforce my point that I'm not an expert.

What I mean is that Forth as a programming language is kind of... not great? Like, it's kind of hard to read and hard to write.

For years I thought this might be just a question of familiarity, but now I'm resigned to the fact that I will probably not learn to read Forth as easily as I can read conventional infix syntax within my natural lifetime. I wrote my first RPN programs on an HP-38E calculator in about 01985, I wrote an RPL program to search my address book on my HP-48GX in the 90s, I wrote a parametric CAD system for laser cutting in PostScript, I wrote a quasi-Forth compiler that compiles itself to machine code, and I think I just have to give up on being able to read

    o ->s dup *  o ->c dup * +  sqrt  400 /  amp !
as easily as

    long amp = sqrt(o->s*o->s + o->c*o->c) / 400.0;
It might still be a problem of familiarity, rather than some kind of objective truth, but it's one I'm going to have to live with. (I'm absolutely sure that the reason I have to sound out Greek words letter by letter instead of reading them instantly the way I do in English, Spanish, French, or Portuguese is 100% a question of familiarity; it's completely implausible that Greek is objectively harder to read. I just have the Latin alphabet wired into my brain by decades of constant practice.)
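
To make the comparison concrete, here is a minimal Python sketch of a toy RPN evaluator run against the same formula. It is not a Forth implementation: the tokens `s` and `c` are hypothetical stand-ins for the field fetches `o ->s` and `o ->c`, the final `amp !` store is left out, and the input values are made up for illustration.

```python
import math

def eval_rpn(source, env):
    """Evaluate a whitespace-separated RPN expression; return the final stack."""
    stack = []
    binary = {
        "+": lambda a, b: a + b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    for token in source.split():
        if token == "dup":
            stack.append(stack[-1])          # duplicate top of stack
        elif token == "sqrt":
            stack.append(math.sqrt(stack.pop()))
        elif token in binary:
            b = stack.pop()                  # top of stack is the right operand
            a = stack.pop()
            stack.append(binary[token](a, b))
        elif token in env:
            stack.append(env[token])         # named value (assumed, not real Forth)
        else:
            stack.append(float(token))       # numeric literal
    return stack

# Hypothetical values standing in for o->s and o->c.
s, c = 300.0, 400.0
rpn_result = eval_rpn("s dup *  c dup * +  sqrt  400 /", {"s": s, "c": c})
infix_result = math.sqrt(s * s + c * c) / 400.0
```

Both forms compute the same number; the difference under discussion is only in how easily a reader recovers the structure from the text.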

There's also a familiarity problem with Forth's vocabulary, which I flatter myself to think is separate; for example, within takes its arguments in the order x min max, and because I've programmed much less in Forth than in other languages, I always have to look things like that up, whereas I know the order of arguments to read() or strcat() without having to do so.
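
As a rough illustration of why the argument order matters, here is a Python sketch of within operating on a list used as a stack. It assumes the simple case where min is less than max and treats the range as half-open (min included, max excluded); it returns a Python bool rather than a Forth flag.

```python
def within(stack):
    """Toy version of Forth's WITHIN ( x min max -- flag ), for min < max only."""
    hi = stack.pop()   # top of stack: max
    lo = stack.pop()   # next down: min
    x = stack.pop()    # deepest of the three: the value being tested
    stack.append(lo <= x < hi)

right_order = [5, 0, 10]   # like typing: 5 0 10 within
within(right_order)

wrong_order = [0, 5, 10]   # like typing: 0 5 10 within (tests 0 against 5..10)
within(wrong_order)
```

Nothing in the text `0 5 10 within` signals that the operands are in the wrong order; the result is simply a different flag.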

It's not a huge difference from C; Forth has more metaprogramming and reflection power than C, but the syntax is less readable, and it's more error-prone in a variety of ways (parameter passing, recursion, types). Presumably infallible programmers would prefer Forth to C, since those weaknesses would not affect them, and they'd have less code to write. I am far from an infallible programmer.

But a lot of those weaknesses are because Forth is designed as a single language for the whole system: assembler, high-level programming, editor commands, debugger, "shell" commands, the whole works. So you have things like ? which is simply defined as : ? @ . ; and seems kind of goofy from a programming-language perspective—why would you dedicate a precious single-character word to printing out the value of a memory location? How often do you want to do that in the middle of your program? Why not just write @ . instead? Wouldn't ? be more valuable in a switch/case statement or something?

However, in the context where Forth grew up, the sibling of DTSS BASIC and DEBUG.COM and DDT, it makes perfect sense; if you've just tested a word (subroutine) that is supposed to change the value of a variable x, you want to be able to say x ? rather than typing out the whole x @ . phrase. It sounds trivial, but it's actually really important, especially if you can't touch-type, as most programmers couldn't at the time. BASIC did the same thing for the same reason: instead of typing print x or even printx you could type ?x to see the value of x.
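
The definition `: ? @ . ;` can be mimicked in a few lines of Python. This is only a sketch of the idea: the dict-based memory, the `output` list standing in for the terminal, and the address `X` are all assumptions made for illustration.

```python
memory = {}   # address -> stored value
stack = []    # data stack
output = []   # what "." would have printed, collected as strings

def fetch():
    """Forth @ ( addr -- value ): replace an address with its contents."""
    stack.append(memory[stack.pop()])

def dot():
    """Forth . ( value -- ): pop the top of the stack and print it."""
    output.append(str(stack.pop()))

def question():
    """Forth ? defined as  : ? @ . ;"""
    fetch()
    dot()

X = 0                 # hypothetical address of the variable x
memory[X] = 42

stack.append(X)
question()            # x ?

stack.append(X)
fetch()
dot()                 # x @ .
```

The two sequences leave identical output and an empty stack; the shorter one just saves keystrokes at the prompt.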

Similarly, the lack of syntax is important for things like editor interaction, or interactively poking at hardware registers, or whatever. As Yosef Kreinin wrote in https://yosefk.com/blog/i-cant-believe-im-praising-tcl.html:

> The small overhead [of extra punctuation] is tolerable, though sucky, when you program, because you write the piece of code once and while you're doing it, you're concentrating on the task and its specifics, like the language syntax. When you're interacting with a command shell though, it's a big deal. You're not writing a program – you're looking at files, or solving equations, or single-stepping a processor. I have a bug, I'm frigging anxious, I gotta GO GO GO as fast as I can to find out what it is already, and you think now is the time to type parens, commas and quotation marks?! Fuck you! By which I mean to say, short code is important, short commands are a must.

So, Forth is designed so that you can use it as a command language and a high-level programming language and an assembly language. It's like Robert A. Heinlein's ideal unspecialized Renaissance-man language: it can change a diaper, plan an invasion, butcher a hog, program a computer, etc. This (necessarily in my view) involves some compromises—the best possible result will often be worse as a high-level programming language than a language that's designed for just that, and worse as a command language than a language that's designed for just that, and maybe worse as an assembly language too.

You can make a convincing argument for the general case of this with a 2×2 matrix of candidate language design features:

    ╭────────────┬──────────────────┬──────────────────╮
    │            │    good for      │     bad for      │
    │            │ command language │ command language │
    ├────────────┼──────────────────┼──────────────────┤
    │  good for  │                  │                  │
    │ high-level │        0         │        1         │
    │  language  │                  │                  │
    ├────────────┼──────────────────┼──────────────────┤
    │  bad for   │                  │                  │
    │ high-level │        2         │        3         │
    │  language  │                  │                  │
    ╰────────────┴──────────────────┴──────────────────╯
The argument is simply that the set of candidate language design features that go in boxes 1 and 2 is not exactly the empty set. It would be an astounding coincidence if it were, wouldn't it? And every time you add a feature from box 1 to your language, you make it better as a programming language and worse as a command language, and vice versa for box 2. Omitting a feature from box 1 makes your language better as a command language and worse as a programming language, and vice versa for box 2.

The more difficult argument to make is that the compromises are substantial. A skeptic might wonder whether the only compromises are trivial things like the ? I mentioned above. I think it's an argument Yossi has made well in the post I linked above, which has nothing specifically to do with Forth. Also, though, I think that a lot of Forth's design decisions that are unorthodox for programming languages, such as its lack of typing, its lack of syntax, and its lack of stack frames with local variables, are easily understood as accommodations for interactive use, and I think that they do in fact make it substantially worse as a programming language. This is highly debatable, and debated, but it is my current point of view.

In my view, the REPL somewhat makes up for Forth's weaknesses as a programming language in two ways: first, by allowing you to interactively test your code as you write it, and second, by freeing you from having to write user interface code that does things like parse command lines.

There are a few different ways that Forth encourages writing your code as a ravioli-code soup of tiny one-line definitions. Single-line definitions are easier to test interactively, and statically allocating your local variables allows you to share them between multiple definitions, which reduces the required parameter passing (the abstraction penalty). Implicit parameter passing also reduces the syntactic abstraction penalty of subroutine calls. And, barring inlining compiler optimizations, Forth is faster at calling subroutines than any other language (arguably except for other Forth-like things like FOCAL), reducing the abstraction penalty at runtime as well.

This is both good and bad. Ravioli code is more flexible, because you can call existing definitions in new contexts, but harder to understand, because the definition you're editing might be called from a context you aren't seeing. If you were infallible, this greater composability would enable you to bootstrap from nothing to whatever application you wanted to build with less total code. This makes Forth's drawbacks less serious for throwaway code (which doesn't need to be understood or maintained) and for infallible programmers.

Independent of any of this, the REPL is a huge advantage if you're exploring an unknown hardware platform that might be buggy. You probably need one, whether Forth or something else.

So, that's why I think the REPL is very important for understanding Forth, both its virtues and its vices. It is valuable, but it also makes UX demands on other parts of the language which make them worse in other ways.

kragen 8 hours ago

I do have one objective thing to say about readability. In a pop infix language like C, Python, Lua, or JS, the expression

    e(d(), c(b, a()))
has fairly clear dataflow: data flows from a and b to c, and from c and d to e. This is knowable even without any previous knowledge of those five identifiers. The RPN version, in languages like Forth, PostScript, and Factor

    a b c d e
can just as well correspond to any of these dataflow patterns:

    a(); b(); c(); d(); e();  //none
    b(a); d(c); e();
    e(d(c, b()), a);
    {a, b, c, d, e} // all going somewhere else together
And many others. You don't know if a or b is consuming something left on the stack from before, either.

On this basis I think it's at least somewhat defensible to claim that stack languages are "less readable": information about the dataflow graph which is easily available in the infix syntax is not present, at least locally. You can reconstruct it by knowing, or guessing, the stack effect of each word. But that's different from just having it plainly written down.
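
Reconstructing the dataflow from stack effects can be sketched in Python as follows. The function name `dataflow` and the effect tables are hypothetical; each word is reduced to a pair (inputs consumed, outputs produced), which ignores everything else a real Forth word might do.

```python
def dataflow(words, effects):
    """Return (edges, leftover) for a sequence of words.

    effects maps each word to (inputs consumed, outputs produced).
    edges is a list of (producer, consumer) pairs; the producer "?"
    marks a value that was already on the stack before the sequence ran.
    leftover lists the producers of values still on the stack at the end.
    """
    stack = []   # each entry names the word that produced that value
    edges = []
    for word in words.split():
        n_in, n_out = effects[word]
        args = []
        for _ in range(n_in):
            args.append(stack.pop() if stack else "?")
        for producer in reversed(args):   # restore left-to-right order
            edges.append((producer, word))
        stack.extend([word] * n_out)
    return edges, stack

program = "a b c d e"

# Same text, three different assumed sets of stack effects.
nested = {"a": (0, 1), "b": (0, 1), "c": (2, 1), "d": (0, 1), "e": (2, 1)}
no_flow = {w: (0, 0) for w in program.split()}
all_push = {w: (0, 1) for w in program.split()}

nested_edges, nested_left = dataflow(program, nested)
no_edges, no_left = dataflow(program, no_flow)
push_edges, push_left = dataflow(program, all_push)
```

With the `nested` table the edges run from a and b to c and from c and d to e; with the other two tables the same five words produce no edges at all. The graph comes entirely from the effect table, not from the program text.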

As a result, in Forth and PostScript, I regularly have bugs where I pass a parameter to, or receive a result from, the wrong place. This is not a major practical problem (it's usually pretty easy to figure out in the REPL) but it serves as evidence that stack languages really do require more effort to read and understand than pop infix languages.

Of course, you can make almost exactly the same argument that explicit typing helps readability, and implicit variable capture by closures hurts it. I think there's some merit in that, actually.