| ▲ | kragen 3 days ago |
| Forth is very enjoyable, and it's always exciting to see someone new discovering it, but it has three big problems. The first is a technical problem: the forte of Forth is self-hosted developer tooling in restricted environments: say, under 256KiB of RAM, no SSD, under 1 MIPS, under 10 megabytes of hard disk or maybe just a floppy. In that kind of environment, you can't really afford to duplicate mechanism very much, and programmers have to adapt themselves to it. So you end up using the same mechanism for fairly disparate purposes, with the attendant compromises. But the results were amazing: on an 8080 with 64KiB of RAM and CP/M you could run F83, which gave you virtual memory, multithreading, a somewhat clumsy WYSIWYG screen editor, a compiler for a language with recursion and structured control flow, an assembler, and a CLI and scripting language for your application. Those environments almost don't exist today. But if you're programming, say, an MSP430 (consider as paradigmatic https://www.digikey.com/en/products/detail/texas-instruments...), you have only 2KiB of RAM, and you could use Mecrisp-Stellaris https://mecrisp.sourceforge.net/ That chip's resources are pretty limited. In a money economy, we measure resources in money; the reason to use a chip with limited resources is to avoid spending money, or to spend less money. That chip costs US$7.40. For US$5.59 you could instead get https://www.digikey.com/en/products/detail/stmicroelectronic...: 100 megahertz, 512KiB of flash, 256KiB of RAM, 50 GPIOs, CAN bus, LINbus, SD/MMC, and so on. And according to Table 33 of https://www.st.com/content/ccc/resource/technical/document/d... it typically uses 1.8μA in standby mode at 25°C at 1.7V. That's more than the MSP430's headline 0.1μA from https://www.ti.com/lit/ds/symlink/msp430f248.pdf but it's still low enough for many purposes. 
(A 220mAh CR2032 coin cell could theoretically supply 1.8μA for 13 years, but only has a shelf life of about 10 years, so the STM32 uses less than the battery's self-discharge current.) That is to say, the niche for such small computers is small and rapidly shrinking. Also, while the microcontroller might have only 2KiB of RAM, the keyboard and screen you use to program it are almost certainly connected to a computer with a million times more RAM and a CPU that runs a thousand times faster. So you could just program it in C or C++ or Rust and run your slow and bloated compiler on the faster computer, which will generate more efficient code for the microcontroller. The cases where you have to build the code on the target device itself are few and far between. Forth was designed to make easy things easy and hard things possible. The second problem is a social one: as a result of the first problem, the people who used Forth for that have mostly fled to greener pastures. The Forth community today consists mostly of Forth beginners who are looking for an artificial challenge: instead of making hard things possible, they want to make easy things hard. There are a few oldtimers left who keep using Forth because they've been using it since it did make hard things possible. But even those oldtimers are a different population from Forth's user base in its heyday, most of whom switched to C or VHDL. Most of us have never written a real application in Forth, and we've never had the religious-conversion experience where Forth made it possible to write something we couldn't have written without Forth. The third problem is also a social one: as a result of the second problem, most Forth tutorials today are written by people who don't really know Forth. I've only briefly skimmed this tutorial, but it seems to be an example of this. For example, I see that it doesn't explain immediate words, much less when to not use immediate words. 
(If it's ever easier to write something in Forth than in C, it's probably because you can define immediate words, thus extending the language into a DSL for your application in ways that are out of reach of the C preprocessor.) And it doesn't talk about string handling at all, not even the word type, even though string handling is one of the things that Forth beginners stumble over most when they start using Forth (because it doesn't inherently have a heap). So, I hope the author continues to learn Forth, and I hope they extend their tutorial to cover more aspects of it. |
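To make the point about immediate words concrete, here is a minimal sketch in standard Forth. The names unless and check are made up for this illustration and do not come from the tutorial or from any library. unless executes while check is being compiled and appends 0= followed by the compile-time behaviour of if, so it acts like a new control-flow keyword. The last line shows type printing an address/length string.

```forth
\ UNLESS is immediate: it runs at compile time, inside the definition of CHECK.
\ (Hypothetical name, for illustration only.)
: unless ( C: -- orig )  postpone 0=  postpone if ; immediate

: check ( n -- )  0> unless ." not positive" cr then ;

-5 check                 \ prints: not positive
 5 check                 \ prints nothing

s" hello, world" type cr \ TYPE prints the string given as ( c-addr u )
```

The same effect is hard to get from the C preprocessor because unless has access to the compiler's own control-flow stack, not just to text substitution.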
|
| ▲ | mpweiher a day ago | parent | next [-] |
| Forth has always intrigued me as one of those languages (APL and Mumps also come to mind) that appears to have a superpower, for example expressing somewhat complex systems compactly, while at the same time also being flawed enough so that this superpower only appears to be applicable to a small niche. Given the somewhat sorry state of (lack of) expressiveness and accompanying bloat in programming in general, it would be really interesting to see if that is inevitable, so if the superpower is in fact also the flaw, or if it's possible to extract the superpower from the flaw. The way you express Forth's superpower is one I haven't seen so far and seems to point a possible way: > So you end up using the same mechanism for fairly disparate purposes, with the attendant compromises. Can you tell more about those mechanisms that are used for disparate purposes? > If it's ever easier to write something in Forth than in C, it's probably because you can define immediate words, thus extending the language into a DSL for your application in ways that are out of reach of the C preprocessor. So compile-time metaprogramming is not just available as an add-on, but very much "how things are done"? https://www.forth.com/starting-forth/11-forth-compiler-defin... And having a bit of compile-time metaprogramming also be the compiler is enabled by effectively not having syntax? |
| |
| ▲ | kragen 16 hours ago | parent [-] | | I agree about Forth being a fatally flawed language with superpowers, although I think we could easily have ended up in a world where Forth played the role of C, which has its own fatal flaws. Yes, compile-time metaprogramming is very much "how things are done". This is simplified by not having syntax, but I don't think they're inseparable; you could imagine building up a compiler in the same way from an almost-as-minimal base using something like https://jevko.org/, S-expressions, a Prolog-like extensible infix parser, or a Smalltalk-like non-extensible infix parser with an open set of operators. I think most of these would be improvements. PostScript has an only slightly more elaborate syntax than Forth, but uses Smalltalk-style lightweight lambdas (called "quotations" in several other stack languages) to provide control-flow operators through runtime metaprogramming instead of compile-time metaprogramming. As for "mechanisms used for disparate purposes", for example, the outer (text) interpreter in typical Forths plays the role of the Unix shell, the C-level systems programming language, the assembler syntax, and the user interface to applications such as, traditionally, the interactive text editor. And in https://news.ycombinator.com/item?id=45340399 drivers99 reports using it to parse an input file. The Forth language is not a very good shell command language, not a very good high-level programming language, and not a very good text editor user interface language, but it's adequate for all of these purposes. The dictionary, similarly, serves to hold definitions for all those purposes. But it also allocates memory in a region-allocator-like way—a byte at a time, if need be. You can use the same words like , to store data into the dictionary directly, in interpretation state: create myarray 3 , 4 , x ,
Or in a constructor: : throuple create , , , ; 3 4 x throuple myarray
In traditional Forths like F83, , is also the mechanism for adding an xt to a colon definition, but in ANS Forth compile, was added as a possible synonym which would also permit writing Forth code that was portable to non-threaded-code implementations. https://forth-standard.org/standard/core/COMPILEComma The operand stack serves to pass arguments and return return values, as well as to hold temporaries, but you can often use it to store a local variable as well, and space on it is dynamically allocated, so it's possible to use it to pass or return variable-sized arrays by value. At compile time, it's used to keep track of the nesting of control-flow structures. The return stack serves to store return addresses, but also to store loop counters or maybe another local variable. And return-stack manipulation provides you with a relatively flexible form of runtime metaprogramming for things like stackless coroutines, shallow-bound dynamic scoping, and exception handling. Here's an implementation of dynamic scoping (which cannot be used inside a do loop or when you have other stuff on the return stack):
0 value old 0 value where
: co 2r> >r >r ;
: let! dup to where where @ to old ! co old where ! ;
Example usage: decimal : dec. 10 base let! . ;
This temporarily sets base to 10 before calling ., but then restores base to whatever value it had before upon return. A better implementation that uses the return stack instead of old and where to save and restore the values is : (let!) dup @ over swap 2r> rot >r rot >r >r >r ! ; : let! (let!) 2r> ! ;
(This is probably not very understandable, but I've written an 1800-word explanation of it elsewhere which you can read if you like.) Pointer arithmetic and integer arithmetic are the same operation, as they are in most untyped languages. This is different from C, where they are done with the same operators which are implemented differently for integers and for different types of pointers. The "filesystem" in traditional Forths simply exposes the disk as an array of 1024-byte blocks which could be mapped into memory on demand. Conventionally you would divide your code into 1024-byte screenfuls, each space-padded out to 64-character lines, 16 of them. In effect, each screen was a different "file", identified by number rather than name. It's reasonable to argue that this is not a very good filesystem, and not a very good format for text files, but to implement any filesystem on top of a disk or SSD, you need a layer that more or less provides that functionality; all that's required to make it usable for code blocks is to use 1024-byte blocks instead of 128-byte or 512-byte or whatever. Multitasking in traditional Forths is cooperative. In some sense this eliminates the need for locking; for example, to ensure that the block buffer you've mapped your desired block into doesn't get remapped by a different task before you're done using it, you simply avoid calling anything that could yield. Unfortunately, Forth doesn't have colored functions, so there's no static verification that you didn't call anything that calls something that yields. Cooperative multitasking is sort of not very good multitasking (since an infinite loop in any task hangs the system) and not very good locking, but it does serve both purposes well enough to be usable. 
Scheme is sort of like this too; famously, Scheme's lambda (roughly Forth's create does>) is semantically an OO object, a statement sequencing primitive, a lazy-evaluation primitive, etc., while S-expressions are a similar syntactic cure-all, and call/cc gives you multithreading, exception handling, backtracking, etc. See https://research.scheme.org/lambda-papers/. In practice a small Lisp is about the same amount of code as a small Forth. BTW, I still have a paper of yours in my queue to read! |
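Two of the mechanisms described above can be shown in a few lines. This is a small sketch in standard Forth, with illustrative names (primes, adder, add5) that are not from the thread: first the dictionary used as a cell-at-a-time allocator via create and , and then create does> used as a closure-like constructor, which is the rough analogue of Scheme's lambda mentioned above.

```forth
\ The dictionary as an allocator: CREATE names the next free address,
\ and , appends one cell at a time.
create primes  2 , 3 , 5 , 7 ,
primes 2 cells + @ .     \ prints: 5

\ CREATE ... DOES> as a closure-like constructor: each word made by ADDER
\ carries its own stored n and adds it to the number on the stack.
: adder ( n "name" -- )  create ,  does> ( m addr -- m+n )  @ + ;
5 adder add5
10 add5 .                \ prints: 15
```

Both uses go through the same , word and the same dictionary pointer, which is the "one mechanism, several purposes" pattern in miniature.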
|
|
| ▲ | drob518 3 days ago | parent | prev | next [-] |
| Well said. I love Forth and I think it’s worth learning, but almost nobody programs workstation-level applications with it, and as you say, even in embedded environments the level of resources has grown such that there’s very little reason to choose Forth anymore. Which makes me a bit sad because Forth is brilliant. |
| |
|
| ▲ | zelphirkalt 3 days ago | parent | prev | next [-] |
| Reading lines from a file and handling the strings in memory is what made me stop using it after the 3rd day of Advent of Code one year. I simply couldn't find a good solution without a massive excursion into how to use the pad. Reading a complete line from a file is supposedly a simple thing, yet it stopped me completely. Of course I could have "cheated" and put the input right into the program, but I wanted to learn Forth, so I thought I should be able to do this ... Later I read that Gforth 1.0 should have more string handling words, but by then I had already lost hope of finding an easy solution. Don't get me wrong, learning the little bit of Forth that I did learn was quite interesting, and I would have liked to progress more. I think I also lost hope because I couldn't see how this stack system would ever be able to handle multi-core and persistent data structures, things that I have come to use in other niche languages. Also, some projects/libraries are one-man shows (bus factor 1), and the maintainers have stopped developing them. They are basically stale, and made by people with significantly more understanding than any beginner will have for a long time. I guess to really learn it, one has to read one of the often recommended books and have a lot of patience until one gets to the parts where one learns simple things like reading a file line by line. |
| |
| ▲ | alexisread 3 days ago | parent | next [-] | | You should be able to dive in quickly using the very nice forthkit, which finishes with a working shell / REPL: https://github.com/tehologist/forthkit It is an implementation of eforth, a portable forth: http://www.exemark.com/FORTH/eForthOverviewv5.pdf | |
| ▲ | kragen 3 days ago | parent | prev | next [-] | | I think mostly learning Forth is like learning any other programming language (or, better said, programming environment): you learn by doing it. Books can be a useful complement to practice, but practice is how you learn to do things. You can't learn to do things by reading. As for string handling, in my limited experience, string handling in Forth is a lot like string handling in C; you have to allocate buffers and copy characters between them. memcpy is called move, and memset is called fill. You can use the pad if you want, but you can just as well create inbuf 128 allot and use inbuf. There are two big differences: 1. Forth doesn't have NUL-terminated strings like C does, because it's just as easy to return a pointer and a length from a subroutine as it would be to return just a pointer. This is generally a big win, preventing a lot of subtle and dangerous bugs. (Forth is generally more error-prone than C, but this is an exception.) 2. Forth unfortunately does have something called a "counted string", where the string length is stored in the byte before the string data. You can create them with C" (https://forth-standard.org/standard/core/Cq), and Forth beginners often wonder whether to use counted strings. The answer is no: you should never use counted strings, and they should not have been included in the standard. Use normal strings, created with S" (https://forth-standard.org/standard/core/Sq), unless you are calling word or find. https://forth-standard.org/standard/rationale#rat:cstring goes into some of the history of this. 
If you want to allocate strings on the heap, which is often the simplest way to handle strings, malloc is called allocate, realloc is called resize, and free is called free: https://forth-standard.org/standard/memory With respect to multicore and persistent data structures (I assume you mean FP-persistent, as in, an old pointer to a data structure is a pointer to the old version of the data structure), stacks aren't really related to them. Each Forth thread has its own operand stack and its own return stack (and sometimes its own dictionary), so they don't really create interactions between different cores. | | |
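As a rough illustration of the buffer-and-copy style described above, here is a minimal sketch in standard Forth. The helper names /inbuf, >inbuf, and heap-copy are invented for this example; only move, allocate, free, type, and S" are standard words. Strings are passed as an address/length pair throughout, with no NUL terminator and no count byte.

```forth
128 constant /inbuf
create inbuf /inbuf allot      \ fixed buffer in the dictionary

\ Copy a string into inbuf, truncating to the buffer size (a bounded memcpy).
: >inbuf ( c-addr u -- inbuf u' )
  /inbuf min  >r  inbuf r@ move  inbuf r> ;

\ Copy a string into freshly allocated heap memory (malloc + memcpy).
: heap-copy ( c-addr u -- c-addr' u )
  dup allocate throw  swap  2dup 2>r  move  2r> ;

s" hello" >inbuf type cr            \ prints: hello
s" world" heap-copy  2dup type cr   \ prints: world
drop free throw                     \ release the heap copy
```

The truncation in >inbuf is a deliberate choice for the sketch; a real program would decide for itself what to do when the input does not fit.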
| ▲ | zelphirkalt 3 days ago | parent [-] | | I think there is another problem for me: the last time I had done any manual memory management à la C before using Forth was more than 10 years earlier. And immediately the next question would pop up in my head: "What if that line is longer than 128 bytes? Is there no general function to read a whole line?" And I guess then I would reinvent the whole machinery to read a whole line, determining at which byte the newline appears. And then I would have doubts like: "Uh, but what if someone puts some unicode characters in there?". While actually all I wanted was to read a single file, to get working on an AoC puzzle. So I think I lacked the manual memory management basics as well at that point, and any haphazardly implemented hack like "assume the longest line is at most 128 ASCII characters long" would not have made me happy with my code. | | |
| ▲ | kragen 3 days ago | parent [-] | | Well, to bake an apple pie from scratch, you must first create the universe. In any programming language, to read an arbitrarily long line into memory, you need an arbitrarily large computer, so your software may need to pause to convert more Temu orders, continents, asteroids, or star systems into computronium. If you're not willing to go that far, you have basically two choices: 1. Process the line in a streaming fashion rather than holding all of it in memory at once. 2. Only handle lines up to some maximum length. If you select option 2, the only remaining questions are: 2a. What is that maximum length? 2b. What happens if you hit it? Maybe 128 bytes is not a limit you're happy with, but it's just as easy to use 1048576 or 1234567890. Your code may be easier to understand and easier to get right if you use a dynamically-allocated string type (I suggest studying stralloc from qmail 1.03), but don't fool yourself into thinking that that means there's no limit on input line length. Dismayingly often, the answer to 2b in that case is "Linux starts thrashing and becomes unusably slow until you reboot it." (If your input is UTF-8, the line-reading function doesn't have to worry about whether the bytes represent Unicode characters or not, because byte 0x0a will never occur inside a non-ASCII character.) | | |
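Option 1 (streaming) can be sketched in standard Forth as follows. The example reads a file in fixed-size chunks and tracks the length of the longest line without ever holding a whole line in memory. All names here (chunk, cbuf, cur, longest, scan, longest-line) and the file name input.txt are assumptions made for this illustration. Lengths are counted in bytes, which is safe for UTF-8 input for the reason given above: byte 0x0a (decimal 10) never appears inside a multi-byte character.

```forth
4096 constant chunk
create cbuf chunk allot
variable cur       \ length of the line currently being scanned
variable longest   \ longest complete line seen so far

\ Examine the first u bytes of cbuf, updating cur and longest.
: scan ( u -- )
  0 ?do
    cbuf i + c@ 10 = if
      cur @ longest @ max longest !  0 cur !
    else
      1 cur +!
    then
  loop ;

: longest-line ( c-addr u -- n )   \ c-addr u is the file name
  0 cur !  0 longest !
  r/o open-file throw >r
  begin  cbuf chunk r@ read-file throw  dup while  scan  repeat  drop
  r> close-file throw
  cur @ longest @ max ;            \ account for a final line with no newline

s" input.txt" longest-line .
```

Memory use stays at one chunk regardless of how long any individual line is.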
| ▲ | zelphirkalt 3 days ago | parent [-] | | The point is, I don't want to spend lots of time solving these essential problems, when I actually want to learn the language through solving puzzles. It seems, that Forth does not lend itself to be learned that way, since even very basic things are not provided and require in-depth knowledge of Forth and developing manual memory managed solutions to problems, that are solved in almost every programming language in their standard libraries. If I used Python it would literally be 2 lines of code, and with file.readlines() or so, I don't have to think about how long a line can be and then develop ad-hoc brittle half-solutions. Perhaps readlines() has a limit somewhere too though. Just not aware of it and so far have not needed to deal with that kind of thing. But then again Forth and Python are 2 very different languages and act on another level of abstraction in many cases, so maybe that comparison is not fair. | | |
| ▲ | kragen 2 days ago | parent [-] | | Forth was sort of designed by and for people who did want to solve these essential problems anew for each application. Chuck Moore claimed many times that a tailored ("ad hoc") solution that solves only the part of the problem you need to solve for a particular application would be 10× smaller and simpler than a generalized solution that has to balance the needs of all possible applications. He considered it preferable to not have a lot of library code in your application to solve problems you don't actually have. Maybe your ad-hoc solution is brittle, but it's brittle precisely in ways you know about, not in ways you don't. But you don't have to use Forth that way just because Chuck did. You can totally use a generalized string library in Forth. I don't know which one to recommend, but http://turboforth.net/resources/string_library.html seems to be one possibility. You can be sure that Python's file.readlines()† will have trouble if you try to read a line that is much longer than your RAM size. You can get pretty far with just built-in standard functionality, though:
Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
128 constant len create buf len allot ok
: greet ." Name? " buf len accept ." Hello, " buf swap type ." !" ; ok
greet Name? Zelphir Hello, Zelphir! ok
And, as you said, GForth comes with a heap-allocated string library https://gforth.org/manual/String-words.html#String-words which you can use if you first say include string.fs
______† ever since Python 2.0, I'd recommend using list(file) instead of file.readlines(), or just iterate over the file directly, like [line.strip() for line in file if line.startswith('zel')] |
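For the original question in this subthread, reading a file line by line, the standard word read-line is enough. Below is a minimal sketch under the "fixed maximum length" approach; the names max-line, line-buf, and print-lines and the file name input.txt are hypothetical. read-line returns a length and a flag, and the flag goes false at end of file. If a line is longer than max-line, the standard says the rest of it is delivered by the following calls, so this sketch would print such a line in several pieces.

```forth
256 constant max-line
create line-buf max-line 2 + allot   \ READ-LINE needs 2 spare bytes for the line terminator

: print-lines ( c-addr u -- )        \ c-addr u is the file name
  r/o open-file throw >r
  begin
    line-buf max-line r@ read-line throw   ( u2 more? )
  while
    line-buf swap type cr                  \ the line is line-buf u2, terminator excluded
  repeat
  drop                                     \ discard the length from the final call
  r> close-file throw ;

s" input.txt" print-lines
```

Replacing the type cr inside the loop with a call to your own word gives a per-line processing loop of about the same size as the Python version.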
|
|
|
| |
| ▲ | drivers99 2 days ago | parent | prev [-] | | One year (2022) I could see, on an early problem (day 2), that I could define a handful of words in Forth such that I could execute the (modified) input file itself as code to get the answer. There were only 9 possible combinations since it was rock-paper-scissors, although I did have to alter the input by removing the spaces first, so that "A X" became "AX". I defined words that match the 9 inputs and had those do whatever the problem said to do. https://adventofcode.com/2022/day/2 | | |
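A sketch of what that approach might look like, not drivers99's actual code: each of the nine tokens becomes a Forth word that adds its score to a running total, assuming part one's scoring as I understand it (1, 2, or 3 for the shape you play, plus 0, 3, or 6 for a loss, draw, or win). The variable name score is invented here, and the three-round input is the puzzle's published example with the spaces removed.

```forth
variable score

\ First letter: opponent (A rock, B paper, C scissors).
\ Second letter: your move (X rock, Y paper, Z scissors).
: AX 4 score +! ;  : AY 8 score +! ;  : AZ 3 score +! ;
: BX 1 score +! ;  : BY 5 score +! ;  : BZ 9 score +! ;
: CX 7 score +! ;  : CY 2 score +! ;  : CZ 6 score +! ;

0 score !
AY BX CZ          \ in practice, include the edited input file here instead
score @ .         \ prints: 15
```

The outer interpreter does all the parsing: whitespace-separated tokens are looked up in the dictionary and executed, so no file-reading code is needed at all.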
|
|
| ▲ | jll29 2 days ago | parent | prev | next [-] |
| > Most of us have never written a real application in Forth, and we've never had the religious-conversion experience where Forth made it possible to write something we couldn't have written without Forth. Perhaps someone will upload some Forth source code for a few larger systems, e.g. "Fmacs", an Emacs-like editor written mostly in Forth, with Forth instead of ELISP being the embedded language. Then it would be interesting to compare speed and readability (important today and every day) as well as memory requirements in RAM and on disk etc. (not so important anymore, used to be very important in the past). I had a look at the little Forth-based operating system's source code and of course couldn't comprehend much, which is unsurprising, because looking at the code alone doesn't tell you much; you need to imagine what's going on with the stack. |
|
| ▲ | eschneider 2 days ago | parent | prev | next [-] |
| Very much this. Even when programming for constrained environments, it's almost never necessary to self-host anymore. It's easy to use host-side tools to crunch code down to something that'll work on whatever the target is. From a practical standpoint, one of the few modern uses where FORTH shines is as a REPL for new chips/SOCs so you can play around with the hardware and see how things actually work/debug the databook. |
| |
| ▲ | kragen 2 days ago | parent [-] | | Have you been using it for that? Which Forths and which chips have you been using? |
|
|
| ▲ | 9fanatic 3 days ago | parent | prev [-] |
| The description of F83 sounds interesting - any way I can see it in action, or use it on my own? |
| |
| ▲ | kragen 3 days ago | parent [-] | | Sure, I git cloned my copy from https://github.com/ForthHub/F83, and it runs fine under DOSBox. If you have Git and DOSBox installed, I think you can just type
git clone https://github.com/ForthHub/F83
cd F83
dosbox .
f83
: fish 0 do i . ." fish" cr loop ; 7 fish
| | |
|