kragen 3 days ago

Most languages don't have an explicit stack, and even their implicit stack is only for subroutine calls. If you're not making subroutine calls, your compiled code might not access the stack at all. So, for example, here's the strlcpy function from OpenBSD, lightly edited:

    size_t strlcpy (char *dst, const char *src, size_t siz) {
            register char *d = dst;
            register const char *s = src;
            register size_t n = siz;

            if (n != 0 && --n != 0) {
                    do { if ((*d++ = *s++) == 0) break; } while (--n != 0);
            }

            if (n == 0) {
                    if (siz != 0) *d = '\0';
                    while (*s++)
                            ;
            }

            return(s - src - 1);
    }
GCC 12.2.0 compiles this to the following 18 ARM instructions, with -mcpu=cortex-a53 -Os -S:

            .text
            .align 2
            .global strlcpy
            .syntax unified
            .arm
            .type strlcpy, %function
    strlcpy:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            mov r3, r1
            cmp r2, #0
            beq .L6
    .L14:
            subs r2, r2, #1
            beq .L3
            ldrb ip, [r3], #1 @ zero_extendqisi2
            strb ip, [r0], #1
            cmp ip, #0
            bne .L14
    .L4:
            sub r0, r3, r1
            sub r0, r0, #1
            bx lr
    .L3:
            mov r2, #0
            strb r2, [r0]
    .L6:
            ldrb r2, [r3], #1 @ zero_extendqisi2
            cmp r2, #0
            bne .L6
            b .L4
            .size strlcpy, .-strlcpy
If you're not familiar with ARM assembly, I'll tell you that nothing in this entire function uses the stack at all. That's possible for two reasons. First, strlcpy doesn't call any other functions (it's a so-called "leaf subroutine", also known as a "leaf function"). Second, ARM, like most RISCs, puts the subroutine return address in a register (lr) rather than on the stack, as amd64 does, or in the called subroutine, as the PDP-8 does (the PDP-8 doesn't have a stack at all). The calling convention puts arguments and return values in registers as well. So the function can just move data around between memory and registers, decrement its loop counter, and increment its pointers without ever touching the stack.
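To make the leaf/non-leaf distinction concrete, here's a rough sketch in C. The names `count_chars` and `total_chars` are made up for illustration, and what the compiler actually emits depends on the compiler version and flags, so treat the comments as the usual case rather than a guarantee.

```c
#include <stddef.h>

/* Leaf function: it calls nothing, so on ARM its return address can
   stay in lr from entry to exit, just as in strlcpy above. */
size_t count_chars(const char *s)
{
        size_t n = 0;

        while (*s++)
                n++;
        return n;
}

/* Non-leaf function: each call to count_chars overwrites lr, and one
   argument has to survive the other call.  Unless the compiler inlines
   the calls, it needs somewhere to save those values, normally the
   stack. */
size_t total_chars(const char *a, const char *b)
{
        return count_chars(a) + count_chars(b);
}
```

Compiling something like this with -S and looking for a push or a store through sp in total_chars, but not in count_chars, is an easy way to see the difference for yourself.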

FORTRAN up to and including FORTRAN 77 didn't support recursion, not even indirect recursion, so the language could be implemented without a stack at all.
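Here's a small C sketch of why the no-recursion rule and the no-stack implementation go together. It's an illustration under assumptions, not how any particular FORTRAN compiler worked: the `static` variables stand in for the fixed storage a stackless implementation would give each subroutine, and the function names are hypothetical. (C itself still keeps the return addresses on its own stack here; only the data is modeled.)

```c
/* Every "local" below lives in one fixed static slot per subroutine. */

/* Fine without a stack: only one activation exists at a time. */
unsigned long fact_loop(unsigned long n_arg)
{
        static unsigned long n, acc;

        n = n_arg;
        for (acc = 1; n > 1; n--)
                acc *= n;
        return acc;
}

/* Broken: the nested activation overwrites n before the outer
   activation has finished using it. */
unsigned long fact_recursive(unsigned long n_arg)
{
        static unsigned long n, sub;

        n = n_arg;
        if (n <= 1)
                return 1;
        sub = fact_recursive(n - 1);    /* clobbers n */
        return n * sub;                 /* n is 1 by now, not n_arg */
}
```

With these definitions fact_loop(5) gives 120, while fact_recursive(5) gives 1, because every activation ends up multiplying by the same overwritten slot.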

By contrast, in Forth you use the operand stack instead of registers, and the return stack for loop counters. Sometimes you can use the operand stack instead of variables as well, although I think it's usually a better idea to use variables, especially when you're starting to learn Forth: it's much easier for beginners to get into trouble by trying too hard to use the stack instead of variables than by trying too hard to use variables instead of the stack.
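If it helps to see the two stacks side by side, here's a toy model in C. Everything in it is an assumption for the sake of the sketch: the names, the 32-cell depth, and the lack of overflow checks. Real Forth systems lay out their loop parameters in various ways, but the division of labor is the same.

```c
enum { CELLS = 32 };                    /* arbitrary depth, no bounds checks */

static long dstack[CELLS]; static int dsp;      /* operand (data) stack */
static long rstack[CELLS]; static int rsp;      /* return stack */

void push(long x)  { dstack[dsp++] = x; }
long pop(void)     { return dstack[--dsp]; }
void rpush(long x) { rstack[rsp++] = x; }       /* like Forth's >R */
long rpop(void)    { return rstack[--rsp]; }    /* like Forth's R> */

void swap(void) { long b = pop(), a = pop(); push(b); push(a); }
void add(void)  { long b = pop(), a = pop(); push(a + b); }

/* Roughly  : sum-below ( n -- sum )  0 swap 0 ?do i + loop ;  */
void sum_below(void)
{
        long index, limit;

        push(0); swap();                /* 0 swap: ( n -- 0 n ) */
        push(0);                        /* initial index */
        index = pop();
        limit = pop();
        if (index == limit)
                return;                 /* ?do skips an empty loop */
        rpush(limit); rpush(index);     /* loop parameters live on the return stack */
        do {
                push(rstack[rsp - 1]);  /* i: copy the index to the operand stack */
                add();                  /* + */
        } while (++rstack[rsp - 1] != rstack[rsp - 2]);        /* loop */
        rsp -= 2;                       /* leaving the loop discards the parameters */
}
```

Pushing 5 and calling sum_below() leaves 10 (that is, 0+1+2+3+4) on the operand stack, with the return stack back where it started.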