I would rather use a loop so I can debug it.

▲ skrishnamurthi 3 days ago | parent | next [-]

This isn't meant to be a good programming mechanism, it's meant to be an illustration of how to use the macro system.

But also, if you're processing non-linear data, you're going to want to do with a recursive function anyway. E.g., when dealing with a tree. Code below; can't seem to get multi-line code-formatting so it looks hideous:

#lang racket

(require "anon-rec.rkt") (require rackunit)

(struct mt ()) (struct node (v l r))

(define sum-tree (lam/anon (t) (cond [(mt? t) 0] [(node? t) (+ (node-v t) ($MyInvocation (node-l t)) ($MyInvocation (node-r t)))])))

(define t (node 5 (node 3 (mt) (mt)) (node 7 (node 9 (mt) (mt)) (mt))))

(check-equal? (sum-tree t) 24)

▲ neilv 3 days ago | parent | next [-]

For formatting code blocks on HN, prefixing each line with 4+ leading spaces works:

    (define sum-tree
      (lam/anon (t)
        (cond ((mt?   t) 0)
              ((node? t) (+ (node-v t)
                            ($MyInvocation (node-l t))
                            ($MyInvocation (node-r t)))))))

	▲	skrishnamurthi 2 days ago \| parent [-]
		Aaah, thanks Neil!

▲ CyberDildonics 3 days ago | parent | prev [-]

Recursion just ends up using the call stack as a stack data structure. I would much rather use an actual stack data structure, that will be easier to debug and have better locality since there isn't an entire call frame overhead to put one value into the stack.

▲

fooker 2 days ago | parent [-]

You’d be right if this was 1950. Since then literally all hardware, and compilers, have this specific use case so optimized that you’ll likely see the opposite if you benchmark it.

▲

CyberDildonics 2 days ago | parent [-]

Prove it. You can put your stack data structure on the stack anyway. A balanced tree isn't going to have more depth than your memory address bit length. Why would copying a single value be slower than pushing an entire call frame to the stack? Locality is what matters and there is no truth to what you're saying.

More important is the debugability. If you have a normal data structure you can see the full stack of values. If you use recursion you have to unwind through multiple call frames and look at each one individually.

Recursion is for people who want to show a neat clever trick, it isn't the best way to program.

▲

fooker a day ago | parent [-]

Here you go : https://godbolt.org/z/haroWGa3c

Recursive DFS took 4 ms Iterative DFS took 8 ms

I'm sure you could optimize the explicit stack based one a bit more to reach parity with a significantly more complex program.

But might as well let ~75 years of hardware, OS, and compiler advancements do that for you when possible.

> Why would copying a single value be slower than pushing an entire call frame to the stack

Because that's not what happens. The stack arithmetic is handled in hardware increasing IPC significantly, and the 'frame' you are talking about it almost the same same size as a single value in the happy path when all the relevant optimizations work out.

> More important is the debugability

Debugging recursive programs is pretty neat with most debuggers. No, you don't unwind through anything manually, just generate a backtrace.

▲

CyberDildonics 15 hours ago | parent [-]

One reason it's slower is because your stack doesn't reserve any memory and grows to 40441 at its max size then shrinks back down again. Stack uses a dequeue by default which stores elements in chunks which likely causes lots of memory allocations (and deallocations) which don't happen in the recursive version. Also at n=80,000 your recursive version blows the stack.

https://en.cppreference.com/w/cpp/container/deque.html

The stack arithmetic is handled in hardware increasing IPC significantly, and the 'frame' you are talking about it almost the same same size as a single value in the happy path when all the relevant optimizations work out.

The program stack isn't magically special, it isn't going to beat writing a single value to memory, especially if that memory is some constant sized array already on the stack.

Debugging recursive programs is pretty neat with most debuggers. No, you don't unwind through anything manually, just generate a backtrace.

No matter what kind of debugger it is you're still going to be looking at a lot of information that contains the values you're looking for instead of just looking at the values directly in an array.

Recursion gets used because it's quick, dirty and clever, not because it's the best way to do it.

▲

fooker 14 hours ago | parent [-]

Go on, 'Prove it'. Write a version that's faster.

I know it's doable, because I have done it.

You don't seem to understand yet how complex it will be. My guess is ~10x the number of lines of code. It'll be significantly less readable, let alone debuggable.

(btw changing from stack to vector and reserving memory outright for the allocations has virtually no change in performance.)

> The program stack isn't magically special

This is what you're missing. Yes, it is magical because the hardware optimizes for that path. That's why it's faster than what you'd think from first principles.

> it isn't going to beat writing a single value to memory

If you examine the kernel trace from this, you'll find that it has the exact same memory usage and bandwidth (and about twice the ipc). Magical, yes.

	▲	CyberDildonics 14 hours ago \| parent [-]
		Go on, 'Prove it'. Write a version that's faster. You're trying to say that pushing a function call frame is faster than writing a single value to memory and incrementing an index, and when asked to prove it you made something that allocates and deallocates chunks of memory to a linked list where every access takes multiple pointer dereferences.

▲ treyd 3 days ago | parent | prev [-]

Tail-recursive functions in Racket are optimized down to essentially for loops.

▲

stirfish 3 days ago | parent [-]

What's it like to debug them?

	▲	rgherdt 2 days ago \| parent [-]
		One strategy is by "tracing" a function call: https://docs.racket-lang.org/trace/