| ▲ | tombert 5 days ago |
| In my mind, the biggest advantage to using tail recursion over vanilla loops is the ability to keep using persistent data structures without any (apparent) mutation. At least in theory, a tail recursive call will be converted into a dumb jump, so there shouldn't be a performance penalty, but since from your code's perspective you're passing in the stuff for the next iteration, you can keep using the pretty and easy-to-reason-about structures without creating any kind of mutable reference. I'm not 100% sold on tail recursion in a broader sense, but at least with Clojure's loop/recur stuff it is kind of cool to be able to keep using persistent data structures across iterations of loops. |
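A minimal sketch of what I mean (Clojure, names mine): the persistent map is threaded through the loop as an argument to recur, so every "update" is just the next iteration's input, with no mutable reference anywhere in the code:

    ;; count word frequencies across loop iterations
    (defn freqs [words]
      (loop [ws words, acc {}]
        (if (seq ws)
          (recur (rest ws) (update acc (first ws) (fnil inc 0)))
          acc)))

    (freqs ["a" "b" "a"]) ;=> {"a" 2, "b" 1}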
|
| ▲ | weitendorf 5 days ago | parent | next [-] |
| What makes tail recursion "special" is that there exists a semantically equivalent mutable/iterative implementation for something expressed logically as immutable/recursive. [0] Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.

    func pow(uint base, uint n):
      n == 0 ? return 1 : return n * pow(base, n-1)

is just

    func pow(uint base, uint n):
      uint res = 1
      for (i = 0; i < n; i++) { res *= n }
      return res

There is no real "advantage" to, or reason to "sell" anybody on, tail call recursion if you are able to easily and clearly represent both implementations, IMO. It is just a compiler/runtime optimization, which might make your code more "elegant" at the cost of obfuscating how it actually runs, plus new footguns from the possibility that code you think should use TCO actually does not (because not all immutable + recursive functions can use TCO, only certain kinds, and your runtime may not even implement TCO in all cases where it theoretically should).

As an aside, in C++ there is something very similar to TCO called copy-elision/return-value-optimization (RVO): [1]. As with TCO, it is IMO not something to "buy into" or sell yourself on; it is just an optimization you can leverage when structuring your code in a way similar to what the article calls "continuation passing style". And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes: if someone wanders in and makes small semantic changes to my code relying on RVO/TCO for performance, they could silently break something important.

[0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a function F making X floating point operations taking at least Y cycles / at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].

[1] https://en.wikipedia.org/wiki/Copy_elision#Return_value_opti...

[2] https://www.hyrumslaw.com/ |
| |
| ▲ | nothrabannosir 5 days ago | parent | next [-] | | > func pow(uint base, uint n): n == 0 ? return 1 : return n * pow(base, n-1)

Nitpick: that's not tail recursive; the multiplication by n still has to happen after the recursive call returns. Something like

    def pow(base, n, acc=1) =
      match n
        case 0 => acc
        default => pow(base, n-1, acc*base)
| | |
| ▲ | pyrale 4 days ago | parent [-] | | Given that `base` is never used, I suspect that's not the only issue :) |
| |
| ▲ | eru 5 days ago | parent | prev | next [-] | | > Of course, this means that the same implementation could also be directly expressed logically in a way that is mutable/iterative.

Yes, compilers exist.

> There is no real "advantage" to, or reason to "sell" anybody on tail call recursion if you are able to easily and clearly represent both implementations, IMO.

Avoiding mutation avoids headaches.

> [0] EXCEPT in practice all implementation differences/optimizations introduce observable side effects that can otherwise impact program correctness or semantics. For example, a program could (perhaps implicitly) rely on the fact that it errors out due to stack overflow when recursing > X times, so enabling TCO could cause the program to enter new/undesirable states; or a program could rely on a function F making X floating point operations taking at least Y cycles in at least Z microseconds, and not function properly when F takes less than Z microseconds after enabling vectorization. This is Hyrum's Law [2].

These programs are likely not standards compliant. (And that's not just true for the C++ standard but for basically any language with a standard.)

> And just like TCO, RVO is neat but IMO slightly dangerous because it relies on implicit compiler/runtime optimizations that can be accidentally disabled or made non-applicable as code changes:

Who says TCO has to be always implicit? In eg Scheme it's explicit in the standard, and in other languages you can have annotations. (Whether a call is in tail position is more or less a property you can ascertain syntactically, so your annotation doesn't even have to be understood by the compiler: the linter is good enough. That would catch your 'accidental changes' part. Things get more complicated when you have implicit clean-ups happen after returning from the function, like calling destructors in C++. Then the function call is arguably not in a tail position anymore, and so TCO doesn't apply. Whether this is detectable reliably at compile time depends on the details of your language.) | |
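Clojure is a concrete example of the explicit style: `recur` only compiles in tail position, so a non-tail "tail call" is a compile-time error rather than a silent deoptimization. A minimal sketch (names mine, error text from memory):

    (defn countdown [n]
      (if (pos? n)
        (recur (dec n)) ; checked tail call, compiles to a jump
        :done))

    ;; (defn bad [n] (+ 1 (recur (dec n))))
    ;; => rejected at compile time, roughly "Can only recur from tail position"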
| ▲ | ahartmetz 4 days ago | parent [-] | | Avoiding mutation avoids headaches, but the real headaches are actions (mutations) at a distance, and tail recursion vs. loops makes no difference there. | |
| ▲ | eru 4 days ago | parent [-] | | No mutation (and no side-effects in general) means no action at a distance. Loops need mutation to work. The mutation might be benign or it might be headache-inducing; without further analysis you don't know. If there's no mutation, there's no need for analysis, which lowers the cognitive load. Replacing loops with tail calls is one tool to get rid of some mutations (see the sketch below).

It's basically the same reasoning people give you for not using goto in your program: yes, there are ways to use gotos responsibly, and there are ways to end up with spaghetti code. If you use goto, the reader has to analyse and figure out whether you made spaghetti (and the reader can never be quite sure she understood everything and didn't miss an important caveat). If you express yourself without goto, the need for that analysis largely goes away. | | |
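A minimal sketch of that trade (Clojure; both compute the same sum, names are mine):

    ;; with a mutable reference, the reader has to check what else touches acc
    (defn sum-mut [xs]
      (let [acc (atom 0)]
        (doseq [x xs] (swap! acc + x))
        @acc))

    ;; tail-recursive: the accumulator is just an argument, nothing to track
    (defn sum-rec [xs]
      (loop [xs xs, acc 0]
        (if (seq xs)
          (recur (rest xs) (+ acc (first xs)))
          acc)))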
| ▲ | ahartmetz 3 days ago | parent [-] | | I have a similar attitude about const in C++: I use it almost whenever possible. Less (possible) mutation to worry about. But I also let go of it fairly easily when it gets in the way. And... well... tail recursion doesn't feel very idiomatic in C++. If you iterate by const reference over a const container, and you make every assign-once variable in the loop body const (or in Rust: just not mut), is there any advantage to tail recursion except someone on the internet said it's the proper functional style? I think functional programming contains some great ideas to keep state under control, but there is no reason to ape certain superficial aspects. E.g. the praise of currying in Haskell tutorials really grinds my gears; I think it's a "clever" but not insightful idea, and it really weirds function signatures. | | |
| ▲ | eru 3 days ago | parent [-] | | > If you iterate by const reference over a const container, and you make every assign-once variable in the loop body const (or in Rust: just not mut), is there any advantage to tail recursion except someone on the internet said it's the proper functional style?

Function calls can express all kinds of useful and interesting control flow. They are so useful that even people who love imperative programming use functions in their language. (Early and primitive imperative programming languages, like very early Fortran and underpowered dialects of BASIC, didn't have user-defined functions.)

So we established that you want functions in your language anyway. Well, once you properly optimise function calls, which is what's known as tail call optimisation, you notice that you don't need special-purpose loops (nor goto) built into your language. You can define these constructs as syntactic sugar over function calls, just like you can define other combinators like map or filter or tree traversals.

See how in the bad old days, Go had a handful of generic functions and data structures built in (like arrays), but didn't allow users to define their own. But once you add the ability for users to define their own, you can remove the special-case handling. And that's also one thing C++ does well: as much as possible, it tries to put the user of the language on the same footing as the designers.

When 'map' or 'filter' are the best constructs to express what you want to say, you should use them. When a 'for'-loop is the best construct, you should use it. (And that for-loop could be defined under the hood as syntactic sugar on top of function calls.) The scenario you concocted is exactly one where a foreach-loop shines.

Though to be a bit contrarian: depending on what your loop does, it might be useful to pick an even more constrained tool. Eg if all you do is run one action for each item, with no early return, and you are not constructing a value, you can use something like Rust's 'foreach' (https://docs.rs/foreach/latest/foreach/). If you transform a container into another container (and no early return etc.), you can use 'map'. Etc. The idea is to show the reader as much as possible what to expect without forcing them to dive deep into the logic. The transformation in a 'map' might be very complicated, but you know the shape of the result immediately from just spying that it's a 'map'. When you see the for-loop version of the above, you have to wade through the (complicated) body of the loop just to convince yourself that there's no early return and that we are producing a new container with exactly the same shape as the input container.

> I think functional programming contains some great ideas to keep state under control, but there is no reason to ape certain superficial aspects. E.g. the praise of currying in Haskell tutorials really grinds my gears; I think it's a "clever" but not insightful idea, and it really weirds function signatures.

Yes, that's mixing up two separate things. Haskell doesn't really need currying. All you need for Haskell to work as a language is a convenient way to do partial application. So if Haskell (like OCaml) used tuples as the standard way to pass multiple arguments, and you had a syntactically convenient way to transform the function (a, b, c) -> d into (b, c) -> d by fixing the first argument, that would get you virtually all of the benefits Haskell gets from pervasive currying without the weird function signatures.
In practice, people tend to get used to the weird function signatures pretty quickly, so there's not much pressure to change the system. | | |
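(For what it's worth, Clojure takes roughly that OCaml-ish route: functions take all their arguments at once, and partial application is an explicit, convenient call. A small sketch, names mine:)

    (defn pow [base n] (reduce * 1 (repeat n base)))

    (def pow2 (partial pow 2)) ; fix the first argument: [base n] -> [n]
    (pow2 10) ;=> 1024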
|
|
|
| |
| ▲ | gleenn 5 days ago | parent | prev | next [-] | | I would argue that having the parameters that change during the loop be explicit is a very nice advantage. Agreed that the two can be equivalent in terms of execution, but for readability and explicitness, having all the changing parameters listed in the same place is great. | |
| ▲ | weitendorf 5 days ago | parent [-] | | Agreed. Some people really like FP a lot, and I think it's underrated that the kinds of functions where TCO is applicable tend to be so simple that they are not really that inelegant when expressed imperatively. My true opinion is that relying on TCO is usually choosing ideological adherence to FP (or "code that looks cooler") over reliability/performance/communication. That said, just as I'd expect experienced C++ programmers to be able to recognize others' code using RVO and be careful not to restructure things to break it, I'd concede that experienced FP programmers might be unlikely to accidentally break others' TCO. It's just that, taken to the extreme, you cannot expect every developer to be able to read every other developer's mind and recognize/work around all the implicit behavior they encounter. | |
| ▲ | eru 5 days ago | parent [-] | | > [...] and I think it's underrated that the kinds of functions where TCO is applicable tend to be so simple that they are not really that inelegant when expressed imperatively.

I suspect you are only thinking of patterns that are basically equivalent to a loop. I might agree with that. TCO really shines when you want to implement state machines. See eg https://news.ycombinator.com/item?id=43076088 for an example where using tail calls in Python's interpreter loop gives nice performance benefits. Similarly for Lua.

> [...] I'd concede that experienced FP programmers might be unlikely to accidentally break others' TCO.

Compiler (or linter) checkable annotations would help here. You are right that you want to make it possible for programmers to statically assert somehow that their function call is a tail call.

Btw this reminds me: recursion isn't just something you do in computation, but also in data structures (amongst other things). In eg Rust the members of your data structure are typically just laid out one after another, but when you have a recursive structure (and in certain other cases) you need to box it, otherwise you'd get an infinitely large data structure. Boxing is more or less equivalent to using indirection via a pointer. However, unboxing isn't really like TCO; it's more like inlining. |
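(The state-machine point is easy to demonstrate even in Clojure, which lacks general TCO but offers an explicit trampoline for exactly this. A sketch, names mine:)

    (declare even-steps odd-steps)

    (defn even-steps [n]
      (if (zero? n) :even #(odd-steps (dec n))))

    (defn odd-steps [n]
      (if (zero? n) :odd #(even-steps (dec n))))

    ;; mutually "tail-calling" states without growing the stack:
    (trampoline even-steps 1000001) ;=> :odd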
|
| |
| ▲ | vlovich123 4 days ago | parent | prev | next [-] | | RVO and URVO are slightly different from TCO in that the language guarantees they are required to happen. You are correct, though, that small changes can unintentionally turn it off. | |
| ▲ | connicpu 4 days ago | parent | prev | next [-] | | With named RVO (i.e. you explicitly `return named_variable;`), copy-elision is actually guaranteed by the standard. I believe returning the return value of a function call is also guaranteed not to invoke the copy constructor. Anything else is compiler- and optimization-level dependent. | |
| ▲ | gpderetta 4 days ago | parent | prev [-] | | To nitpick a bit, NRVO is an optimization, as there is no guarantee that it will be performed, but RVO is now guaranteed (you can now return temporary non-copyable/non-movable objects, for example). |
|
|
| ▲ | ramchip 5 days ago | parent | prev | next [-] |
| I think you're confusing mutation and variable reassignment? |
| |
| ▲ | tombert 5 days ago | parent | next [-] | | I'm saying that if I do a regular loop, something needs to explicitly mutate, either a reference or the value itself. `for (var i = 0; i < n; i++)`, for example, requires that `i` mutate in order to keep track of the value so that the loop can eventually terminate. You could also do something like:

    var i = new Obj();
    while (!i.isDone()) {
      // do some stuff
      i = new Obj();
    }

There, the reference to `i` is being mutated. For tail recursion, it's a little different. If I did something like factorial:

    function factorial(n, acc) {
      if (n <= 1) return acc;
      return factorial(n - 1, acc * n);
    }

Doing this, there's no explicit mutation. It might be happening behind the scenes, but everything here, from the code's perspective, is creating a new value. In something like Clojure, you might do something like:

    (defn vrange [n]
      (loop [i 0 v []]
        (if (< i n)
          (recur (inc i) (conj v i))
          v)))

(stolen from [1]) In this case, we're creating "new" structures on every iteration of the loop, so our code is "more pure", in that there's no mutation. It might be being mutated in the implementation, but not from the author's perspective. I don't think I'm confusing anything.

[1] https://clojure.org/reference/transients | |
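(Incidentally, the transients page in [1] exists because you can keep this exact loop/recur shape and opt into local, invisible-from-the-outside mutation for speed; its second version looks roughly like this, quoted from memory, so check the link:)

    (defn vrange2 [n]
      (loop [i 0 v (transient [])]
        (if (< i n)
          (recur (inc i) (conj! v i))
          (persistent! v))))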
| ▲ | eru 5 days ago | parent | next [-] | | You could imagine a language like eg Python with a for-each loop that creates a new variable for each run through the loop body. Basically, just pretend the loop body is a lambda that you call for each run through the loop. It might make sense to think of the common loops as a special kind of combinator (like 'map' and 'filter', 'reduce' etc.) And just like you shouldn't write everything in terms of 'reduce', even though you perhaps could, you shouldn't write everything in terms of the common loops either. Make up new combinators, when you need them. For comparison, in Haskell we seldom use 'naked' recursion directly: we typically define a combinator and then use it. That often makes sense, even if you only use the combinator once. | |
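(Concretely, Clojure's sequence functions already behave like that "loop body is a lambda" model: each element gets a fresh binding, so closures taken inside the "loop" don't share one mutating variable. A tiny sketch, names mine:)

    ;; each closure captures its own x; there is no shared loop variable
    (def thunks (mapv (fn [x] (fn [] x)) (range 3)))
    (mapv #(%) thunks) ;=> [0 1 2], not the [2 2 2] you get with one mutated variable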
| ▲ | ramchip 4 days ago | parent | prev | next [-] | | > There, the reference to `i` is being mutated. That's rebinding. Mutation is when you change the state of an object. Variables are not objects. You can't have a reference (aka pointer) pointing to a variable. | | |
| ▲ | mrkeen 4 days ago | parent | next [-] | | I don't know if you're referring to a particular language's peculiarities, but it doesn't really matter. It's mutation. tombert's point (I think) is that in a functional setting, factorial(n) is always factorial(n). In an imperative setting, first it's 1, then 2, then 6, then 24, etc. Here's factorial calculated imperatively in C#:

    public async Task Factorial() {
        long primitiveResult = 1L;
        // Wrapped<T> is assumed to be a trivial immutable holder,
        // e.g. record Wrapped<T>(T Value);
        Wrapped<long> objectResult = new Wrapped<long>(1L);

        // Watch both results from another thread while the loop mutates them.
        var observer = new Task(ObserveVariable);
        observer.Start();

        for (var i = 2; i <= 15; i++) {
            primitiveResult *= i;
            objectResult = new Wrapped<long>(objectResult.Value * i);
            await Task.Delay(100);
        }
        observer.Wait();
        return;

        // Spins until both results reach 15! = 1307674368000.
        void ObserveVariable() {
            while (primitiveResult != 1307674368000 || objectResult.Value != 1307674368000) {
                Thread.Sleep(100);
            }
        }
    }

There are two results (one primitive and one object) to show that it doesn't matter. Maybe there's no such thing as a Reference-to-a-Variable in whatever language you have in mind, but in the above code ObserveVariable refers to a variable (while it mutates). |
| ▲ | tombert 4 days ago | parent | prev | next [-] | | I mean, this is getting into splitting-hairs territory, but I would argue that rebinding is a form of mutation; the variable i is changing as far as the programmer is concerned. If I were to write

    var i = new Obj(2)
    // do stuff 1
    i = new Obj(3)
    // do stuff 2

then yes, that’s technically rebinding and not directly mutating the value in memory, but from “do stuff 2’s” perspective the value has changed. It still acts more or less the same as if it had directly changed. |
| ▲ | 1718627440 4 days ago | parent | prev [-] | | > You can't have a reference (aka pointer) pointing to a variable.

What?

    int a;
    int *p = &a;

?
|
| |
| ▲ | gpderetta 4 days ago | parent | prev [-] | |

    for (int i : some_container) {
        do_something_with(i);
    }

Where is the mutation? | |
| |
| ▲ | mrkeen 5 days ago | parent | prev [-] | | Nope. Loops are unnecessary unless you have mutation. If you don't mutate, there's no need to run the same code again: if the state of the world did not change during iteration 1, and you run the same code on it again, the state of the world won't change during iteration 2. |
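A sketch of that argument (Clojure; don't actually run it): if the loop condition reads only immutable state, its value can never change between iterations, so the loop either runs zero times or forever:

    (def done false)          ; an immutable value; nothing in the body can flip it
    (while (not done)         ; clojure.core/while re-evaluates the condition,
      (println "same world")) ; but gets the same answer every time: it spins forever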
|
|
| ▲ | charcircuit 5 days ago | parent | prev [-] |
> without creating any kind of mutable reference.

The parameter essentially becomes a mutable reference. |
| |
| ▲ | eru 5 days ago | parent | next [-] | | No. There's no mutation happening. Of course, a compiler might do whatever shenanigans it has to do. But that's an implementation detail and doesn't change how the language works. (And let's not pretend that CPUs execute something that resembles an imperative language like C under the hood. That might have been true back in the PDP-11 or VAX days. These days, with out-of-order execution, pipelining, etc., what's actually happening doesn't really resemble one-after-another imperative code much.) | |
| ▲ | dgb23 5 days ago | parent [-] | | I didn’t eat the sandwich, I just expressed its transformation into energy. |
| |
| ▲ | mrkeen 5 days ago | parent | prev | next [-] | | If it does, then you would be able to express a race condition with it.

EDIT: I re-read the parent comment.

>> the biggest advantage to using tail recursion over vanilla loops is the ability to keep using persistent data structures without any (apparent) mutation.

Yes.

>> at least with Clojure's loop/recur stuff it is kind of cool to be able to keep using persistent data structures across iterations of loops

There's the implied mutation. |
| ▲ | tombert 5 days ago | parent | prev [-] | | No disagreement on that but that's an implementation detail and from the coder's perspective there's no mutable references. | | |
| ▲ | charcircuit 5 days ago | parent | next [-] | | From the coder's perspective it is. There's just different syntax for assigning to them. | | |
| ▲ | tombert 5 days ago | parent [-] | | No, from the coder's perspective it looks like I'm passing an argument into a function. I suppose you could say that arguments in general are "mutable references", and I guess that's not necessarily "untrue", but that's not generally how I visualize it. If I pass a persistent structure into a tail call, it looks like I'm passing a copy into there, in the same way that I might if I did something like `Math.abs(myValue * 3)`; the compiler converts it to a mutable reference but I see it as the same as passing a value. |
| |
| ▲ | weitendorf 5 days ago | parent | prev [-] | | From the coder's perspective, there are no mutable references only if the coder does not really rely on or care about the possibility that their code uses TCO. If they actively want TCO then they definitely care about the performance benefits they get from underlying mutability/memory reuse/frame elision. | | |
| ▲ | tombert 5 days ago | parent [-] | | Yeah, but that's why I prefer Clojure's loop/recur thing. You're not allowed to have non-tail recursion there, so nothing is implicit, and you really can pretend that there are no mutable references. |
|
|
|