hamstergene 6 hours ago

What I didn’t like about this series of books was choosing “garbage collection” as the umbrella term for both tracing GC and reference counting without checking whether the programming community would agree, and it turned out they didn’t.

I’ve seen a lot of threads here and on reddit where people were arguing about terminology purely because of this book alone.

By that definition, C++ code has garbage collection if it uses std::shared_ptr, going against widespread common usage of the term “garbage collected programming language” which specifically contrasts manual languages like C++ or Rust against garbage collected ones.

“Automatic Memory Management” is a much more suitable description of what programmers have to do to manage memory; it is now in the title but still hasn’t become the primary term.

pron 6 hours ago

> What I didn’t like about this series of books was choosing “garbage collection” as the umbrella term for both tracing GC and reference counting without checking whether the programming community would agree, and it turned out they didn’t.

This has been the standard terminology in memory management research for many decades. The only programmers who don't like it are those who don't understand the principles of memory management.

> By that definition, C++ code has garbage collection if it uses std::shared_ptr

That's right.

> going against widespread common usage of the term “garbage collected programming language” which specifically contrasts manual languages like C++ or Rust against garbage collected ones.

Since this contrast mostly exists in the minds of people who don't understand memory management, going against this common misconception is good. That's not to say that there aren't some interesting tradeoffs that often align with the colloquial perception, but "garbage collection" isn't the interesting part. As you said, both C++ and Rust use GC; in fact, their reference-counted pointers (std::shared_ptr, Rc/Arc) are a GC somewhat similar to the one used by CPython.
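For concreteness, here is a minimal sketch of the reference-counting side of that comparison as it behaves in CPython. It assumes the CPython interpreter (other Python implementations may free objects later), and `Node` and `freed_without_tracing` are hypothetical names made up for the illustration. The cycle collector is switched off so that any reclamation observed comes from counting alone.

```python
import gc
import weakref


class Node:
    """Plain object used only to observe when CPython frees it."""


def freed_without_tracing():
    # Turn the cycle collector off so only reference counting is at work.
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        obj = Node()
        alias = obj                    # second strong reference
        probe = weakref.ref(obj)       # weak reference, does not keep obj alive
        del obj                        # one strong reference remains
        alive_after_first_del = probe() is not None
        del alias                      # count reaches zero, object is freed here
        alive_after_second_del = probe() is not None
        return alive_after_first_del, alive_after_second_del
    finally:
        if was_enabled:
            gc.enable()


print(freed_without_tracing())
```

On CPython this prints `(True, False)`: the object survives while one strong reference is left and is reclaimed the moment the last one is dropped, with no tracing pass involved.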

gdwatson 4 hours ago

This reminds me a bit of the way academics in programming language theory internalized the type-theoretic definition of the word “type” over and against the traditional programming definition. You sometimes see people who try to correct the term “dynamically typed language,” which makes perfect sense when types are data types, to “untyped” or “unityped,” which makes sense when types are mathematical constructs equivalent to proofs.

The colloquial term is clear in context, and it draws its boundaries in useful places. If academia prefers other boundaries to simplify its formal definitions, that’s understandable. But the rest of us shouldn’t restrict our language in that way.

gf000 2 hours ago

I think GC's definition is pretty clear cut. How is counting references to determine when a lifetime ends materially different from another way of doing the same thing? Like there is even a paper that shows that one is tracking liveness, while the other tracks "deadness" and they are literally going at the same thing from different ends.

If anything, I often see a bias against tracing GCs from the people misusing the term, who "hype up" their choice of language as necessarily better for not having a (tracing) GC, when it usually just has ref counting, which by many metrics is actually worse given equal usage. Rust and C++ get away with it because they use ref counting on only a handful of objects, with other lifetimes driven by RAII, which is pretty much just compile-time-decidable ref counting?
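The liveness/deadness framing can be illustrated with a toy heap. This is a sketch under simplifying assumptions, not the formulation from any particular paper: the heap is a plain dict from object name to the names it references, every object is assumed reachable before a root is dropped, and all function and variable names (`trace_live`, `refcount_dead`, `dag`, `cyc`) are hypothetical.

```python
def trace_live(heap, roots):
    """Tracing view: start at the roots and mark everything reachable as live."""
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(heap[obj])
    return live


def garbage_by_tracing(heap, roots, dropped_root):
    """Garbage is whatever the trace from the remaining roots did not reach."""
    remaining = list(roots)
    remaining.remove(dropped_root)
    return set(heap) - trace_live(heap, remaining)


def count_refs(heap, roots):
    """Count incoming references to each object, including those from roots."""
    counts = {obj: 0 for obj in heap}
    for obj in roots:
        counts[obj] += 1
    for children in heap.values():
        for child in children:
            counts[child] += 1
    return counts


def refcount_dead(heap, roots, dropped_root):
    """Counting view: start at the dropped reference and propagate deaths."""
    counts = count_refs(heap, roots)
    dead, stack = set(), [dropped_root]
    while stack:
        obj = stack.pop()
        counts[obj] -= 1
        if counts[obj] == 0:
            dead.add(obj)
            stack.extend(heap[obj])   # a dead object releases its references
    return dead


# Acyclic heap: a -> b, c; b -> d; c -> d; e -> d, with roots a and e.
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["d"]}
# Cyclic heap: x <-> y, with root x.
cyc = {"x": ["y"], "y": ["x"]}

print(garbage_by_tracing(dag, ["a", "e"], "a") == refcount_dead(dag, ["a", "e"], "a"))
print(garbage_by_tracing(cyc, ["x"], "x"), refcount_dead(cyc, ["x"], "x"))
```

On the acyclic heap both approaches identify the same garbage (`a`, `b`, `c`), one by computing what is still live and the other by computing what just died. On the cyclic heap the trace finds both objects to be garbage while plain counting finds none, which is the gap a backup cycle collector fills.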

gdwatson 3 minutes ago

I think a lot of people just want to be able to discuss different areas of the automatic memory management design space separately, and maintaining the distinction between reference counting and garbage collection (meaning tracing GCs) lets them do that.

As for me personally, I consider refcounting and GC overlapping categories. I am perfectly willing to call CPython’s reference counting plus cycle collector a form of garbage collection, because it is transparent to the programmer. Every memory management technique has tradeoffs and pathological edge cases, but since you don’t have to consider them in the ordinary course of programming I’d say it counts. If you had to break cycles manually, or to annotate which references should be counted, I’d call that refcounting but not GC – as in the C++ stdlib.
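A rough sketch of that distinction, assuming CPython and using hypothetical names (`Node`, `cycle_needs_collector`, `cycle_broken_by_hand`): the first function relies on the cycle collector to clean up a reference cycle, while the second breaks the cycle by hand with a weak back-reference, which is roughly the role `std::weak_ptr` plays in C++.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def cycle_needs_collector():
    was_enabled = gc.isenabled()
    gc.disable()                          # keep automatic collections out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a           # strong reference cycle
        probe = weakref.ref(a)
        del a, b                          # counts never reach zero
        leaked = probe() is not None
        gc.collect()                      # explicit pass of the cycle collector
        reclaimed = probe() is None
        return leaked, reclaimed
    finally:
        if was_enabled:
            gc.enable()


def cycle_broken_by_hand():
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        parent, child = Node(), Node()
        parent.other = child                  # strong: parent owns child
        child.other = weakref.ref(parent)     # weak back-reference, chosen by the programmer
        probe = weakref.ref(parent)
        del parent                            # last strong reference to parent
        parent_freed = probe() is None
        back_ref_cleared = child.other() is None
        return parent_freed, back_ref_cleared
    finally:
        if was_enabled:
            gc.enable()


print(cycle_needs_collector())
print(cycle_broken_by_hand())
```

Both calls print `(True, True)` on CPython. In the first case the programmer does nothing and the collector eventually handles the cycle; in the second the programmer has to decide which edge is weak, which is the kind of annotation described above.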

hayley-patton 4 minutes ago

> Like there is even a paper that shows that one is tracking liveness, while the other tracks "deadness" and they are literally going at the same thing from different ends.

https://dl.acm.org/doi/10.1145/1035292.1028982

trumpdong 6 hours ago

The Linux kernel has garbage collection, and not just the controversial refcount kind.