rudedogg 2 days ago

This is wishful thinking. It’s the same as other layers we have like auto-vectorization where you don’t know if it’s working without performance analysis. The complexity compounds and reasoning about performance gets harder because the interactions get more complex with abstractions like these.

Also, the more I work with this stuff the more I think trying to avoid memory management is foolish. You end up having to think about it, even at the highest of levels like a React app. It takes some experience, but I’d rather just manage the memory myself and confront the issue from the start. It’s slower at first, but leads to better designs. And it’s simpler, you just have to do more work upfront.

Edit:

> Rust's strategy is problematic for code reuse just as C/C++'s strategy is problematic. Without garbage collection a library has to know how it fits into the memory allocation strategies of the application as a whole. In general a library doesn't know if the application still needs a buffer and the application doesn't know if the library needs it, but... the garbage collector does.

Should have noted that Zig solves this by making it a convention to pass an allocator into any function that allocates, so the boundaries/responsibilities become very clear.
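To make the convention concrete, here's a minimal C sketch of the same idea: the `Allocator` struct and function names are made up for illustration (Zig's real `std.mem.Allocator` interface is richer), but the point is that any function that allocates takes the allocator explicitly, so the caller always knows who owns the memory.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface, mirroring Zig's convention of
   passing std.mem.Allocator into anything that allocates. */
typedef struct Allocator {
    void *(*alloc)(void *ctx, size_t size);
    void  (*free)(void *ctx, void *ptr);
    void  *ctx;
} Allocator;

static void *heap_alloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void  heap_free(void *ctx, void *ptr)    { (void)ctx; free(ptr); }

/* A plain malloc/free-backed allocator; an arena-backed one would
   plug into the same interface. */
static const Allocator heap_allocator = { heap_alloc, heap_free, NULL };

/* A "library" function that allocates must say so in its signature:
   the caller decides where the memory comes from and who frees it. */
char *dup_string(const Allocator *a, const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = a->alloc(a->ctx, n);
    if (copy) memcpy(copy, s, n);
    return copy;
}
```

Swapping the heap allocator for an arena (or a test allocator that counts leaks) requires no change to `dup_string`, which is where the clarity of responsibilities comes from.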

bob1029 2 days ago | parent | next [-]

Use of a GC does not imply we are trying to avoid memory management or no longer have a say in how memory is utilized. Getting sweaty chasing around esoteric memory management strategies leads to poor designs, not good ones.

rudedogg 2 days ago | parent [-]

> Getting sweaty chasing around esoteric memory management strategies

I’m advocating learning about and understanding a couple of different allocation strategies, and simplifying everything by doing away with the GC and minimizing the abstractions you need.

My guess is this stuff used to be harder, but it’s now much easier with the languages and knowledge we have available. Even for application development.

See https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...
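The core of the arena idea in that article is tiny. A hedged C sketch of a bump allocator (fixed capacity, no growth, 8-byte alignment; the names are made up for illustration): you allocate with a pointer bump and free everything at once when the lifetime ends, so individual `free` calls disappear.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal bump-allocator arena: allocations share one lifetime
   and are released together with a single call. */
typedef struct {
    uint8_t *base;
    size_t   used;
    size_t   cap;
} Arena;

int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = cap;
    return a->base != NULL;
}

void *arena_alloc(Arena *a, size_t size) {
    size_t aligned = (size + 7) & ~(size_t)7;  /* round up to 8 bytes */
    if (a->used + aligned > a->cap) return NULL;
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* No per-object free: the whole lifetime ends here. */
void arena_release(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

A real implementation would grow by chaining blocks and handle stricter alignment requirements, but the mental model stays this small.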

pron 2 days ago | parent | next [-]

Arenas are fantastic when they work; when they don't, you're in a place that's neither simple nor particularly efficient.

Generational tracing garbage collectors automatically work in a manner similar to arenas (sometimes worse; sometimes better) in the young-gen, but they also automatically promote the non-arena-friendly objects to the old-gen. Modern GCs - which are constantly evolving at a pretty fast pace - use algorithms that represent a lot of expertise gathered in the memory management space that's hard to beat unless arenas fully solve your needs.

PaulHoule 2 days ago | parent | prev [-]

For a lot of everyday programming arenas are "all you need".

pron 2 days ago | parent | prev | next [-]

> It’s the same as other layers we have like auto-vectorization where you don’t know if it’s working without performance analysis. The complexity compounds and reasoning about performance gets harder because the interactions get more complex with abstractions like these.

Reasoning about performance is hard as it is, given nondeterministic optimisations by the CPU. Furthermore, a program that's optimal for one implementation of the AArch64 architecture can be far from optimal for a different implementation of the same architecture. Because of that, reasoning deeply about micro-optimisations can be counterproductive, as your analysis today could be outdated tomorrow (or on a different vendor's chip). Full low-level control is helpful when you have full knowledge of the exact environment, including hardware details, and may be harmful otherwise.

What is meant by "performance" is also subjective. Improving average performance and improving worst-case performance are not the same thing. Also, improving the performance of the most efficient program possible and improving the performance of the program you are likely to write given your budget aren't the same thing.

For example, it may be the case that using a low-level language would yield a faster program given virtually unlimited resources, yet a higher-level language with less deterministic optimisation would yield a faster program if you have a more limited budget. Put another way, it may be cheaper to get to 100% of the maximal possible performance in language A, but cheaper to get to 97% with language B. If you don't need more than 97%, language B is the "faster language" from your perspective, as the programs you can actually afford to write will be faster.

> Also, the more I work with this stuff the more I think trying to avoid memory management is foolish.

It's not about avoiding thinking about memory management but about finding good memory management algorithms for your target definition of "good". Tracing garbage collectors offer a set of very attractive algorithms that aren't always easy to match (when it comes to throughput, at least, and in some situations even latency) and offer a knob that allows you to trade footprint for speed. More manual memory management, as well as refcounting collectors, often tends to miss the sweet spot, as they have a tendency to optimise for footprint over throughput. See this great talk about the RAM/CPU tradeoff - https://youtu.be/mLNFVNXbw7I from this year's ISMM (International Symposium on Memory Management); it focuses on tracing collectors, but the point applies to all memory management solutions.
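In practice, the simplest form of that footprint-for-speed knob on a tracing collector is the heap ceiling. For instance, on HotSpot (the flags below are real; the class name is a placeholder), a bigger heap means fewer GC cycles and higher throughput at the cost of more RAM:

```shell
# Same program, two points on the RAM/CPU curve:
java -Xmx512m MyApp   # small footprint, GC runs more often
java -Xmx8g   MyApp   # large footprint, GC runs rarely
```

Manual schemes rarely expose an equivalent single dial; the tradeoff is baked into the design instead.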

> Should have noted that Zig solves this by making the convention be to pass an allocator in to any function that allocates. So the boundaries/responsibilities become very clear.

Yes, and arenas may give such usage patterns a similar CPU/RAM knob to tracing collectors, but this level of control isn't free. In the end you have to ask yourself if what you're gaining is worth the added effort.

rudedogg 2 days ago | parent [-]

I enjoy reading your comments here. Thanks for sharing your knowledge, I'll watch the talk.

> Yes, and arenas may give such usage patterns a similar CPU/RAM knob to tracing collectors, but this level of control isn't free. In the end you have to ask yourself if what you're gaining is worth the added effort.

For me using them has been very easy/convenient. My earlier attempts with Zig used alloc/defer free everywhere and it required a lot of thought to not make mistakes. But on my latest project I'm using arenas and it's much more straightforward.

pron a day ago | parent [-]

Sure, using arenas is very often straightforward, but it also very often isn't. For example, say you have a server. It's very natural to have an arena for the duration of some request. But then things could get complicated. Say that in the course of handling the transaction, you need to make multiple outgoing calls to services. They have to be concurrent to keep latency reasonable. Now arenas start posing some challenges. You could use async/coroutine IO to keep everything on the same thread, but that imposes some limitations on what you can do. If you use multiple threads, then either you need to synchronise the arena (which is no longer as efficient) or use "cactus stacks" of arenas and figure out a way to communicate values from the "child" tasks to the parent one, which isn't always simple (and may not even be super efficient).

In lots of common cases, arenas work great; in lots of common cases they don't.

There are also other advantages unrelated to memory management. In this talk by Andrew Kelley (https://youtu.be/f30PceqQWko) he shows how Zig, despite its truly spectacular partial evaluation, still runs into an abstraction/performance tradeoff (when he talks about what should go "above" or "below" the vtable). When you have a really good JIT, as Java does, this tradeoff is gone (instead, you trade off warmup time) as the "runtime knowns" are known at compile time (since compilation is done at runtime).

chpill 11 hours ago | parent [-]

  When you have a really good JIT, as Java does, this tradeoff is gone
Is there a way to visualize the machine code generated by the JVM when optimizing the same kind of code as the examples shown in the talk you mention? I tried putting the following into godbolt.org, but I'm not sure I'm doing it right:

  public class DontForgetToFlush {
      public static void example(java.io.BufferedWriter w) throws java.io.IOException {
          w.write("a");
          w.write("b");
          w.write("c");
          w.write("d");
          w.write("e");
          w.write("f");
          w.write("g");
          w.flush();
      }
      public static void main(String... args) throws java.io.IOException {
          var os = new java.io.OutputStreamWriter(System.out);
          var writer = new java.io.BufferedWriter(os, 100);
          example(writer);
      }
  }
rkagerer 2 days ago | parent | prev [-]

Convention (as you report Zig does) seems to be a sensible way to deal with the problem.

> Also, the more I work with this stuff the more I think trying to avoid memory management is foolish ... It takes some experience, but I’d rather just manage the memory myself and confront the issue from the start.

Not sure why you're getting downvoted, this is a reasonable take on the matter.