| ▲ | Galanwe 3 days ago |
| > Seasoned Rust coders don’t spend time fighting the borrow checker My experience is that what makes your statement true, is that _seasoned_ Rust developers just sprinkle `Arc` all over the place, thus effectively switching to automatic garbage collection. Because 1) statically checked memory management is too restrictive for most kinds of non-trivial data structures, and 2) the lifetime hoops you have to jump through to please the static checker whenever you start doing anything non-trivial are just above human comprehension level. |
|
| ▲ | hu3 3 days ago | parent | next [-] |
| I did a quick search; not sure if this supports or refutes your point: - 151 instances of "Arc<" in Servo: https://github.com/search?q=repo%3Aservo%2Fservo+Arc%3C&type... - 5 instances of "Arc<" in Rusoto (AWS SDK for Rust): https://github.com/search?q=repo%3Arusoto%2Frusoto%20Arc%3C&... - 0 instances of "Arc<" in LOC: https://github.com/search?q=repo%3Acgag%2Floc%20Arc%3C&type=... |
| |
|
| ▲ | andrewl-hn 3 days ago | parent | prev | next [-] |
| `Arc`s show up all over the place specifically in async code that targets Tokio runtime running in multithreaded mode. Mostly this is because `tokio::spawn` requires `Future`s to be `Send + 'static`, and this function is a building block of most libraries and frameworks built on top of Tokio. If you use Rust for web server backend code then yes, you see `Arc`s everywhere. Otherwise their use is pretty rare, even in large projects. Rust is somewhat unique in that regard, because most Rust code that is written is not really web backend code. |
| |
| ▲ | khuey 3 days ago | parent [-] | | > `Arc`s show up all over the place specifically in async code that targets Tokio runtime running in multithreaded mode. Mostly this is because `tokio::spawn` requires `Future`s to be `Send + 'static`, and this function is a building block of most libraries and frameworks built on top of Tokio. To some extent this is unavoidable. Non-'static lifetimes correspond (roughly) to a location on the program stack. Since a Future that suspends can't reasonably stay on the stack it can't have a lifetime other than 'static. Once it has to be 'static, it can't borrow anything (that's not itself 'static), so you either have to Copy your data or Rc/Arc it. This, btw, is why even tokio's spawn_local has a 'static bound on the Future. It would be nice if it were ergonomic for library authors to push the decision about whether to use Rc<RefCell<T>> or Arc<Mutex<T>> (which are non-threadsafe and threadsafe variants of the same underlying concept) to the library consumer. | | |
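A minimal sketch of the two "variants of the same underlying concept" mentioned above, using only the standard library. It assumes `std::thread::spawn` as a stand-in for `tokio::spawn`, since both require a `'static` closure or future; the function names `bump_local` and `bump_threaded` are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Single-threaded shared mutation: `Rc<RefCell<T>>`.
/// Neither `Rc` nor `RefCell` may cross a thread boundary.
fn bump_local(times: u32) -> u32 {
    let counter = Rc::new(RefCell::new(0u32));
    let handle = Rc::clone(&counter); // second owner on the same thread
    for _ in 0..times {
        *handle.borrow_mut() += 1;
    }
    let value = *counter.borrow();
    value
}

/// Multi-threaded shared mutation: `Arc<Mutex<T>>`.
/// `thread::spawn` needs a `'static` closure, so each thread gets its own
/// `Arc` clone instead of borrowing `counter` from this stack frame.
fn bump_threaded(threads: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let value = *counter.lock().unwrap();
    value
}

fn main() {
    println!("local: {}", bump_local(5));
    println!("threaded: {}", bump_threaded(4));
}
```

The two functions have the same shape; only the wrapper types differ, which is why a library cannot easily leave the choice to its consumer without becoming generic over both.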
|
|
| ▲ | qw3rty01 3 days ago | parent | prev | next [-] |
| This is exactly the opposite of what he’s saying. Using Arc everywhere is hacking around the borrow checker; a seasoned Rust developer will structure their code in a way that works with the borrow checker. Arc has a very specific use case, and a seasoned Rust developer will rarely use it. |
| |
| ▲ | Aurornis 3 days ago | parent | next [-] | | These extreme generalizations are not accurate, in my experience. There are some cases where someone new to Rust will try to use Arc as a solution to every problem, but I haven't seen much code like this outside of reviewing very junior Rust developers' code. In some application architectures Arc is a common feature and it's fine. Saying that seasoned Rust developers rarely use Arc isn't true, because some types of code require shared references with Arc. There is nothing wrong with Arc when used properly. I think this is less confusing to people who came from modern C++ and understand how modern C++ features like shared_ptr work and when to use them. For people coming from garbage collected languages it's more tempting to reach for the Arc types to try to write code as if it was garbage collected. | |
| ▲ | packetlost 3 days ago | parent | prev | next [-] | | Arc<T> is all over the place if you're writing async code unfortunately. IMO Tokio using a work-stealing threaded scheduler by default and peppering literally everything with Send + Sync constraints was a huge misstep. | | |
| ▲ | ekidd 3 days ago | parent | next [-] | | I mostly wind up using Arc a lot while using async streams. This tends to occur when emulating a Unix-pipeline-like architecture that also supports concurrency. Basically, "pipelines where we can process up to N items in parallel." But in this case, the data hiding behind the Arc is almost never mutable. It's typically some shared, read-only information that needs to live until all the concurrent workers are done using it. So this is very easy to reason about: Stick a single chunk of read-only data behind the reference count, and let it get reclaimed when the final worker disappears. | |
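The "read-only data behind the reference count" pattern described above could look roughly like this. It is a sketch with plain threads rather than async streams, and `label_all`, the lookup table, and the chunking strategy are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Resolves every id to a label using up to `workers` threads.
/// The lookup table is never mutated, so a bare `Arc` (no `Mutex`) suffices.
fn label_all(ids: Vec<u32>, names: HashMap<u32, String>, workers: usize) -> Vec<String> {
    let names = Arc::new(names);
    let workers = workers.max(1);
    // Ceiling division so that at most `workers` chunks are produced.
    let chunk_len = ((ids.len() + workers - 1) / workers).max(1);

    let handles: Vec<_> = ids
        .chunks(chunk_len)
        .map(|chunk| {
            let chunk = chunk.to_vec(); // each worker owns its slice of the input
            let names = Arc::clone(&names); // and one handle to the shared table
            thread::spawn(move || {
                chunk
                    .iter()
                    .map(|id| match names.get(id) {
                        Some(name) => name.clone(),
                        None => format!("unknown-{id}"),
                    })
                    .collect::<Vec<String>>()
            })
        })
        .collect();

    // Joining in spawn order keeps the output in input order.
    handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect()
}

fn main() {
    let names: HashMap<u32, String> = vec![(1, "alpha".to_string()), (2, "beta".to_string())]
        .into_iter()
        .collect();
    println!("{:?}", label_all(vec![1, 2, 3], names, 2));
}
```

The table is freed when the last worker's `Arc` clone and the original handle are gone, which matches the "reclaimed when the final worker disappears" description.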
| ▲ | vlovich123 3 days ago | parent | prev [-] | | Arc + work stealing scheduler is common. But work stealing schedulers are common (eg libdispatch popularized it). I believe the only alternative is thread-per core but they’re not very common/popular. For what it’s worth zig would look very similar except their novel injectable I/O syntax isn’t compatible with work stealing. Even then, I’d agree that while Arc is used in lots of places in work stealing runtimes, I disagree that it’s used everywhere or that you can really do anything else if you want to leverage all your cores with minimum effort and not having to build your application specialized to deal with that. | | |
| ▲ | packetlost 3 days ago | parent [-] | | Being possible with minimal effort doesn't mean it has to be the default. The issue I have is huge portions of Tokio's (and other async libs) API have a Send + Sync constraint that destroys the benefit of LocalSet / spawn_local. You can't build an application with the specialized thread-per-core or single-threaded runtime thing if you wanted to because of pervasive incidental complexity. I don't care that they have a good work-stealing event loop, I care that it's the default and their APIs all expect the work-stealing implementation and unnecessarily constrain cases where you don't use that implementation. It's frustrating and I go out of my way to avoid Tokio because of it. Edit: the issues are in Axum, not the core Tokio API. Other libs have this problem too due to aforementioned defaults. | | |
| ▲ | Arnavion 3 days ago | parent | next [-] | | >You can't build an application with the specialized thread-per-core or single-threaded runtime thing if you wanted to because of pervasive incidental complexity. [...] It's frustrating and I go out of my way to avoid Tokio because of it. At $dayjob we have built a large codebase (high-throughput message broker) using the thread-per-core model with tokio (ie one worker thread per CPU, pinned to that CPU, driving a single-threaded tokio Runtime) and have not had any problems. Much of our async code is !Send or !Sync (Rc, RefCell, etc) precisely because we want it to benefit from not needing to run under the default tokio multi-threaded runtime. We don't use many external libs for async though, which is what seems to be the source of your problems. Mostly just tokio and futures-* crates. | | |
| ▲ | packetlost 3 days ago | parent [-] | | I might be misremembering and the overbearing constraints might be in Axum (which is still a Tokio project). External libs are a huge problem in this area in general, yeah. |
| |
| ▲ | vlovich123 3 days ago | parent | prev [-] | | Single-threaded runtime doesn't require Send+Sync for spawned futures. AFAIK Tokio doesn't have a thread-per-core backend and as a sibling intimated you could build it yourself (or use something more suited for thread-per-core like Monoio or Glommio). |
|
|
| |
| ▲ | jasonjmcghee 3 days ago | parent | prev | next [-] | | This is awkward. I've written a fair amount of rust. I reach for Arc frequently. I see the memory layout implications now. Do you tend to use a lot of Arenas? | | |
| ▲ | dminik 3 days ago | parent [-] | | I've not explored every program domain, but in general I see two kinds of program memory access patterns. The first is a fairly generic input -> transform -> output. This is your generic request handler for instance. You receive a payload, run some transform on that (and maybe a DB request) and then produce a response. In this model, Arc is very fitting for some shared (im)mutable state. Like DB connections, configuration and so on. The second pattern is something like: state + input -> transform -> new state. E.g. you're mutating your app state based on some input. This fits stuff like games, but also retained UIs, programming language interpreters and so on. Using Arcs here muddles the ownership. The gamedev ecosystem has found a way to manage this by employing ECS, and while it can be overkill, the base DOD principles can still be very helpful. Treat your data as what it is; data. Use indices/keys instead of pointers to represent relations. Keep it simple. Arenas can definitely be a part of that solution. |
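As an illustration of "indices/keys instead of pointers", here is a small sketch of a tree stored in one `Vec`, with relations expressed as indices. The `Tree`, `Node`, and `NodeId` types are hypothetical and not taken from any particular ECS or arena crate.

```rust
/// Index into `Tree::nodes`; plays the role a pointer would play elsewhere.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(usize);

struct Node {
    value: i32,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

/// All nodes live in one `Vec`, so the tree has a single owner and
/// parent/child links need neither `Rc` nor lifetimes.
#[derive(Default)]
struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    /// Appends a node and registers it with its parent, if any.
    fn add(&mut self, value: i32, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node { value, parent, children: Vec::new() });
        if let Some(p) = parent {
            self.nodes[p.0].children.push(id);
        }
        id
    }

    /// Sum of the values in the subtree rooted at `id`.
    fn subtree_sum(&self, id: NodeId) -> i32 {
        let node = &self.nodes[id.0];
        node.value + node.children.iter().map(|&c| self.subtree_sum(c)).sum::<i32>()
    }

    /// Number of parent links between `id` and the root.
    fn depth(&self, mut id: NodeId) -> usize {
        let mut depth = 0;
        while let Some(parent) = self.nodes[id.0].parent {
            id = parent;
            depth += 1;
        }
        depth
    }
}

fn main() {
    let mut tree = Tree::default();
    let root = tree.add(1, None);
    let child = tree.add(2, Some(root));
    let leaf = tree.add(3, Some(child));
    println!("sum = {}, depth = {}", tree.subtree_sum(root), tree.depth(leaf));
}
```

One trade-off worth noting: an index can outlive the element it refers to if nodes are ever removed, so real arenas often add generation counters. This sketch never removes nodes.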
| |
| ▲ | bilekas 3 days ago | parent | prev [-] | | This is something I have noticed. While I'm by no means seasoned enough to consider myself even mid-level, some of my colleagues are, and what they tend to do is plan ahead much better, or pedantically, as they put it. The worst thing you will end up doing is trying to change an architectural decision later on. |
|
|
| ▲ | Aurornis 3 days ago | parent | prev | next [-] |
| > thus effectively switching to automatic garbage collection Arc isn't really garbage collection. It's like a reference counted smart pointer like C++ has shared_ptr. If you drop an Arc and it's the last reference to the underlying object, it gets dropped deterministically. Garbage collection generally refers to more complex systems that periodically identify and free unused objects in a less deterministic manner. |
| |
| ▲ | ninkendo 3 days ago | parent | next [-] | | Also importantly, an Arc<T> can be passed to anything expecting a &T, so you’re not necessarily bumping refcounts all over the place when using an Arc. If you only store it in one place, it’s basically equivalent to any other boxed pointer. | |
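Both points above (borrowing through an `Arc` without touching the count, and deterministic drop of the last reference) can be checked with a short sketch. `Config`, `Tracked`, `name_len`, and `arc_demo` are hypothetical names used only for this example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

struct Config {
    name: String,
}

/// Takes a plain reference; it neither knows nor cares about `Arc`.
fn name_len(cfg: &Config) -> usize {
    cfg.name.len()
}

/// Sets a flag when dropped, so the drop point can be observed.
struct Tracked<'a>(&'a AtomicBool);

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Returns (name length, strong count after borrowing,
/// dropped after first handle goes away, dropped after last handle goes away).
fn arc_demo() -> (usize, usize, bool, bool) {
    let cfg = Arc::new(Config { name: "prod".to_string() });
    let len = name_len(&cfg); // `&Arc<Config>` coerces to `&Config`
    let count = Arc::strong_count(&cfg); // borrowing did not bump the refcount

    let dropped = AtomicBool::new(false);
    let first = Arc::new(Tracked(&dropped));
    let second = Arc::clone(&first);
    drop(first);
    let after_first = dropped.load(Ordering::SeqCst); // `second` still keeps it alive
    drop(second);
    let after_last = dropped.load(Ordering::SeqCst); // destructor ran right here

    (len, count, after_first, after_last)
}

fn main() {
    println!("{:?}", arc_demo());
}
```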
| ▲ | Arnavion 3 days ago | parent | prev | next [-] | | >Garbage collection generally refers to more complex systems that periodically identify and free unused objects in a less deterministic manner. No, this is a subset of garbage collection called tracing garbage collection. "Garbage collection" absolutely includes refcounting. | | |
| ▲ | simonask 3 days ago | parent [-] | | There’s just no good reason to conflate the two. Rust’s Arc and C++’s std::shared_ptr do not reclaim reference cycles, so you can call it “garbage collection” if you want, but the colloquial understanding is way more useful. | | |
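The claim that reference counting does not reclaim cycles can be demonstrated with a small sketch. It uses `Rc` for brevity; `Arc` with `std::sync::Weak` behaves the same way. The `Node` type and both function names are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    strong_next: RefCell<Option<Rc<Node>>>,
    weak_next: RefCell<Weak<Node>>,
}

fn new_node() -> Rc<Node> {
    Rc::new(Node {
        strong_next: RefCell::new(None),
        weak_next: RefCell::new(Weak::new()),
    })
}

/// Builds a <-> b with strong links both ways, drops every outside handle,
/// and reports whether `a` is still allocated.
fn survives_with_strong_cycle() -> bool {
    let a = new_node();
    let b = new_node();
    *a.strong_next.borrow_mut() = Some(Rc::clone(&b));
    *b.strong_next.borrow_mut() = Some(Rc::clone(&a));
    let probe = Rc::downgrade(&a); // observes `a` without keeping it alive
    drop(a);
    drop(b);
    let alive = probe.upgrade().is_some();
    alive
}

/// Same shape, but the back edge is `Weak`, so no cycle of strong counts forms.
fn survives_with_weak_back_edge() -> bool {
    let a = new_node();
    let b = new_node();
    *a.strong_next.borrow_mut() = Some(Rc::clone(&b));
    *b.weak_next.borrow_mut() = Rc::downgrade(&a);
    let probe = Rc::downgrade(&a);
    drop(a);
    drop(b);
    let alive = probe.upgrade().is_some();
    alive
}

fn main() {
    println!("strong cycle leaks: {}", survives_with_strong_cycle());
    println!("weak back edge leaks: {}", survives_with_weak_back_edge());
}
```

In the first function both nodes stay allocated for the rest of the program, because each holds the other's last strong reference. A tracing collector would reclaim them; `Rc` and `Arc` will not.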
| |
| ▲ | hansvm 3 days ago | parent | prev | next [-] | | That's fair. It's not really a good pattern though. You get all the runtime overhead of object-soup allocation patterns, syntactic noise making it harder to read than even a primitive GC language (including one using ARC by default and implementing deterministic dropping, a pattern most languages grow out of), and the ability to easily leak [0] memory because it's not a fully garbage-collected solution. As a rough approximation, if you're very heavy-handed with ARC then you probably shouldn't be using rust for that project. [0] The term "leak" can be a bit hard to pin down, but here I mean something like space which is allocated and which an ordinary developer would prefer to not have allocated. | | |
| ▲ | Aurornis 3 days ago | parent [-] | | I agree that using an Arc where it's unnecessary is not good form. However, I disagree with generalizations that you can judge the quality of code based on whether or not it uses a lot of Arc. You need to understand the architecture and what's being accomplished. | | |
| ▲ | hansvm 3 days ago | parent [-] | | > disagree with generalizations that you can judge the quality of code based on whether or not it uses a lot of Arc That wasn't really my point, but I disagree with your disagreement anyway ;) Yes, you don't want to over-generalize, but Arc has a lot of downsides, doesn't have a lot of upsides, and can usually be relatively easily avoided in favor of something with a better set of tradeoffs. Heavy use isn't bad in its own right, but it's a strong signal suggestive of code needing some love and attention. My point though was: If you are going to heavily use Arc, Rust isn't the most ergonomic language for the task, and while the value proposition of Rust is more apparent for other memory management techniques, the gap to more ergonomic languages is much narrower if you use Arc a lot. Maybe you have to (or want to) use Rust anyway for some reason, but it's usually a bad choice conditioned on that coding style. |
|
| |
| ▲ | bluGill 3 days ago | parent | prev | next [-] | | Reference counting has always been a way to garbage collect. Those who like garbage collection have always looked down on it because it cannot handle circular references and is typically slower than the mark and sweep garbage collectors they prefer. If you need a reference counted garbage collector for more than a tiny minority of your code, then Rust was probably the wrong choice of language - use something that has a better (mark and sweep) garbage collector. Rust is good for places where you can almost always find a single owner, and you can use reference counting for the rare exception. | | |
| ▲ | Aurornis 3 days ago | parent [-] | | Reference counting can be used as an input to the garbage collector. However, the difference between Arc and a Garbage Collector is that the Arc does the cleanup at a deterministic point (when the last Arc is dropped) whereas a Garbage Collector is a separate thing that comes along and collects garbage later. > If you need a reference counted garbage collector for more than a tiny minority of your code The purpose of Arc isn't to have a garbage collector. It's to provide shared ownership. There is no reason to avoid Rust if you have an architecture that requires shared ownership of something. These reductionist generalizations are not accurate. I think a lot of new Rust developers are taught that Arc shouldn't be abused, but they internalize it as "Arc is bad and must be avoided", which isn't true. | | |
| ▲ | bluGill 2 days ago | parent [-] | | > whereas a Garbage Collector is a separate thing that comes along and collects garbage later. That is the most common implementation, but that is still just an implementation detail. Garbage collectors can run deterministically, which is what reference counting does. > There is no reason to avoid Rust if you have an architecture that requires shared ownership of something. Rust can be used for anything. However, its goals are still aimed at systems programming. Systems programming implies some compromises, which make Rust not as good a choice for other types of programming. Nothing wrong with using it anyway (and often you have a mix and the overhead of multiple languages makes it worth using one even when another would be better for a small part of the problem) > I think a lot of new Rust developers are taught that Arc shouldn't be abused, but they internalize it as "Arc is bad and must be avoided", which isn't true. Arc has a place. However, in most places where you use it, a little design work could eliminate the need. If you don't understand what I'm talking about then "Arc is bad and must be avoided" is better than putting Arc everywhere even though that would work and is less effort in the short run (and for non-systems programming it might even be a good design) |
|
| |
| ▲ | nayuki 3 days ago | parent | prev | next [-] | | > Arc isn't really garbage collection. It's like a reference counted smart pointer Reference counting is a valid form of garbage collection. It is arguably the simplest form. https://en.wikipedia.org/wiki/Garbage_collection_(computer_s... The other forms of GC are tracing followed by either sweeping or copying. > If you drop an Arc and it's the last reference to the underlying object, it gets dropped deterministically. Unless you have cycles, in which case the objects are not dropped. And then scanning for cyclic objects almost certainly takes place at a non-deterministic time, or never at all (and the memory is just leaked). > Garbage collection generally refers to more complex systems that periodically identify and free unused objects in a less deterministic manner. No. That's like saying "a car is a car; a vehicle is anything other than a car". No, GC encompasses reference counting, and GC can be deterministic or non-deterministic (asynchronous). | |
| ▲ | jcelerier 3 days ago | parent | prev | next [-] | | > Arc isn't really garbage collection. It's like a reference counted smart pointer like C++ has shared_ptr. In c++ land this is very often called garbage collection too | |
| ▲ | jandrewrogers 3 days ago | parent | prev [-] | | This still raises the question of why Arc is purportedly used so heavily. I've written 100s of kLoC of modern systems C++ and never needed std::shared_ptr. | | |
| ▲ | pjmlp 3 days ago | parent [-] | | For the same reason Unreal uses one. Large scale teams always get pointer ownership wrong. Project Zero has enough examples. |
|
|
|
| ▲ | steveklabnik 3 days ago | parent | prev | next [-] |
| The only time I use Arc is wrapping contexts for web handlers. That doesn’t mean there aren’t other legitimate use cases, but “all the time” is not representative of the code I read or write, personally. |
|
| ▲ | kibwen 3 days ago | parent | prev | next [-] |
| > _seasoned_ Rust developers just sprinkle `Arc` all over the place No, this couldn't be further from the truth. |
| |
| ▲ | 9rx 3 days ago | parent [-] | | If they aren't sprinkling `Arc` all over, what are they seasoning with instead? | | |
|
|
| ▲ | swiftcoder 3 days ago | parent | prev | next [-] |
| I don't think there are any Arcs in my codebase (apart from a couple of regrettable ones needed to interface with Javascript callbacks in WASM - this is more a WASM problem than a rust problem). |
| |
| ▲ | ChadNauseam 3 days ago | parent [-] | | haha, I was about to leave the exact same comment. how are you finding wasm? I’ve been feeling like rust+react is my new favorite tech stack | | |
| ▲ | swiftcoder 2 days ago | parent [-] | | I love it, but I'm mainly using it for webgl/webgpu stuff, so relatively little interaction with the DOM - I feel like DOM interaction is still kind of painful through rust/wasm |
|
|
|
| ▲ | amw-zero 3 days ago | parent | prev | next [-] |
| How often are you writing non-trivial data structures? |
|
| ▲ | kannanvijayan 3 days ago | parent | prev | next [-] |
| Not sure how seasoned I am, but I reject any comparison to a cooking utensil! I do find myself running into lifetime and borrow-checker issues much less these days when writing larger programs in Rust. And while your comment is a bit cheeky, I think it gets at something real. One of the implicit design mentalities that develops once you write Rust for a while is a good understanding of where to apply the `UnsafeCell`-related types, which includes `Arc` but also `Rc` and `RefCell` and `Cell`. These all relate to interior mutability, and there are many situations where plopping in the right one of these effectively resolves some design requirement. The other idiomatic thing that happens is that you implicitly begin structuring your abstract data layouts in terms of chunks of raw structured data and connections between them. This usually involves an indirection - i.e. you index into an array of things instead of holding a pointer to the thing. Lastly, where lifetimes do get involved, you tend to have a prior idea of what thing they annotate. The example in the article is a good case study of that. The author is parsing a `.notes` file and building some index of it. The text of the `.notes` file is the obvious lifetime anchor here. You would write your indexing logic with one lifetime 'src: `fn build_index<'src>(src: &'src str)` Internally to the indexing code, references to 'src-annotated things can generally be passed around freely, as their lifetimes all converge on it. Externally to the indexing code you'd build a string of the notes text, and pass a reference to that to the `build_index` function. For simple CLI programs, you tend not to really need anything more than this. It gets more hairy if you're looking at constructing complex object graphs with complex intermediate state, partial construction of sub-states, etc. Keeping track of state that's valid at some level, while temporarily broken at another level, is where it gets really annoying with multiple nested lifetimes and careful annotation required. But it was definitely a bit of a hair-pulling journey to get to my state of quasi-peace with Rust's borrow checker. |
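A sketch of the single-lifetime indexing approach described above. The `.notes` format is not specified in this thread, so the `#tag` convention and the shape of the returned index are assumptions made purely for illustration; only the `build_index<'src>(src: &'src str)` signature comes from the comment.

```rust
use std::collections::HashMap;

/// Maps each `#tag` to the lines that mention it.
/// Every key and value borrows from `src`, so nothing is copied and a single
/// lifetime `'src` covers the whole index.
fn build_index<'src>(src: &'src str) -> HashMap<&'src str, Vec<&'src str>> {
    let mut index: HashMap<&'src str, Vec<&'src str>> = HashMap::new();
    for line in src.lines() {
        for word in line.split_whitespace() {
            // Hypothetical convention: a word starting with '#' is a tag.
            if let Some(tag) = word.strip_prefix('#') {
                if !tag.is_empty() {
                    index.entry(tag).or_default().push(line);
                }
            }
        }
    }
    index
}

fn main() {
    // The caller owns the text; the index only borrows from it.
    let text = String::from("buy milk #home\nfix build #work\ncall plumber #home");
    let index = build_index(&text);
    println!("{:?}", index.get("home"));
}
```

The borrow checker then enforces the one rule that matters here: `index` cannot outlive `text`.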
|
| ▲ | ViewTrick1002 2 days ago | parent | prev | next [-] |
| > My experience is that what makes your statement true, is that _seasoned_ Rust developers just sprinkle `Arc` all over the place, thus effectively switching to automatic garbage collection. How else would you safely share data in multi-threaded code? Which is the only reason to use Atomic reference counts. |
|
| ▲ | levkk 3 days ago | parent | prev | next [-] |
| Definitely not. Arc is for immutable (or sync, e.g. atomics, mutexes) data, while borrow checker protects against concurrent mutations. I think you meant Arc<Mutex<T>> everywhere, but that code smells immediately and seasoned Rust devs don't do that. |
|
| ▲ | dev_l1x_be 3 days ago | parent | prev | next [-] |
| I am not sure this is true. Maybe with shared async access it is. I rarely use Arc. |
|
| ▲ | bryanlarsen 3 days ago | parent | prev [-] |
| Or more likely, sprinkle .clone() liberally and Arc or an Arc wrapper (ArcSwap, tokio's watch channels, etc) strategically. |