Trying to get error backtraces in Rust libraries right

▲ Trying to get error backtraces in Rust libraries right (iroh.computer)

78 points by emschwartz 7 days ago | 62 comments

▲ conaclos 4 days ago | parent | next [-]

The following articles may also be of interest to the audience:

- The definitive guide to error handling in Rust [0]

- Error Handling for Large Rust Projects - Best Practice in GreptimeDB [1]

- Designing error types in Rust [2]

- Modular Errors in Rust [3]

[0]: https://www.howtocodeit.com/articles/the-definitive-guide-to...

[1]: https://greptime.com/blogs/2024-05-07-error-rust

[2]: https://mmapped.blog/posts/12-rust-error-handling

[3]: https://sabrinajewson.org/blog/errors

▲ nromiun 4 days ago | parent | prev | next [-]

I remember people being so excited to use error values instead of exceptions in mainstream languages. But this libraries-plus-error-values combo looks even more complex to me.

▲

photon_garden 3 days ago | parent | next [-]

Their code is more complex in some ways (for example, it’s verbose).

But in languages with exceptions, if you want to know how a function can fail, you have two options:

- Hope the documentation is correct (it isn’t)

- Read the body of the function and every function it calls

Reasonable people can disagree on the right approach here, but I know which I prefer.

▲

jmux 3 days ago | parent | next [-]

> Hope the documentation is correct (it isn’t)

real

compared to every exception-based language I’ve used, rust error handling is a dream. my one complaint is async, but tbh I don’t think exceptions would fare much better since things like the actor model just don’t really support error propagation in any meaningful way

▲

vbezhenar 3 days ago | parent | prev | next [-]

Every function can fail with StackOverflowError and you can't do anything about it.

Almost every function can fail with OutOfMemoryError and you can't do anything about it.

I've accepted that everything can fail. Just write code and expect it to throw. Write programs and expect them to abort.

I don't understand this obsession with error values. I remember when C++ designers claimed that exceptions provide faster code execution for the happy path, so even for a systems language they should be preferred. Go error handling is bad. Rust error handling is bad. C error handling is bad, but at least that's understandable.

▲

frumplestlatz 3 days ago | parent | next [-]

This is silly. We can avoid stack overflow by avoiding unbounded recursion.

In user-space, memory overcommit means that we will almost or literally never see an out of memory error.

In kernel space and other constrained environments, we can simply check for allocation failure and handle it accordingly.

This is a terrible argument for treating all code as potentially failing with any possible error condition.

▲

vbezhenar 2 days ago | parent [-]

> We can avoid stack overflow by avoiding unbounded recursion.

Only if you control the entire ecosystem, from application to every library you're ever going to use. And will not work for library code anyway.

> In user-space, memory overcommit means that we will almost or literally never see an out of memory error.

Have you ever deployed an application to production? Where it runs in the container, with memory limits. Or just good old ulimit. Or any language with VM or GC which provides explicit knobs to limit heap size (and these knobs are actually used by deployers).

> This is a terrible argument for treating all code as potentially failing with any possible error condition.

This is reality. Some programs can ignore it, hand-waving possibilities out, expecting that it won't happen and it's not your problem. That's one approach, I guess.

▲

frumplestlatz 2 days ago | parent [-]

> Only if you control the entire ecosystem, from application to every library you're ever going to use. And will not work for library code anyway.

Two edge cases existing is a terrible argument for creating a world in which any possible edge case must also be accounted for.

>> This is a terrible argument for treating all code as potentially failing with any possible error condition.

> This is reality. Some programs can ignore it, hand-waving possibilities out, expecting that it won't happen and it's not your problem. That's one approach, I guess.

No, it’s not reality. It’s the mess you create when you work in languages with untyped exception handling and with people that insist on writing code the way you suggest.

▲

whatevaa 2 days ago | parent | prev | next [-]

Rust and Go have panics for these. Most of the time there is nothing the application can do by itself: either there is a bug in the application or there is actually a shortage of memory and only the OS can do anything about it.

I'm not talking about embedded or kernels. Different stories.

▲

malkia 3 days ago | parent | prev | next [-]

^^^ - This - my recent one, came to the realization that dealing with memory mapped files is much harder without exceptions (not that exceptions make it easier, but at least possible).

Why? Let's say you've opened a memory mapped file, you've got pointer, and hand this pointer down to some library - "Here work there" - the library thinks - oh, it's normal memory - fine! And then - physical block error happens (whether it's Windows, OSX, Linux, etc.) - and now you have to handle this from... a rather large distance - where "error code" handling is not enough - and you have to use signal handling with SIGxxx or Windows SEH handling, or whatever the OS provides

And then you have languages like GoLang/Rust/others where this is a pain point (yes you can handle it), but how well?

If you look in ReactOS the code is full of `__try/__except` - https://github.com/search?q=repo%3Areactos%2Freactos+_SEH2_T... - because user provided memory HAS to be checked - you don't want an exception happening in the kernel while reading bad user memory.

So it's all good and fine, until you have to face this problem... Or decide to not use mmap files (is this even possible?).

Okay, I know it's just a silly little thing I'm pointing here - but I don't know of any good solution off hand...

And even handling this in C/C++ with all SEH capabilities - it still sucks...

▲

vlovich123 3 days ago | parent [-]

If the drive fails and you get a signal it’s perfectly valid to just let the default signal handler crash your process. Signals by definition are delivered non-locally, asynchronously, and there’s generally nothing to try/catch or recover. So handling this in Rust is no different than any other language because these kinds of failures never result in locally handleable errors.

▲

malkia 20 hours ago | parent [-]

That's not true - you can handle this pretty well with exceptions (yes it's nagging that you have to add them, but doable)... Not so much without.

▲

tialaramex 3 days ago | parent | prev [-]

> Every function can fail with StackOverflowError and you can't do anything about it.

> Almost every function can fail with OutOfMemoryError and you can't do anything about it.

In fact we can - though rarely do - prove software does not have either of these mistakes. We can bound stack usage via analysis, we usually don't but it's possible.

And avoiding OOM is such a widespread concern that Rust-for-Linux deliberately makes all allocating calls explicitly fallible, or offers strategies like Vec::push_within_capacity, a method which, if it succeeds, pushes the object into the collection, but if the collection is full, rather than allocate (which might fail), gives back the object - no, you take it.
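
As a rough sketch of the two strategies mentioned above, in stable Rust: `Vec::try_reserve` is a real standard-library method that reports allocation failure as a value, while the `push_within_capacity` function below is a hypothetical stand-in written for illustration (the method of that name is not available on stable `std::vec::Vec` at the time of writing). `push_fallible` is likewise an invented helper name.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: push only if spare capacity already exists,
/// otherwise hand the value back to the caller instead of allocating.
fn push_within_capacity<T>(v: &mut Vec<T>, value: T) -> Result<(), T> {
    if v.len() < v.capacity() {
        v.push(value); // spare capacity exists, so this push does not reallocate
        Ok(())
    } else {
        Err(value) // "no, you take it"
    }
}

/// Hypothetical helper: grow explicitly and fallibly, then push.
fn push_fallible<T>(v: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    v.try_reserve(1)?; // allocation failure surfaces here as an Err value
    v.push(value);
    Ok(())
}

fn main() {
    // Vec::new() does not allocate, so a Vec<u32> starts with capacity 0.
    let mut v: Vec<u32> = Vec::new();

    // No spare capacity: the value comes straight back.
    assert_eq!(push_within_capacity(&mut v, 1), Err(1));

    // Explicitly fallible growth, then the push succeeds.
    push_fallible(&mut v, 1).expect("allocation failed");
    println!("{:?}", v);
}
```

Either way, running out of memory becomes something the caller sees in the return type rather than an abort.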

▲

malkia 3 days ago | parent | prev | next [-]

Or have checked exceptions (Java). Granted this comes with big downer... If you need to extend functionality and new (updated) code has to throw new exception, your method signature changes :(

But it's the best method I know so far.

▲

baq 3 days ago | parent [-]

Checked exceptions are not very different from the Result type in that regard TBH.

▲

malkia 3 days ago | parent [-]

As in the caller would be forced/know what to handle in advance? Is this really the case (I'm not sure) - e.g. you call something and it returns Result<T, E> but does it really enforce it... What about errors (results) that came from deeper?

I'm not trying to defend exceptions, nor checked ones, just trying to point out that I don't think they are the same.

For all intents and purposes I really liked Common Lisp's exception handling, in my opinion the perfect one ("restartable"), but it also comes with lots of runtime efficiency (and possibly other) cost (interoperability? safety (nowadays)...) - but it was a valiant effort to make the programmer better - e.g. iterate while developing, and while it's throwing exceptions at you, you keep writing/updating the code (while it's running), etc - probably not something modern day SRE/devops would want (heh "who thought live updating of code of running system is great idea :)" - I still do, but I can see someone from devops frowning - "This is not the version I've pushed"...)

▲

vlovich123 3 days ago | parent [-]

> but does it really enforce it

It warns you if you ignore handling a Result because the type is annotated with must_use (which can be a compile error in CI if you choose to enforce 0 warnings). Not that this is true with try/catch - no one forces you to actually do anything with the error.

> What about errors (results) that came from deeper?

Same as with exceptions - either you handle it or propagate it up or ignore it.
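
A small sketch of the three options described above (handle, propagate, or explicitly ignore). The function names are made up for illustration; the `unused_must_use` lint is the real rustc lint behind the warning, and it can be promoted to an error with `#![deny(unused_must_use)]` or by building with `-D warnings`.

```rust
use std::num::ParseIntError;

// Hypothetical fallible function used by the examples below.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse::<u16>()
}

// Propagate: `?` returns the error to our caller unchanged.
fn port_or_propagate(s: &str) -> Result<u16, ParseIntError> {
    let port = parse_port(s)?;
    Ok(port)
}

// Handle: decide locally what a failure means.
fn port_or_default(s: &str) -> u16 {
    match parse_port(s) {
        Ok(port) => port,
        Err(_) => 8080,
    }
}

fn main() {
    // Writing `parse_port("80");` as a bare statement would trigger the
    // `unused_must_use` warning. Ignoring a Result has to be spelled out:
    let _ = parse_port("not a port");

    assert_eq!(port_or_default("oops"), 8080);
    assert!(port_or_propagate("443").is_ok());
}
```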

▲

cwillu 3 days ago | parent | prev | next [-]

And with error values, you also need to hope the documentation for what the error means is correct (it isn't), and read the body of the function and every function it calls to see where the error value actually came from and what it actually means. It's the same problem, but you get to solve a bonus logic puzzle trying to figure out where the error came from.

▲

nromiun 3 days ago | parent | prev [-]

Or catch the top level function and see every exception in your project? Tell me which language does not have a top level main function?

▲

zaphar 3 days ago | parent | next [-]

This is the "I don't care what fails nor do I wish to handle them" option. Which for some use cases may be fine. It does mean that you don't know what kinds of failures are happening nor what the proper response to them is, though. Like it or not errors are part of your domain and properly modeling them as best you can is a part of the job. Catching at the top level still means some percentage of your users are experiencing a really bad day because you didn't know that error could happen. Error modeling reduces that at the expense of developer time.

▲

dingi 3 days ago | parent | next [-]

Top-level error handling doesn't mean losing error details. When done well, it uses specialized exceptions and a catch–wrap–rethrow strategy to preserve stack traces and add context. Centralizing errors provides consistency, ensures all failures pass through a common pipeline for logging or user messaging, and makes policies easier to evolve without scattering handling logic across the codebase. Domain-level error modeling is still valuable where precision matters, but robust top-level handling complements it by catching the unexpected and reducing unhandled failures, striking a balance between developer effort and user experience.

▲

zaphar a day ago | parent [-]

If you are actually using specialized exceptions and a catch-wrap-rethrow strategy then you are doing error modeling and you aren't "Just letting them bubble up to the top" which is basically making my point for me.

▲

nromiun 3 days ago | parent | prev [-]

"I don't care what fails" means not catching any exception/error. My comment was the exact opposite of the idea. Top level function will bubble up every exception, no matter how deep or from which module.

▲

_flux 3 days ago | parent [-]

But the case when you actually learn what errors can happen is when your users start complaining about them, not because you somehow knew about it beforehand.

Or maybe you have 100% path coverage in your test..

▲

nromiun 3 days ago | parent [-]

So you are talking about bugs that don't get caught in development? That happens in Rust as well. Borrow checker does not catch every bug or error. A random module you are using could throw a panic and you would not know with Rust (or any language for that matter), until your users trigger those bugs.

▲

_flux 3 days ago | parent [-]

It sure does happen. So should we simply give up? Or should we aspire to have tools to reduce those bugs?

Knowing what kind of errors can occur is one of those tools.

▲

johannes1234321 3 days ago | parent | prev [-]

Even better: just let it crash and get a core dump with full context information rather than some log missing information.

But often some "expected" errors can be handled in some way better (retry, ask user, use alternate approach, ...)

▲

csomar 4 days ago | parent | prev [-]

It is not more complex. It is a complex problem and raising exceptions is akin to just giving up (vs. properly handling and reporting the errors).

▲

nromiun 4 days ago | parent [-]

That is akin to saying using a GC is giving up on safe memory allocation, and the borrow checker is the only solution to a complex problem. Exceptions are handled gracefully in real world projects as well, otherwise Python, Java etc would have died a long time ago.

▲

tcfhgj 3 days ago | parent | next [-]

Maybe you are not giving that up, but you are giving up doing memory management "properly", i.e. use more memory and CPU time than necessary for convenience.

▲

nromiun 3 days ago | parent [-]

Like writing only in assembly is "proper" programming? Using more memory and CPU time than necessary for convenience?

▲

tcfhgj 3 days ago | parent [-]

systems programming languages compile right down to machine code

▲

nromiun 3 days ago | parent [-]

Both borrow checker and GC use malloc/free internally as well. According to you there is no difference between the two.

▲

tcfhgj 3 days ago | parent [-]

yes, there is: GC will not use stack allocation and it will add another layer of memory management resulting in significant memory overhead (runtime overhead)

▲

nromiun 3 days ago | parent [-]

Of course they do. For example C#, D, Nim etc all use stack allocation with their GC.

Not to mention Rust also allocates dynamic objects on the heap. So not sure what is your point.

▲

tcfhgj 3 days ago | parent [-]

The language may allow it, but it's not GC managed then. My point is runtime overhead.

In C#, structs and their refs (including a simple borrow checker to detect invalid ref use) were introduced to escape GC management and reduce its runtime impact on the programs.

▲

csullivannet 3 days ago | parent | prev [-]

Maybe this is my Golang dev leaking, but I intuitively thought that `try: / except:` in Python is essentially the same thing as `if err != nil`, just my IDE doesn't scream at me if I don't catch them.

▲ b_e_n_t_o_n 3 days ago | parent | prev | next [-]

Idk, I'd take

   if err != nil {
      return nil, err
   }

over any of this :)

▲

throwawayqqq11 2 days ago | parent [-]

And now you introduce null and err checks all over your code base. Rust tries to do this ergonomically with the ?-operator, which is not that easy in complex cases but reduces your cognitive load tremendously when possible.
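
For comparison with the Go snippet above, a minimal sketch of what the ?-operator does: on `Err` it returns early, converting the error through `From` when the types differ. `ConfigError` and `read_retries` are hypothetical names chosen for this example.

```rust
use std::num::ParseIntError;

// Hypothetical error type for this example.
#[derive(Debug)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

// `?` uses this impl to convert a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn read_retries(raw: Option<&str>) -> Result<u32, ConfigError> {
    // Roughly `if raw == nil { return 0, ErrMissing }` in Go terms.
    let text = raw.ok_or(ConfigError::Missing)?;
    // Roughly `if err != nil { return 0, err }`, plus the From conversion.
    let n = text.parse::<u32>()?;
    Ok(n)
}

fn main() {
    match read_retries(Some("three")) {
        Ok(n) => println!("retries = {n}"),
        Err(ConfigError::Missing) => println!("no value configured"),
        Err(ConfigError::BadNumber(cause)) => println!("bad number: {cause}"),
    }
}
```

The "complex cases" are mostly the ones where no single `From` conversion fits and each call site needs its own `map_err`.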

▲ AstralStorm 4 days ago | parent | prev | next [-]

More on how best practices in Rust are not documented anywhere but in a bunch of unfindable blog posts. :(

And even then, there's no guarantee what you find is not outdated.

▲

tialaramex 4 days ago | parent | next [-]

I'm not sure how this idea makes coherent sense. Best practices are going to be determined by users, so even in a language where there's a BDFL or some ludicrous committee flying to Hawaii to discuss everything in person a central "Best practice" document is not actually best practices it's just somebody's opinion but with a rubber stamp.

A central dogma approach just means you need more doublethink, the "best practices" still gradually change but now you're pretending they've always been this way even as they shift beneath your feet.

▲

tayo42 3 days ago | parent | next [-]

Something like pep8 is useful. I get what poster means. Every time I come back to rust I find it hard to know what the latest thing and way to do something is.

▲

m11a 4 days ago | parent | prev [-]

Think you need a healthy mix?

eg: if you want distributed systems, esp Java-style, https://www.martinfowler.com is a pretty handy reference. (similar for other areas of SWE).

It’s nice to have a single resource that documents a bunch of non-obvious but still-generally-accepted engineering practices relevant to a domain. It’s of course an opinion, but is reasonably uncontroversial. Or at least, you won’t go too wrong while you develop your own taste in said domain.

▲

m11a 4 days ago | parent | prev | next [-]

For Rust, there are a few blogs that lean more on fundamentals and language design (like https://without.boats/blog, https://smallcultfollowing.com/babysteps).

For misc Rust engineering, like the OP, I agree it’s quite scattered. I personally like to save good ones in my Feedbin as I encounter them

▲

csomar 4 days ago | parent | prev | next [-]

Rust has the Error type. Libraries are add-ons and not part of the language.

▲

tialaramex 3 days ago | parent [-]

Well, Rust's standard library provides two types named Error, their full names are core::fmt::Error (aka std::fmt::Error) and std::io::Error

But, neither of these is intended as "the" error type - unless you're a formatter or an I/O abstraction they're not the error types you want.

Rust provides a trait core::error::Error (aka std::error::Error) and that might be what you're thinking about. It's recommended that your own private error type or types should implement this trait but nothing requires this in most contexts.
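
A minimal sketch of a private error type implementing the std::error::Error trait mentioned above. `PortError` and `parse_port` are hypothetical names; the trait, its `Debug + Display` supertraits, and the optional `source()` method are the real standard-library API.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical library error type.
#[derive(Debug)]
struct PortError {
    input: String,
    cause: ParseIntError,
}

// The Error trait requires Debug (derived above) and Display.
impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid port {:?}", self.input)
    }
}

impl Error for PortError {
    // Optional: expose the underlying cause so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

fn parse_port(input: &str) -> Result<u16, PortError> {
    input.parse::<u16>().map_err(|cause| PortError {
        input: input.to_string(),
        cause,
    })
}

fn main() {
    let err = parse_port("eighty").unwrap_err();

    // Print the error followed by each cause in its source chain.
    let mut current: Option<&dyn Error> = Some(&err);
    while let Some(e) = current {
        println!("{e}");
        current = e.source();
    }
}
```

Nothing forces `PortError` to implement the trait, but doing so lets it be boxed as `Box<dyn Error>` and chained by callers.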

▲

imtringued 4 days ago | parent | prev [-]

This has very little to do with Rust.

The entire software development industry has slept on error handling without exceptions and Rust developers are the first ones to actually start addressing the problems.

This problem should have been solved decades ago by C/C++ developers.

▲

baq 3 days ago | parent [-]

They are not first by any means, but the popular languages definitely focus on the happy path with varying but consistently insufficient degrees of enforcement of handling deviations… because that’s easy to iterate on. Java e.g. would benefit from a strict null mode, but the legions of half baked ‘engineers’ wouldn’t comprehend how to write software when you can’t initialize a reference to null and only set it later.

▲ quotemstr 4 days ago | parent | prev | next [-]

And we slowly circle back to what Rust should have done all along: exceptions with Java/Python-style causal chaining.

I wonder how loudly "the community" would scream at me if I published something that just used panics for all error reporting.

▲

dwattttt 4 days ago | parent | next [-]

I've been following Raymond Chen's recent series on writing a tracking C++ pointer class with interest.

Most of the articles start with "to fix the mistake we showed at the end of the last article", and end with "but now we've broken something else".

Needing to keep track of where exceptions can occur, so that you don't leave an operation half committed, sounds especially nasty: https://devblogs.microsoft.com/oldnewthing/20250827-00/?p=11...

▲

nromiun 4 days ago | parent | next [-]

A lot of things are broken in C++. Like coroutines, exceptions, parallel execution (std::execution) etc. That does not mean the core ideas are bad and we should stop using them in every language.

▲

tialaramex 3 days ago | parent [-]

I suspect too much is broken (well, I'd say more clearly "crap") in C++ to be sure whether any particular core ideas hold up based on that whole language.

I'm particularly mindful of C++ 26 Erroneous Behaviour for initialization. This idea was introduced for the forthcoming C++ 26 language version, it says that although just making a variable `int k;` and then later taking its value is an error, it's no longer Undefined Behaviour, the compiler shall arrange that it has some specific value. This is a meaningful improvement over the C++ status quo.

But, that doesn't mean the core idea is actually good. It's bad to do this in your language, they didn't have any better choice for C++ for historical reasons so this was the least bad option.

▲

quotemstr 3 days ago | parent | prev | next [-]

You still have to be exception safe in Rust, you know. What do you think a panic is? Rust really has the worst of both worlds.

▲

dwattttt 3 days ago | parent [-]

Agreed, and I wish there were more emphasis on ensuring panic free sections of Rust code.
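
To illustrate the exception-safety point, a small sketch of a panic leaving an operation half committed. `Account` and `deposit` are invented for this example; `std::panic::catch_unwind` and `AssertUnwindSafe` are real standard-library items, and catching only works with the default unwinding panic strategy, not with `panic = "abort"`.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical state with two fields that are meant to change together.
#[derive(Debug, Default)]
struct Account {
    balance: i64,
    log_entries: u32,
}

// Hypothetical two-step update that can panic between the steps.
fn deposit(acct: &mut Account, amount: i64) {
    acct.balance += amount; // step 1 is applied
    if amount > 1_000 {
        panic!("audit hook rejected the deposit");
    }
    acct.log_entries += 1; // step 2 is skipped when the panic unwinds
}

fn main() {
    let mut acct = Account::default();

    // AssertUnwindSafe is the opt-in saying "I will cope with broken state".
    let result = panic::catch_unwind(AssertUnwindSafe(|| deposit(&mut acct, 5_000)));

    assert!(result.is_err());
    // The balance changed but the log did not: a half-committed operation.
    assert_eq!(acct.balance, 5_000);
    assert_eq!(acct.log_entries, 0);
    println!("{:?}", acct);
}
```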

▲

lenkite 3 days ago | parent | prev [-]

Yeah, well C++ built a faulty bridge and screwed the pooch for future language designers who now all say "bridges are harmful".

▲

ninkendo 4 days ago | parent | prev [-]

Except the java/python style with unchecked exceptions means an exception can happen at any time, there's no way to know in advance. This is what Rust is trying to avoid with the errors-as-values approach.

It has its drawbacks, yes, but I'd never go back to the wild-west YOLO approach of unchecked exceptions, personally.

▲

zaphar 3 days ago | parent [-]

To be fair, Rust could have done CheckedExceptions like Java has but no one uses. The problematic version is the RuntimeException. I think the real problem was that when Rust conceived of Result they didn't constrain the problem to just error handling and made it a little bit too much "anything goes". Which means that trying to shoehorn backtraces in after the fact with `?` and `try_into` is now hard. There could have been a world where `Result::Err` was actually a wrapper type that specified an optional source error for backtracing and the generic type was embedded in that instead. It would have been less flexible but it would have made proper backtraces more tractable.
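
One possible reading of the wrapper idea above, as a user-level sketch. This is not how `Result` works and not the article's design; `Traced`, `with_source`, and `parse_retries` are hypothetical names. `std::backtrace::Backtrace` is the real standard-library type, and `Backtrace::capture()` only records frames when `RUST_BACKTRACE` or `RUST_LIB_BACKTRACE` enables it.

```rust
use std::backtrace::Backtrace;
use std::error::Error;
use std::fmt;

// Hypothetical wrapper: the domain error plus an optional source and a backtrace.
#[derive(Debug)]
struct Traced<E> {
    error: E,
    source: Option<Box<dyn Error + Send + Sync + 'static>>,
    backtrace: Backtrace,
}

impl<E> Traced<E> {
    fn new(error: E) -> Self {
        // Captured where the error is created; disabled unless the env vars are set.
        Traced { error, source: None, backtrace: Backtrace::capture() }
    }

    fn with_source(mut self, source: impl Error + Send + Sync + 'static) -> Self {
        self.source = Some(Box::new(source));
        self
    }
}

impl<E: fmt::Display> fmt::Display for Traced<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.error)
    }
}

impl<E: fmt::Debug + fmt::Display> Error for Traced<E> {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}

// Hypothetical caller: wraps a lower-level error and keeps it as the source.
fn parse_retries(text: &str) -> Result<u32, Traced<&'static str>> {
    text.parse::<u32>()
        .map_err(|e| Traced::new("retries must be a number").with_source(e))
}

fn main() {
    if let Err(e) = parse_retries("many") {
        eprintln!("error: {e}");
        if let Some(cause) = e.source() {
            eprintln!("caused by: {cause}");
        }
        eprintln!("{}", e.backtrace);
    }
}
```

The trade-off is the one described in the comment: every error pays for the wrapper, in exchange for a uniform place to hang the source and backtrace.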

▲

ninkendo 3 days ago | parent [-]

Is there a language that has proper exclusively checked exceptions? That is, not just syntax sugar around checking an error value (à la Swift), but actual “the processor signals an exception” semantics, but all exceptions are still enforced to be handled-or-passed by the compiler?

Honest question, because I can’t think of any. I can see it being advantageous to have checked-only exceptions but there has to be a good reason why it’s so rare-to-never that we see it.

I’m not sure how else you’d get the holy grail, which I’d define as:

1. The compiler enforces that you either handle an exception or pass it to the caller

2. Accurate and fine-grained stack traces on an error (built-in, not opt-in from some error library du jour)

3. (ideally) no runtime cost for non-exception paths (no branches checking for errors, exceptions are real hardware traps)

C++ has 2 and 3, Java has only 2 (because RuntimeException exists), Rust has only 1. I’d love a language with 1 and 2, but all 3 would be great.

▲

zaphar 2 days ago | parent | next [-]

I can't think of any either. A sibling commenter suggests maybe Eiffel but I haven't really tried or looked at that language so I don't know if it's true.

I think having all 3 would be great but if I can only choose one of them I personally prefer #1.

▲

malkia 3 days ago | parent | prev [-]

I've heard Eiffel has them better, but haven't used it. And then Koka too, but I have zero experience... Someone might shine light here about this...

▲ sbt 4 days ago | parent | prev [-]

> The core issue is that Rust still hasn't stabilized backtrace propagation on errors.

I would actually strengthen this to: «The core issue with Rust is that Rust still hasn’t stabilized _»

▲

tialaramex 4 days ago | parent [-]

Really? Premature stabilization is a much more recognisable problem. This is after all quietly why there's so much Rust adoption. Safety is nice, but perf is $$$.

C++ had this unhealthy commitment to stabilization and it meant that their "Don't pay for what you don't use" doctrine has always had so many asterisks it's practically Zalgo text or like you're reading an advert for prescription medicine. You're paying for that stabilization, everywhere, all the time, and too bad if you didn't need it.

Unwinding stabilizations you regret is a lot more work even in Rust. Consider the Range types: landing improved Range types is considerable work, and then they'll need an Edition to change which types you get for the syntax sugar like 1..=10

Or an example which happened a long time ago, the associated constants. There are whole stdlib sub-packages in Rust which exist either primarily or entirely as a place for constants of a type. std::f32::MAX is just f32::MAX, but the associated constant f32::MAX only finally stabilized in 1.43, so older code uses std::f32::MAX. Nevertheless, if you learned to write std::f32::MAX you may only finally get deprecation messages years from now, and they're stuck with keeping the alias forever of course because it's stable.