Remix.run Logo
Veserv 9 hours ago

You will generally only stall indefinitely if you are waiting for new data. So, you will actually handle almost every use case if your blocking read/wait also respects the timeout and does the flush on your behalf. Basically, do it synchronously at the top of your event loop and you will handle almost every case.

You could also relax the guarantee and set a timeout that is only checked during your next write. This still allows unbounded latency, but as long as you do one more write it will flush.

If neither of these work, then your program issues a write and then gets into a unbounded or unreasonably long loop/computation. At which point you can manually flush what is likely the last write your program is every going to make which would be a trivial overhead since that is a single write compared to a ridiculously long computation. That or you probably have bigger problems.

toast0 6 hours ago | parent [-]

Yeah, these are all fine to do, but a libc can really only do the middle one. And then, at some cost.

If you're already using an event loop library, I think it's reasonable for that to manage flushing outputs while waiting for reads, but I don't think any of the utilities in this example do; maybe tcpdump does, but I don't know why grep would.

Veserv an hour ago | parent [-]

Sure, but the article is talking about grep, not write() or libc implementations.

grep buffers writes with no flush timeout resulting in the problem in the article.

grep should probably not suffer from the problem and can use a write primitive/library/design that avoids such problems with relatively minimal extra complexity and dependencies while retaining the performance advantages of userspace buffering.

Most programs (that are minimizing dependencies so can not pull in a large framework, like grep or other simple utilities) would benefit from using such modestly more complex primitives instead of bare buffered writes/reads. Such primitives are relatively easy to use and understand, being largely a drop-in replacement in most common use cases, and resolve most remaining problems with buffered accesses.

Essentially, this sort of primitive should be your default and you should only reach for lower level primitives in your application if you have a good reason for it and understand the problems the layers were designed to solve.