zahlman 3 hours ago

> Yes, that's what makes it not zero-copy.

Yeah, so you'd have to pass around the `BytesIO` instead.

I know that zero-copy doesn't ordinarily mean what I described, but that seemed to be how TFA was using it, based on the logic in the rest of the sentence.

woodruffw 2 hours ago | parent [-]

> Yeah, so you'd have to pass around the `BytesIO` instead.

That wouldn’t be zero-copy either: BytesIO is an I/O abstraction over a buffer, so it intentionally masks the “lifetime” of the original buffer. In effect, reading from the BytesIO creates new copies of the underlying data by design, in new `bytes` objects.

(This is actually a great capsule example of why zero-copy design is difficult in Python: the Pythonic thing to do is to make lots of bytes/string/rich objects as you parse, each of which owns its data, which in turn means copies everywhere.)
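To illustrate the point about reads producing copies, here is a minimal sketch. The sample data and variable names are made up for illustration; it only assumes CPython's standard `io.BytesIO` and `memoryview`.

```python
import io

data = bytearray(b"header:payload")   # hypothetical input buffer
stream = io.BytesIO(data)             # the stream holds its own copy of the contents

chunk = stream.read(6)                # a new bytes object that owns its data
data[0] = ord("H")                    # mutate the original buffer afterwards

# The bytes returned by read() do not see the change: they were copied out.
chunk_unchanged = (chunk == b"header")

view = memoryview(data)
window = view[7:]                     # slicing a memoryview does not copy
data[7] = ord("P")                    # mutate the original buffer again

# The view reflects the change because it shares memory with `data`.
window_contents = bytes(window)
```

The `memoryview` slice is the closer fit for the usual meaning of zero-copy, at the cost that `data` has to stay alive and unresized for as long as the view is in use.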

zahlman 2 hours ago | parent [-]

Fair. (You can call `.getbuffer()`, but you still have to keep the underlying `BytesIO` object "open" somehow.)
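A small sketch of that constraint, assuming only the documented behaviour of `BytesIO.getbuffer()`: while the returned view exists, the stream cannot be resized or closed. The sample bytes are arbitrary.

```python
import io

stream = io.BytesIO(b"abcdef")
view = stream.getbuffer()      # memoryview over the stream's buffer, no copy made

view[2:4] = b"56"              # writes through the view land in the stream
contents = stream.getvalue()   # getvalue() itself returns a copy

try:
    stream.close()             # refused while the view is still exported
    closed_while_exported = True
except BufferError:
    closed_while_exported = False

view.release()                 # drop the export first...
stream.close()                 # ...then closing succeeds
```

So the view can be passed around without copying, but whoever holds it is implicitly keeping the `BytesIO` alive and fixed in size.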

I'm not convinced this is going to bottleneck things, though.

(On the flip side, I guess the OS is likely to cache any disk write in memory anyway.)