Joker_vD 7 hours ago

> Obviously I'm gonna be biased, but I'm pretty sure my version is also objectively superior:

> - I can easily make mine from theirs

That... doesn't make it superior? On the contrary, theirs can't be easily made out of yours, except by either returning trivial 1-byte chunks, or by arbitrary buffering. So their proposal is a superior primitive.

On the whole, I/O-oriented iterators should probably return chunks of T; otherwise you get buffer bloat for free. readv/writev were introduced for a reason, you know.
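To illustrate the asymmetry this comment is pointing at (my own sketch, not from the thread; `flatten` is a hypothetical name): going from a chunked async iterable to a per-item one is trivial, while the reverse requires buffering or degenerate 1-byte chunks.

```javascript
// Deriving the per-item view from the chunked one is a one-liner:
// each chunk is just spread out item by item.
async function* flatten(chunks) {
  for await (const chunk of chunks) {
    yield* chunk; // no buffering, no policy decisions needed
  }
}
```

Going the other way, a `chunk(items, size)` adapter has to pick a buffering policy (how many items to accumulate, when to flush), which is exactly the arbitrariness the comment objects to.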

robby_w_g 6 hours ago | parent | next [-]

> So their proposal is a superior primitive.

This lines up with my thinking. The proposal should give us a building block in the form of the primitive. I would expect the grandparent comment’s API to be provided in a library built on top of a language level primitive.

conartist6 5 hours ago | parent [-]

How would you then deal with a stream of UTF-8 code points? They won't fit in a Uint8Array, and there will be too many of them for async iterators to perform well: you'll hit the promise-thrashing issues discussed in the blog post.

Joker_vD 5 hours ago | parent [-]

No, you'll just need to (potentially) keep the last 1-3 bytes of the previous chunk after each iteration. Come on, restartable UTF-8 APIs have been around for more than 30 years.

conartist6 5 hours ago | parent [-]

But those code points were just inputs to another stream transformation that turns a stream of code points into a stream of graphemes. Rapidly your advice turns into "just do everything in one giant transformation" and that loses the benefits of streams, which are meant to be highly composable to create efficient, multi-step transformation pipelines.

idle_zealot 5 hours ago | parent | next [-]

What's stopping you from implementing a stream transformation that reads the raw stream like a parser, outputting a grapheme or whatever unit you want only when it knows it's done reading it from the input?
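One way to do what this comment describes (my own illustration; `lines` and the newline delimiter are stand-ins for whatever unit you're parsing): a transform that consumes raw chunks like a parser and only emits a unit once it has seen the input that completes it.

```javascript
// Parser-style stream transform: buffer the incomplete unit between pulls,
// emit only complete units (here, newline-terminated lines).
async function* lines(chunks) {
  let pending = ""; // the unit we haven't finished reading yet
  for await (const chunk of chunks) {
    pending += chunk;
    let idx;
    while ((idx = pending.indexOf("\n")) !== -1) {
      yield pending.slice(0, idx); // a complete unit
      pending = pending.slice(idx + 1);
    }
  }
  if (pending) yield pending; // leftover data at end of input
}
```

The same shape works for graphemes or code points: only the "is this unit complete yet?" predicate changes.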

Joker_vD 5 hours ago | parent | prev [-]

No, it doesn't turn into that. The leftover bytes plus a flag are kept inside the stream generator that transforms bytes into code points; every time you pull it, those bytes are used as the initial accumulator in a fold that takes a chunk of bytes and yields a chunk of code points plus the updated accumulator. You don't need to inline it all into one giant transform.

Come on, this is how (mature) parser-combinator libraries work. The only slightly tricky part here is detecting leftover data in the pipeline.
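A minimal sketch of the kind of generator this comment describes (my own illustration; `codePointChunks` is a hypothetical name). In JavaScript the restartable-UTF-8 state can even be delegated to `TextDecoder` in streaming mode, which holds the incomplete trailing bytes between pulls:

```javascript
// Transform an async iterable of Uint8Array chunks into chunks of code points.
// The decoder carries the leftover bytes of a split UTF-8 sequence internally.
async function* codePointChunks(byteChunks) {
  const decoder = new TextDecoder("utf-8");
  for await (const chunk of byteChunks) {
    const text = decoder.decode(chunk, { stream: true }); // buffers partial sequences
    if (text.length > 0) {
      // Array.from iterates by code point, so surrogate pairs stay intact.
      yield Array.from(text, (ch) => ch.codePointAt(0));
    }
  }
  const tail = decoder.decode(); // flush: surfaces any leftover data in the pipeline
  if (tail.length > 0) yield Array.from(tail, (ch) => ch.codePointAt(0));
}
```

Note that it still yields *chunks* of code points per pull, so the per-item async-iterator overhead discussed downthread never enters the picture.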

conartist6 4 hours ago | parent [-]

To quote the article:

> If you want to stream arbitrary JavaScript values, use async iterables directly

OK, so since code points are numbers larger than 8 bits, they're arbitrary JS values and we have to use async iterables directly. This is where the per-item overhead of an async iterable starts to strangle you: most of the actual work being done at that point is tearing down the call stack between each step of each iterator and then rebuilding it again so that the debugger has some kind of stack trace (if you're using for await...of loops to consume the iterables, that is).
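To make the overhead contrast concrete (my own sketch, hypothetical names): the per-item consumer goes through the promise machinery once per value, while the chunked consumer goes through it once per chunk and iterates synchronously inside.

```javascript
// One promise round-trip per item.
async function* perItem(n) {
  for (let i = 0; i < n; i++) yield i;
}

// One promise round-trip per chunk of `size` items.
async function* chunked(n, size) {
  for (let i = 0; i < n; i += size) {
    const chunk = [];
    for (let j = i; j < Math.min(i + size, n); j++) chunk.push(j);
    yield chunk;
  }
}

async function sumPerItem(iter) {
  let s = 0;
  for await (const v of iter) s += v; // async machinery on every item
  return s;
}

async function sumChunked(iter) {
  let s = 0;
  for await (const chunk of iter) {
    for (const v of chunk) s += v; // synchronous inner loop, no promises
  }
  return s;
}
```

Both compute the same result; the difference is how many times the event loop and call stack get involved per element.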

conartist6 7 hours ago | parent | prev [-]

As an abstraction, I would say it does make mine superior: it captures everything theirs can, plus more that theirs can't.

Plus, theirs involves the very concrete definition of an array, which in JS might have 100 prototype methods, each of them part of their API surface. I have one function in my API surface.
