| ▲ | Joker_vD 5 hours ago |
| No, you'll just need to (potentially) keep the last 1-2 bytes of the previous chunk after each iteration. Come on, restartable UTF-8 APIs has been around for more than 30 years. |
|
| ▲ | conartist6 5 hours ago | parent [-] |
| But those code points were just inputs to another stream transformation that turns a stream of code points into a stream of graphemes. Rapidly your advice turns into "just do everything in one giant transformation" and that loses the benefits of streams, which are meant to be highly composable to create efficient, multi-step transformation pipelines. |
| |
| ▲ | idle_zealot 5 hours ago | parent | next [-] | | What's stopping you from implementing a stream transformation that reads the raw stream like a parser, outputting a grapheme or whatever unit you want only when it knows it's done reading it from the input? | |
| ▲ | Joker_vD 5 hours ago | parent | prev [-] | | No, it doesn't turn into this. Those two bytes of leftovers plus a flag are kept inside the stream generator that transforms bytes into code points, every time you pull it those two bytes are used as an initial accumulator in the fold that takes the chunk of bytes and yield chunk of code points and the updated accumulator. You don't need to inline it all into one giant transform. Come on, it's how (mature libraries of) parser combinators work. The only slightly tricky part here is detecting leftover data in the pipeline. | | |
| ▲ | conartist6 4 hours ago | parent [-] | | To quote the article: > If you want to stream arbitrary JavaScript values, use async iterables directly OK, so we have to do this because code points are numbers larger than 8 bits, so they're arbitrary JS values and we have to use async iterables directly. This is where the amount of per-item overhead in an async iterable starts to strangle you because most of the actual work being done at that point is tearing down the call stack between each step of each iterator and then rebuilding it again so that the debugger has some kind of stack traces (if you're using for await of loops to consume the iterables that is). |
|
|