spyrja 7 days ago
Just as an example of what I am talking about, this is my current UTF-8 parser which I have been using for a few years now.
Not exactly "simple", is it? I am almost embarrassed to say that I thought I had read the spec right. But I was obviously wrong, and now I have to go back to the drawing board (or else find some other FOSS alternative written in C). It just frustrates me. I do appreciate the level of effort made to come up with an all-encompassing standard of sorts, but it just seems so unnecessarily complicated.
simonask 7 days ago
That's a reasonable implementation in my opinion. It's not that complicated. You're also apparently insisting on three-letter variable names, and are using a very primitive language to boot, so I don't think you're setting yourself up for "maintainability" here. Here's the implementation in the Rust standard library: https://doc.rust-lang.org/stable/src/core/str/validations.rs... It even includes an optimized fast path for ASCII, and it works at compile time as well.
danhau 2 days ago
I don't know what your code is doing exactly. For comparison, here's my utf8 decoder (for a single codepoint):
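For readers following along, a minimal sketch of a single-code-point decoder in C is given here. It is an illustration only and not danhau's code: the function name `utf8_decode_one` and its return convention (bytes consumed, or 0 on error) are assumptions made for this example. It rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from s[0..n).
 * Returns the number of bytes consumed (1..4) and stores the code point
 * in *out, or returns 0 if the input is empty, truncated, or malformed. */
size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
    size_t len;
    uint32_t cp, min;

    if (n == 0)
        return 0;

    if (s[0] < 0x80) {                    /* 0xxxxxxx: ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
        len = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
        len = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
        len = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or invalid lead byte */
    }

    if (n < len)
        return 0;                         /* truncated sequence */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                     /* not a 10xxxxxx continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (cp < min)
        return 0;                         /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                         /* UTF-16 surrogate, not a scalar value */
    if (cp > 0x10FFFF)
        return 0;                         /* beyond the Unicode range */

    *out = cp;
    return len;
}
```

A caller would typically loop over the buffer, advancing by the returned length and, whenever 0 comes back, emitting U+FFFD and skipping one byte.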