▲ | fidotron 9 hours ago | |||||||
> We need to spend 40 bytes storing a number. But . . . why? Assuming they aren't BigInts or similar these are maximum 8 bytes of actual data. This overhead is ridiculous. Using classes should enable you to be much smaller than the JSON representation, not larger. For example, V8 does it like https://v8.dev/docs/hidden-classes > not parsing numbers until they are used Doesn't this defeat the point of pydantic? It's supposed to be checking the model is valid as it's loaded using jiter. If the data is valid it can be loaded into an efficient representation, and if it's not the errors can be emitted during iterating over it. | ||||||||
▲ | jerf 8 hours ago | parent [-] | |||||||
"But . . . why?" This is CPython. This is how it works. It's not particularly related to JSON. That sort of overhead is put on everything. It just hurts the most when the thing you're putting the overhead on is a single integer. It hurts less when you're doing it to, say, a multi-kilobyte string. Even in your v8 example, that's a JIT optimization, not "how the language works". You break that optimization, which you can do at any moment with any change in your code base, you're back to similar sizes. Boxing everything lets you easily implement the dynamic scripting language's way of treating everything as an Object of some sort, but it comes at a price. There's a reason dynamic scripting languages, even after the JIT has come through, are generally substantially slower languages. This isn't the only reason, but it's a significant part of it. | ||||||||
|