BiteCode_dev 4 days ago

You can certainly do it, but since serialization and validation are the main benefits of using Pydantic, I/O is why it exists.

Outside of I/O, the whole machinery has little use. And since pydantic models are used via introspection to build APIs, automatic deserializers, and arg parsers, making them fit the I/O is where the money is.

Also, remember that despite all the recent perf improvements in pydantic, its models are still more expensive than dataclasses, which are themselves more expensive than regular classes. They are 8 times more expensive to instantiate than regular classes, but above all, attribute access is 50% slower.
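
For anyone who wants to check the relative costs on their own machine, here is a rough sketch using `timeit`. The class names (`PlainPoint`, `DataPoint`, `ModelPoint`) and the `bench` helper are made up for illustration, pydantic is only measured if it happens to be installed, and the ratios will vary with the Python and pydantic versions, so treat the figures quoted above as the commenter's and not as something this snippet confirms.

```python
import timeit
from dataclasses import dataclass


class PlainPoint:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


@dataclass
class DataPoint:
    x: int
    y: int


def bench(label, factory, number=200_000):
    """Time construction and attribute reads for one kind of object."""
    build = timeit.timeit(lambda: factory(x=1, y=2), number=number)
    obj = factory(x=1, y=2)
    read = timeit.timeit(lambda: obj.x, number=number)
    return label, build, read


candidates = [("plain class", PlainPoint), ("dataclass", DataPoint)]

try:
    # Optional: only measured if pydantic happens to be installed.
    from pydantic import BaseModel

    class ModelPoint(BaseModel):
        x: int
        y: int

    candidates.append(("pydantic model", ModelPoint))
except ImportError:
    pass

results = [bench(label, factory) for label, factory in candidates]
for label, build, read in results:
    print(f"{label:15s} build={build:.4f}s  read={read:.4f}s")
```

Micro-benchmarks like this only show relative overhead for tiny objects; real models with validators or nested fields may behave differently.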

Now I get that in Python this is not a primary concern, but still, pydantic is not a free lunch.

I'd say it's also important to state what it conveys. When I see a Pydantic object, I expect some I/O somewhere. Breaking this expectation would take me by surprise and lower my trust in the rest of the code. Unless you are deep in defensive programming, there is no reason to validate input far from the boundaries of the program.

JackSlateur 4 days ago | parent [-]

This is true, there is a performance cost

Apart from what has been said, I find pydantic interesting even in the middle of my code: it can be seen as an overpowered assert

It helps make sure that the complex data structure returned by that method is valid (for instance)

duncanfwalker 4 days ago | parent | next [-]

Yeah, I'd agree with that. Validation rules are like an extension to the type system. Invariants are useful at the edges of a system but also in the core. If, for example, I can be told that a list is non-empty then I can write cleaner code to handle it.
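
A minimal sketch of that non-empty-list idea, using only the standard library. `NonEmpty`, `parse`, `first`, and `cheapest` are hypothetical names invented for this example, not part of Pydantic or any other library mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Generic, Sequence, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class NonEmpty(Generic[T]):
    """Hypothetical wrapper: holding one proves the sequence has an element."""

    items: Tuple[T, ...]

    def __post_init__(self) -> None:
        if not self.items:
            raise ValueError("NonEmpty requires at least one item")

    @classmethod
    def parse(cls, raw: Sequence[T]) -> "NonEmpty[T]":
        return cls(tuple(raw))

    @property
    def first(self) -> T:
        # No emptiness check needed: the constructor already rejected empty input.
        return self.items[0]


def cheapest(prices: NonEmpty[float]) -> float:
    # min() cannot raise on an empty sequence here.
    return min(prices.items)
```

The invariant is checked once, at construction, and every function that accepts a `NonEmpty` can skip the empty case.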

In Java they got around the external-dependency-in-the-core-model problem by making the JSR-380 specification that could (even if only in theory) have multiple implementations.

In Clojure you don't need to worry about another external dependency because the spec library is built-in. One could argue that it's still a dependency even if it's coming from the standard library. At that point I'd ask: why are we worried about this? It's to isolate our core from unnecessary reasons to change.

I get that in principled terms it's not right, but if those libraries change their API on a similar cadence to the programming language syntax, then it has no impact in practical terms. It's this kind of pragmatic compromise that distinguishes Python from Java - after all, 'worse is better'.

codethief 4 days ago | parent | prev [-]

You could also use a TypedDict for that, though?

PEP 764[0] will make them extra convenient.

[0]: https://peps.python.org/pep-0764/

JackSlateur 4 days ago | parent [-]

Typing is declarative

In the end, it ensures nothing and accepts everything

  $ cat test.py
  from typing import TypedDict

  class MyDict(TypedDict):
      some_bool: bool

  print(MyDict(some_bool="test", unknown_field="blabla"))

  $ python3 test.py
  {'some_bool': 'test', 'unknown_field': 'blabla'}

codethief 3 days ago | parent [-]

Rolls eyes… Of course you need to use a type checker. Who doesn't these days?

JackSlateur 3 days ago | parent [-]

Of course, I need runtime validation

codethief 2 days ago | parent [-]

Sure, runtime validation is useful – at the boundaries of your domain! After that your type checker should ensure your data has the shape your code expects.

In other words: Parse, don't validate. https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
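
A rough sketch of what "parse at the boundary" could look like with only the standard library, reusing the `some_bool` field from the TypedDict example earlier in the thread. `Settings`, `parse_settings`, and `describe` are hypothetical names; a Pydantic model could play the same role as the parsing function.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical domain object with the same field as the MyDict example."""

    some_bool: bool


def parse_settings(raw: Mapping[str, Any]) -> Settings:
    """Boundary function: turn untrusted input into a Settings or raise."""
    unknown = set(raw) - {"some_bool"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if "some_bool" not in raw:
        raise ValueError("missing field: some_bool")
    value = raw["some_bool"]
    if not isinstance(value, bool):
        raise TypeError(f"some_bool must be bool, got {type(value).__name__}")
    return Settings(some_bool=value)


def describe(settings: Settings) -> str:
    # Core code: no re-validation, the type checker tracks the shape from here.
    return "enabled" if settings.some_bool else "disabled"
```

With this shape, the input from the TypedDict example (`some_bool="test"` plus an unknown field) is rejected in `parse_settings`, and `describe` never sees it.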

JackSlateur 10 hours ago | parent [-]

Yes

The boundaries of my domain are everywhere, because stuff is built everywhere. I'm sure I am a bad dev, yet prebuilding all possible objects I might need every time I receive an input sounds stupid at best

(btw, this post sounds a lot like "if it compiles, it works", not gonna do this, especially in Python)