| |
| ▲ | thomasahle 4 days ago | parent | next [-] | | You can do that in python using https://github.com/patrick-kidger/torchtyping looks like this: def batch_outer_product(x: TensorType["batch", "x_channels"],
y: TensorType["batch", "y_channels"]
) -> TensorType["batch", "x_channels", "y_channels"]:
return x.unsqueeze(-1) * y.unsqueeze(-2)
There's also https://github.com/thomasahle/tensorgrad which uses sympy for "axis" dimension variables: b, x, y = sp.symbols("b x y")
X = tg.Variable("X", b, x)
Y = tg.Variable("Y", b, y)
W = tg.Variable("W", x, y)
XWmY = X @ W - Y
| | |
| ▲ | patrickkidger 4 days ago | parent | next [-] | | Quick heads-up that these days I recommend https://github.com/patrick-kidger/jaxtyping over the older repository you've linked there. I learnt a lot the first time around, so the newer one is much better :) | | | |
| ▲ | ydj 4 days ago | parent | prev [-] | | Is there a mypy plugin or other tool to check this via static analysis before runtime? To my knowledge jaxtyping can only be checked at runtime. | | |
| ▲ | thomasahle 4 days ago | parent [-] | | I doubt it, since jaxtyping supports some quite advanced stuff: def full(size: int, fill: float) -> Float[Array, "{size}"]:
return jax.numpy.full((size,), fill)
class SomeClass:
some_value = 5
def full(self, fill: float) -> Float[Array, "{self.some_value}+3"]:
return jax.numpy.full((self.some_value + 3,), fill)
|
|
| |
| ▲ | saghm 4 days ago | parent | prev | next [-] | | > In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known. Yeah, this seems like matrixes might not be a great first example for explaining the value of dependent types. It's fully possible to define a matrix that uses a generic type as the index for each dimension that doesn't allow expressing values outside the expected range; it would just be fairly cumbersome, and the usual issues would creep back in if you needed to go from "normal" integers back to indexes (although not if you only needed to convert the indexes to normal integers). I find that the potential utility of dependent types is more clear when thinking about types where the "dimensions" are mutable, which isn't usually how I'd expect most people to use matrixes. Even a simple example like "the current length of a list can be part of the type, so you define a method to get the first element only on non-empty lists rather than needing them to return an optional value". While you could sort of implement this in a similar as described above with a custom integer-like type, the limitations of this kind of approach for a theoretically unbounded length are a lot more apparent than a matrix with constant-sized dimensions. | | |
| ▲ | jyounker 4 days ago | parent [-] | | The advantage of dependent typing systems is that you can say that the arguments to a matrix multiplication function must have dimensions (m, i) and (i, n) and that result will have dimensions (m, n). | | |
| ▲ | akst 4 days ago | parent | next [-] | | As others have said, this isn't a unique value proposition of dependent types, by all means dependent types provide a solution to this, but so do many other approaches. You can do this in C++ & Rust, and I've done something similar in Typescript. The point of the comment you're replying to is kind of making the point that pitches for dependent types probably should give examples of something not already solved by other systems (not that you can't do this in DT). See: > Yeah, this seems like matrixes might not be a great first example for explaining the value of dependent types. In order to adopt DT, you need to pay a combined cost of productivity loss in terms of established tooling and choose to forgo any human capital which is no longer applicable once you start using a language with Dependent types, and a loss of time spent learning to use a language that has dependent types, which almost certainly has a smaller library ecosystem than most mainstream languages without DT. For that reason it's worth considering what examples illustrate a benefit that outweighs all the above costs. I don't think it's enough to simply assert some moral basis on safety when clearly salaries for work done in unsafe languages are already considerably high. | | |
| ▲ | saghm 4 days ago | parent [-] | | Yeah, right now there isn't really any example of the value tradeoff being high enough that its overcome the barriers to adoption. Examples making the benefits more clear are helpful in being able to talk about dependent types in my opinion because most programmers (including myself beyond a fairly surface-level understanding, honestly) probably aren't familiar with them; once that's been established, I think it's a lot easier to talk about why the approach hasn't yet become mainstream. For example, with the potential utility in having dependent types for list APIs, the barrier is that in practice, it's not always straightforward to be able to define your code in a way where you actually know the size of the list. If I read input from a user, I might be able to do an initial check that a list of items is non-empty as part of the constructor for the type, but what happens if I want to start popping off elements from it? In practice, this seems like it would either force you to go back to manually checking or only ever process things in a purely functional way to avoid defining behavior that vary based on state that isn't known. If you're willing to take this approach, having the compiler enforce it is great! You have to be willing to use this approach often enough for it to be worth picking a language with dependent types though, and in practice it doesn't seem likely that people would choose to even if they were more aware of the option. | | |
| ▲ | akst 4 days ago | parent [-] | | > Examples making the benefits more clear are helpful in being able to talk about dependent types in my opinion because most programmers (including myself beyond a fairly surface-level understanding, honestly) probably aren't familiar with them There's probably some truth in that (I'm probably in this camp as well), though I feel sometimes adoption of things like this can often be a communication problem, and the people who understand these technologies the most may struggle to identify what aspect of these technologies are valued more by the people they are trying to pitch them too. Like they might even find those aspects boring and uninteresting because they are so deep into the technology, that they don't even think to emphasis them. Which feels like a similar story where a technology early to the party didn't take off until someone else came back to them later (like fat structs and data oritented design gaining some interest despite early versions of these ideas date back to Sketchpad (1963)). > For example, with the potential utility in having dependent types for list APIs, the barrier is that in practice, it's not always straightforward to be able to define your code in a way where you actually know the size of the list Yeah this example crossed my mind as well, it's not immediately clear how you deal with IO or data the result of inputs from the outside world. Although I have feeling this has to be a solved problem. I guess more generally with API design, sometimes you don't really know the system you want to model yet so there's diminishing returns in being overly precise in how you model it if those things are subject to change. Not sure how substantive a concern this is when weighing up using dependent types, but it is something that crosses my mind as an outsider to all this dependent type stuff. | | |
| ▲ | marklemay 4 days ago | parent [-] | | The io issue can be solved in 2 ways:
* by returning the pair of the list and it's length.
* having functions that convert between simply typed and dependently typed lists. However the ergonomics could probably be improved. The API design is a real real issue, which is why I think something more "gradual" would be best. | | |
| ▲ | akst 3 days ago | parent [-] | | I haven’t entirely followed what you meant by a solution to IO, but I’m not to surprising to hear there’s a solution. I guess the point I had in mind was, in making the case of dependent languages stuff like this may not be immediately apparent (I might be confusing the concept of total languages or programming which may not necessarily be coupled to dependent types). > as an aside by IO I do mean anything not known at compile time (interrupts & user input, network responses, network requests [who API may change in a breaking way], database entries, etc) Cheers for your insight btw |
|
|
|
| |
| ▲ | eru 4 days ago | parent | prev [-] | | You don't need dependent types for that, either. | | |
| ▲ | codebje 4 days ago | parent [-] | | Not for matrix multiplication, because there's no term-level chicanery to worry about; you just need phantom types for the two dimensions on your matrix type. You can express a lot just carrying phantom types around and using basic pattern matching: in Rust I can express multiplication, concatenation of vectors to matrices, and inverses of square matrices, using phantom types. But I can't say this, or anything remotely like it: struct Matrix<M: u32, N: u32, T> {
dim: (M, N),
// imaginary dependently typed vectors
cells: Vec<M, Vec<N, T>>,
}
impl<M, N, T> Matrix<M, N, T> {
fn mul<O: u32>(&self, &other: Matrix<N, O, T>) -> Matrix<M, O, T> {
let cells: Vec<self.m, Vec<other.o, T>> = do_the_thing();
Matrix {
// any other values in dim, or a different type in cells, would be a type error
dim: (self.m, other.o),
cells
}
}
}
In other words, there's no dependent pair construction without dependent types. (Or at least dependent kinds and some tricks.) | | |
| ▲ | hmry 4 days ago | parent [-] | | Are M and N supposed to be type parameters in your example? If so, then you don't need to store them in the struct. And once you remove that (and switch from dynamically sized vectors to statically sized arrays) your example becomes perfectly representable in Rust and C++ without dependent types | | |
| ▲ | codebje 4 days ago | parent [-] | | That's the "carry phantom types around" option, but it only gets you so far. If you never need to compute with the term, it's enough (see https://play.rust-lang.org/?version=stable&mode=debug&editio...). You could use statically sized arrays, but if you can't have the static sizes of the arrays as your type parameters, but instead some arbitrary M and N, the size of your data structure isn't part of your type and you don't get the statically verified safety benefits. You also can't express M + N or anything of the sort. Neither C++ nor Rust really hurt for expressivity, though. |
|
|
|
|
| |
| ▲ | zozbot234 4 days ago | parent | prev | next [-] | | Yes, the point of dependent types is that they give you the ability to do some sort of almost arbitrary (though not strictly Turing-complete) "computation" as part of type checking, which essentially dispenses with the phase separation between "compiling" and "running" code - or at least makes compile-time computation unusually powerful. So if you want to replicate the properties of dependent typing in these existing languages you'll need to leverage their existing facilities for compile-time metaprogramming. > A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time. Yes, but the way this is done is by threading a proof "this index is within bounds" throughout the code. At runtime (e.g. within 'extracted' code, if you're using common dependently-typed systems), this simply amounts to relying on a kind of capability or ghost 'token' that attests to the validity of that code. You "manufacture" the capability as part of an explicit runtime check when needed (e.g. if the "index" or "bounds" come from user input) and simply rely on it as part of the code. | | |
| ▲ | gpderetta 4 days ago | parent [-] | | You can do arbitrary computations as part of type checking in C++, yet I don't think it should be considered dependently typed. It seems to me that dependent typing strictly requires going from runtime values to types. (You can parametrize types by runtime values in c++ in a trivial sense, by enumerating a finite set a compile time and then picking the correct type at runtime according to a runtime value, still I don't think it counts as the set of valid types would be finite). | | |
| ▲ | zozbot234 4 days ago | parent [-] | | The "runtime" values of a dependent type system are not quite runtime values either, because the computation is happening at compile time. When you "extract" a program from a dependently-typed to an ordinary programming language that can compile down to a binary, the dependent types are simply erased and replaced with ordinary, non-dependent types; though there may be "magic" casts in the resulting program that can only be proven correct via dependent types. | | |
| ▲ | wk_end 4 days ago | parent [-] | | It's not that simple, right? Because the computation can actually rely on the types, so the types can't always just be erased. The complexity around knowing when the types are going to be erased or not is part of what motivated Idris 2's adoption of quantities. https://idris2.readthedocs.io/en/latest/updates/updates.html... | | |
| ▲ | codebje 4 days ago | parent | next [-] | | QTT is about clarity on erasing values rather than types, eg, if you define 'Vect n a' do you need to set aside space in the memory structure for a vector for 'n' or not. The types themselves are not represented. If, eg, you wanted to implement something that printed out its own type, like: vname : Vect n a -> String
vname = "Vect " ++ show n ++ " " ++ nameOf a
Then (1) you'd be requiring "n" to be represented at run-time (so a QTT value of "unrestricted"), and (2) you'd need a type class to find "nameOf" for "a" when executing, because the actual type of "a" is gone. At runtime a type class means a table of methods passed to the function so it can call the right "nameOf" implementation. class Named a where
nameOf a : String
vname : Named a => Vect n a -> String
≡ vname : String -> Vect n a -> String
| |
| ▲ | ChadNauseam 4 days ago | parent | prev [-] | | In a sense, both you and the grandparent comment are right, although it will take some time for me to explain why. 1. There is an extremely strict sense of the concept of type erasure where the grandparent comment is right. 2. But in a practical sense of "I don't want these values in my binary", you are right. So to resolve this confusion, it might be best to say exactly what dependent types are. "Dependent types", for all their mystery, are just a combination of three simple ideas: 1. What if you could: write functions that return types? 2. What if you could: say that the value of the input to a function can determine the type of the function's output? 3. What if you could: say that the value of the first thing in a tuple can determine the type of the second thing in the tuple? For a concrete example, here is a very simple dependently typed function -- write functions that return types!
pickType : Bool -> Type
pickType True = Nat
pickType False = String
-- the return type changes based on input!
getValue : (b : Bool) -> pickType b
getValue True = 42
getValue False = "hello"
You can see that there's nothing here inherently that requires any of the types or values to be kept around at runtime in order to use this function. In fact, here's how you might use it: useValue : Bool -> String
useValue b = case b of
True => "The number is: " ++ show (getValue b)
False => "The message is: " ++ getValue b
Once again, no type level terms need to be kept around at runtime to evaluate this. (Although you need a fairly intelligent typechecker in order to typecheck it.)As nice as this is, the story gets a little less clean once you look at taking a dependent argument as a value. For example, we could not write this: takeValue : pickType b -> ()
takeValue value = ()
If you try to write this, you would get a classic "using a variable before it's declared" error, because nowhere in this have we defined what `b` is. Instead you'd have to write takeValue : (b : Bool) -> pickType b -> ()
takeValue b value = ()
The type system forces you to thread the value of b through your code (as a parameter) to any function that takes a pickType b. So while every type-level term is technically erased at runtime, in the strict sense that the compiler is not adding any extra stuff to the runtime that you didn't write yourself, the value will not truly be "erased" because you are forced to pass it everywhere even though you don't use it.Obviously, this might be undesirable. So then you might further ask, "what are the situations in which we don't really need to enforce this constraint? when can a function take a dependent value without needing to know the values it depends on?" That is where QTT comes in, as developed in Idris 2. Idris actually takes it a step further than we need for this conversation. For our purposes, all we need to do is say that there are two options for arguments into a function: arguments that are never used at runtime, and arguments that may or may not be used at runtime. Let's say a function never uses one of its arguments. Then, clearly, as long as the compiler knows what that argument is, it doesn't have to actually write runtime code to pass it to the function. Let's allow the user of our made up language to prefix types that are never actually used at runtime with `0`. takeValue : (0 b : Bool) -> pickType b -> ()
takeValue b value = ()
Now this should typecheck just fine because we're not actually using b, right? The compiler can easily see that.And in fact, we can even pass b to another function as long as that function also marks its b as 0. (We know it will never actually get used so it's fine.) takeValuePrime : (0 b : Bool) -> pickType b -> ()
takeValuePrime b value = takeValue b value
Now you can finally see where the "erasure" comes in. Any compiler can, as an optimization, erase values that are never used. And in dependently typed languages, it will often be the case that many terms are never actually used at runtime and can therefore be erased. | | |
| ▲ | oggy 4 days ago | parent [-] | | Thank you for spelling this out; comments like these make this website worthwhile. You've enlightened at least one person today. You hinted that there's more to QTT (or its implementation in Idris?) than this. Could you elaborate a bit on what these other features are, and what their purpose is? | | |
| ▲ | ChadNauseam 3 days ago | parent [-] | | QTT supports three different "multiplicities". I discussed "used 0 times at runtime" (can be erased) and "used any amount of times at runtime" (cannot be erased). The remaining one is "used exactly once at runtime". These also cannot be erased, but allowing this constraint allows you to encode some very interesting things in the type system. The classic example is "resource protocols", e.g. file handles that force you to close them (and that you cannot use after closing). | | |
| ▲ | oggy 3 days ago | parent [-] | | I see, so it unifies type (or I suppose term in this case?) erasure and linear typing? Thanks for the explanation! | | |
|
|
|
|
|
|
| |
| ▲ | etbebl 4 days ago | parent | prev | next [-] | | The latter is what would be most useful imo. Even Matlab can type check matrix sizes with constants these days, but I often wish I could use variables to express relationships between the sizes of different dimensions of inputs to a function. | |
| ▲ | lacker 4 days ago | parent | prev | next [-] | | That's a good point, for example in Eigen you can do Eigen::Matrix<float, 10, 5> I just really want it in Python because that's where I do most of my matrix manipulation nowadays. I guess you also would really like it to handle non-constants. It would be nice if these complicated library functions like torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None) would actually typecheck that kdim and vdim are correct, and ensure that I correctly pass a K x V matrix and not a V x K one. | | |
| ▲ | zozbot234 4 days ago | parent [-] | | Python is a dynamic language where everything happens at run time, including checks of variable types. So you can already do this. | | |
| ▲ | nephanth 4 days ago | parent | next [-] | | Python has static typechecking which, while not perfect, works pretty well as long as you're not actually trying to circumvent it (or manipulating things "too dynamically") | | |
| ▲ | eru 4 days ago | parent [-] | | I think Python actually has multiple, different static typechecking systems, depending on which checker you use. Python itself only gives you type annotations, but doesn't specify what they are supposed to mean. (I think. Please correct me if I am wrong.) | | |
| |
| ▲ | bluGill 4 days ago | parent | prev [-] | | The problem with runtime is when you make a mistake it can be a long time before you find out. Compile time means a large class of problems is prevented without perfect test coverage. On a small project it isn't a big deal but when you have hundgeds of developers over decades you will miss something |
|
| |
| ▲ | tmtvl 4 days ago | parent | prev | next [-] | | In Common Lisp: (defparameter *test-array*
(make-array '(10 5)
:element-type 'Float
:initial-element 0.0))
(typep *test-array* '(Array Float (10 5)))
And the type check will return true. | | |
| ▲ | ndr 4 days ago | parent [-] | | Compile-time checks is what's implied in virtually all these conversations. Does that check run at compile time? | | |
| ▲ | tmtvl 4 days ago | parent [-] | | If I define a function foo which takes an (Array Float (10 5)) and then define a function bar which uses the result of calling foo with an (Array String (3 3)) then SBCL will print a warning. If I don't redefine foo and call bar with such an array then a condition will be raised. It's not quite the same as, say Rust's 'thing just won't compile', but that's due to the interactive nature of Common Lisp (I believe someone called it the difference between a living language and a dead language (though the conventional meaning of living language and dead language in programming is a bit different)). | | |
| ▲ | chongli 4 days ago | parent [-] | | Can you replace 10 5 and 3 3 with variable names and still get the warning without first calling the function? | | |
| ▲ | tmtvl 4 days ago | parent [-] | | No. You can replace them with asterisks to mean 'any length is fine', but you can't at compile time check something that's only known at runtime (granted, you could see that it can't match if the variables are, for example, an (Integer 1 10) and an (Integer 75 100)). Something you can do with variables is: (defvar *example-dimension* 20)
(deftype Example-Array (element-type minor-dimension)
`(Array ,element-type (,*example-dimension* ,minor-dimension)))
...and then you can declare variables to be something like, say, (Example-Array Boolean 5) and have it expand to (Array Boolean (20 5)). But the type declaration needs to be of the form (Array &optional element-type dimensions) where dimensions can be any of the following:- An asterisk, which means anything goes. - A number, which means the array needs to have that many dimensions (so an (Array Fixnum 2) is a 2-dimensional array of fixnums). - A list of numbers, asterisks, or a combination of the two, where an asterisk means 'this dimension can be anything' and the numbers mean exactly what you would expect. Maybe something like that would be possible with Coalton, but I haven't played around a lot with that yet. | | |
| ▲ | chongli 3 days ago | parent [-] | | Dependent types can do this. This is information that can be known at compile time by means of a proof that the values are what they should be (or within certain bounds) through the entire call chain. |
|
|
|
|
| |
| ▲ | uecker 4 days ago | parent | prev | next [-] | | C has such types and can guarantee that there is no out-of-bounds access at run-time in the scenarios you describe: https://godbolt.org/z/f7Tz7EvfE
This is one reason why I think that C - despite all the naysayers - is actually perfectly positioned to address bounds-safe programming. Often in dependently-types languages one also tries to prove at compile-time that the dynamic index is inside the dynamic bound at run-time, but this depends. | | |
| ▲ | Sharlin 4 days ago | parent | next [-] | | -fsanitize-bounds uses a runtime address sanitizer, surely? The program compiles fine. In a (strongly) dependently typed language, something like the following would refuse to typecheck: int foo(int i) {
int bar[4] = { 1, 2, 3, 4 };
return bar[i]
}
The type checker would demand a proof that i is in bounds, for example int foo(int i) {
int bar[4] = { 1, 2, 3, 4 };
if i < 4
return bar[i]
else
return 0
}
In languages with an Option type this could of course be written without dependent types in a way that's still correct by construction, for example Rust: fn foo(i: 32) -> i32 {
let bar = [1, 2, 3, 4];
bar.get(i) // returns Option<i32>, not a raw i32
.unwrap_or(0) // provide a default, now we always have an i32
}
But ultimately, memory safety here is only guaranteed by the library, not by the type system. | | |
| ▲ | uecker 4 days ago | parent [-] | | You seem to be repeating what I said except you mixup strong and static typing. In a statically dependent-typed language you might expect it this not to compile, but this also depends. Run-time checking is certainly something you can combine with strong dependent typing. Implementing bounds checking that returns option types (which can also implement in C) is not exactly the same thing. But dependent typing can be more elegant - as it is here. | | |
| ▲ | uecker 4 days ago | parent [-] | | Update: For the interested, here are three ways to write this in C, the first using dependent type, the second using a span type, and the third using an option type. All three versions prevent out-of-bounds accesses: https://godbolt.org/z/nKTfhenoY |
|
| |
| ▲ | turndown 4 days ago | parent | prev [-] | | I thought that this was a GCC extension(you need to use #define n 10 instead of int n = 10). Is this not the case anymore? | | |
| ▲ | uecker 4 days ago | parent [-] | | This is in ISO C99. In C11 it was made an optional feature but in C23 we made the type part mandatory again. |
|
| |
| ▲ | thesz 4 days ago | parent | prev | next [-] | | "In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known." I remember being able to use telescopes [1] in Haskell long time ago, around 2012 or so. [1] https://www.pls-lab.org/en/telescope Haskell was not and is not properly dependently typed. | |
| ▲ | 4 days ago | parent | prev [-] | | [deleted] |
|