Zero-Cost POSIX Compliance: Encoding the Socket State Machine in Lean's Types (ngrislain.github.io)
32 points by ngrislain 6 hours ago | 19 comments
e-dant 4 hours ago | parent | next [-]

This is cool stuff, but a nitpick: It’s not undefined behavior in the language sense in C to do socket ops on a bad file descriptor. It’s just an error from the kernel’s point of view, and the kernel will throw -errno at you.

singron 2 hours ago | parent [-]

Yes, it's not UB, but the consequences are not limited to an EINVAL/EBADF/EBADFD. Calling close twice is essentially the same problem as calling free twice, so you get all the use-after-free problems on your file descriptors.
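
To make the use-after-free analogy concrete, here is a minimal Rust sketch (Unix only, standard library only). The function name `reopen_after_close` is made up for illustration, and `/dev/null` is just a convenient file to open. POSIX hands out the lowest-numbered free descriptor, so in a single-threaded process the second open will typically get the number that was just closed; the sketch returns both numbers rather than assuming that.

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::{AsRawFd, RawFd};

/// Opens /dev/null, records the descriptor number, closes it by dropping
/// the handle, then opens /dev/null again and records the new number.
/// (Hypothetical helper, written only for this illustration.)
pub fn reopen_after_close() -> io::Result<(RawFd, RawFd)> {
    let first = File::open("/dev/null")?;
    let first_fd = first.as_raw_fd();
    drop(first); // the descriptor is closed here

    let second = File::open("/dev/null")?;
    let second_fd = second.as_raw_fd();

    // If the two numbers are equal, a stale second close(first_fd) issued
    // at this point would not fail with EBADF: it would silently close
    // `second`, an unrelated resource that happens to reuse the number.
    Ok((first_fd, second_fd))
}
```

When the two numbers match, the "double close" succeeds from the kernel's point of view, which is why the damage is not limited to an error return.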

yjftsjthsd-h 5 hours ago | parent | prev | next [-]

> Lean 4 offers a fourth option: make the bug unrepresentable at the type level, then erase the proof at compile time so the generated code is identical to raw C.

Couldn't you do that in a more conventional type/class system without using an actual proof system? Instead of there being a Socket type/class, just make a Socket_Fresh, Socket_Bound, Socket_Listening, Socket_Connected, and maybe Socket_Closed (not 100% sure, would have to think about whether that's a thing or not), each of which takes the previous in its constructor. Or does that make it too hard to use?

hackyhacky 3 hours ago | parent | next [-]

That wouldn't work because there would be nothing stopping you from re-using a value representing an old state.

ridiculous_fish 2 hours ago | parent [-]

That's exactly what affine / linear types do.
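
For readers who have not seen the pattern, this is roughly what the per-state types plus affine ownership look like in Rust. It is a sketch under stated assumptions, not the article's Lean encoding: `Socket`, the marker types, `demo`, and the plain `i32` standing in for a descriptor are all invented here, and no real system calls are made.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Zero-sized marker types, one per socket state (names are illustrative).
pub struct Fresh;
pub struct Bound;
pub struct Listening;

/// A mock socket handle: `fd` stands in for a real descriptor and
/// `State` exists only at compile time.
pub struct Socket<State> {
    fd: i32,
    _state: PhantomData<State>,
}

impl Socket<Fresh> {
    pub fn new(fd: i32) -> Self {
        Socket { fd, _state: PhantomData }
    }
    /// Takes `self` by value, so the Fresh handle cannot be used again.
    pub fn bind(self) -> Socket<Bound> {
        Socket { fd: self.fd, _state: PhantomData }
    }
}

impl Socket<Bound> {
    pub fn listen(self) -> Socket<Listening> {
        Socket { fd: self.fd, _state: PhantomData }
    }
}

impl Socket<Listening> {
    /// Consumes the handle; a second close on the same value cannot compile.
    pub fn close(self) -> i32 {
        self.fd
    }
}

/// Walks the mock state machine and returns the descriptor that was "closed".
pub fn demo() -> i32 {
    let fresh = Socket::<Fresh>::new(3);
    let bound = fresh.bind();
    // fresh.bind(); // rejected by the compiler: `fresh` was moved above
    let listening = bound.listen();
    listening.close()
}

/// The state parameter adds no bytes to the handle.
pub fn handle_size_matches_raw_fd() -> bool {
    size_of::<Socket<Listening>>() == size_of::<i32>()
}
```

The state lives only in the type parameter, so the handle is the same size as the bare integer, and move semantics are what stop an old state from being reused.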

paulddraper 4 hours ago | parent | prev [-]

The innovation is making that have zero runtime cost. (Though to be fair, I doubt the runtime cost is really significant...)

skavi 3 hours ago | parent | next [-]

Their suggestion is also zero runtime cost.

jibal 2 hours ago | parent | prev [-]

That's a very odd response if you know what a type system is.

russdill 3 hours ago | parent | prev | next [-]

This is based on such a surface-level understanding of one type of POSIX socket. Calling close twice on a socket is a normal allowed thing, particularly for non-blocking sockets. Datagram sockets can be operated with bind, without bind, with connect and bind, and with both called multiple times.

comex 3 hours ago | parent [-]

Some of what you said is true, but you definitely can’t call close multiple times on the same file descriptor. close always immediately drops the file descriptor and isn’t like non-blocking socket operations that you have to try repeatedly until they succeed. You could, however, create multiple file descriptors pointing to the same socket with dup or other methods, in which case you’d need to close all of them to disconnect the socket.

russdill 3 hours ago | parent [-]

Bah, I was thinking of shutdown

diowldxiks 4 hours ago | parent | prev | next [-]

Does this work in the face of state changing out from under the socket? I'm not super familiar with low level socket details but I'm thinking something like connect returning EINPROGRESS and you not knowing if the connection has completed. It may complete, it may fail, but during that time this state machine is invalid I think. It seems like strict logical programming like this gets much harder in the face of mutable state changing out from under the program, but that can probably be worked around with enough effort.
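
One possible workaround, sketched in Rust with mock types: give the in-flight connection its own `Connecting` state and make the only way out a check that returns one of three outcomes. Everything here is hypothetical (`Progress`, `check`, the `so_error` parameter). In a real program the value would come from polling the socket for writability and then reading `SO_ERROR` with `getsockopt`.

```rust
use std::marker::PhantomData;

// Marker types for the two states involved (illustrative names).
pub struct Connecting;
pub struct Connected;

/// Mock handle; `fd` stands in for a real descriptor.
pub struct Socket<State> {
    fd: i32,
    _state: PhantomData<State>,
}

/// The three things that can have happened to a non-blocking connect.
pub enum Progress {
    Pending(Socket<Connecting>),
    Ready(Socket<Connected>),
    Failed(i32), // errno-style code
}

impl Socket<Connecting> {
    /// Models a connect call that returned EINPROGRESS.
    pub fn start(fd: i32) -> Self {
        Socket { fd, _state: PhantomData }
    }

    /// `so_error` is an assumption of this sketch: `None` means the socket
    /// is not writable yet, `Some(0)` means the connect succeeded, and any
    /// other value is the error that would be read from SO_ERROR.
    pub fn check(self, so_error: Option<i32>) -> Progress {
        match so_error {
            None => Progress::Pending(self),
            Some(0) => Progress::Ready(Socket { fd: self.fd, _state: PhantomData }),
            // A real version would also close the descriptor here.
            Some(code) => Progress::Failed(code),
        }
    }
}

impl Socket<Connected> {
    pub fn fd(&self) -> i32 {
        self.fd
    }
}
```

The types never claim more than the program knows: a `Socket<Connected>` only exists after a successful check, and the caller has to handle the pending and failed cases to get one.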

12_throw_away 3 hours ago | parent | prev | next [-]

I'm like 3 sentences in and already things do not quite make sense.

> Calling [socket] operations in the wrong order [...] is undefined behaviour in C.

UB? For using a socket incorrectly? You sure about that?

> Documentation — trust the programmer to read the man page (C, Rust).

I'm sorry, are they saying that rust's socket interface is unsound? Looks to me like it's a pretty standard Rust-style safe interface [1], what am I missing?

[1] https://doc.rust-lang.org/std/net/struct.TcpListener.html

tom_ 3 hours ago | parent | next [-]

The C standard doesn't have anything to say about sockets at all, so it's not like using them is defined even in the right order.

WhyNotHugo 3 hours ago | parent | prev [-]

They say “undefined behaviour”. They mean “returns an error”, or “can return an error”.

strongly-typed 3 hours ago | parent | prev | next [-]

Feels like we're living in parallel universes. You're building Hale, and I'm building Abs XD : https://github.com/rdavison/abs

wk_end 4 hours ago | parent | prev | next [-]

Lean doesn’t have any kind of substructural typing, does it? At a glance it looks like you need to manually (lexically) rebind the socket at each step in the operation, and there’s nothing stopping you from holding onto a socket in a now-invalid state and making a mess of things, right?

Also, boo AI slop. If you’re going to use AI to help write your technical blog posts please please please edit out all the “No X. No Y. Just pure Z.” marketing-speak.

tczMUFlmoNk 4 hours ago | parent [-]

This is what I was thinking, too. Without some kind of linearity, `connect` et al. don't give the claimed guarantees if you can just reuse the old socket handle. Especially if it's aliased in a list or something. I was surprised to see this not mentioned at all in the section specifically dedicated to double-close prevention.

Likewise, with implicit weakening, nothing stops you from dropping the socket without closing it.
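
For comparison, Rust is affine rather than linear, so it also lets you drop a value without calling anything on it; the usual answer there is a destructor. A minimal sketch with a mock handle follows. `OwnedSocket` and the shared `closes` counter are invented for illustration, and the counter stands in for a real `close` call.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Mock owning handle. `closes` counts how many times the descriptor
/// would have been closed (illustrative stand-in for a real close call).
pub struct OwnedSocket {
    fd: i32,
    closes: Rc<Cell<u32>>,
}

impl OwnedSocket {
    pub fn new(fd: i32, closes: Rc<Cell<u32>>) -> Self {
        OwnedSocket { fd, closes }
    }

    pub fn fd(&self) -> i32 {
        self.fd
    }
}

impl Drop for OwnedSocket {
    fn drop(&mut self) {
        // A real wrapper would close the descriptor `self.fd` here.
        self.closes.set(self.closes.get() + 1);
    }
}

/// Creates a socket, never closes it explicitly, and reports how many
/// times the destructor ran once the value went out of scope.
pub fn closes_after_forgetting_to_close() -> u32 {
    let closes = Rc::new(Cell::new(0));
    {
        let sock = OwnedSocket::new(3, Rc::clone(&closes));
        let _ = sock.fd();
        // No explicit close: the handle is simply dropped at the brace.
    }
    closes.get()
}
```

The destructor runs exactly once when the owner goes out of scope, which covers the forgotten-close case without requiring true linear types.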

khaledh 4 hours ago | parent | prev [-]

Interesting take on enforcing state machine rules using a proof system. I'm interested in this space, and have been developing a new programming language to enable typestate / state-machine representation at the type system level[0].

I don't know where it will end up on the spectrum of systems languages; it may end up being too niche or incomplete, but so far I think I'm scratching the right itch, at least for myself.

[0] https://github.com/khaledh/machina