avsm 5 hours ago

Good point; I've decided to simply not support HTTP/1.1 pipelines, and to have a connection pooling layer for HTTP/2 instead that takes care of this.

OxCaml supports the effect system we added in OCaml 5.0 onwards, which allows a fiber to suspend itself and be restarted via a one-shot continuation. So it's possible to have a pipelined connection stash away a continuation for a response calculation and be woken up later on when it's ready.

All continuations have to be either discarded explicitly or resumed exactly once; getting this wrong can lead to memory leaks in OCaml 5, but OxCaml has an emerging lifetime system that guarantees this is safe: see https://oxcaml.org/documentation/parallelism/01-intro/ or https://gavinleroy.com/oxcaml-tutorial-icfp25/ for a taste of it. Beware though: it's cutting-edge stuff and the interfaces are still emerging, but it's great fun if you don't mind some pretty hardcore ML typing ;-) When it all settles down it should be very ergonomic to use, but right now you do get some interesting type errors.
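To make the suspend/resume pattern concrete, here's a minimal sketch using the OCaml 5 stdlib `Effect` module. `Await_response`, `pending` and the single-slot "scheduler" are illustrative stand-ins, not the server's actual API:

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect: suspend this fiber until a response body arrives. *)
type _ Effect.t += Await_response : string Effect.t

(* Stand-in for a per-request slot: the stashed one-shot continuation. *)
let pending : (string, unit) continuation option ref = ref None
let result = ref ""

let handle_request () =
  match_with
    (fun () ->
      (* The fiber suspends here until the response is computed. *)
      let body = perform Await_response in
      result := "responding with: " ^ body)
    ()
    { retc = (fun () -> ());
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Await_response ->
            Some (fun (k : (a, unit) continuation) ->
              pending := Some k)  (* stash instead of blocking *)
        | _ -> None) }

let () =
  handle_request ();
  (* ...later, when the response calculation finishes elsewhere: *)
  match !pending with
  | Some k -> pending := None; continue k "hello"
  | None -> ()
```

Note that `continue` consumes the continuation, so clearing `pending` first keeps the resume-exactly-once discipline explicit.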

int3trap 5 hours ago | parent [-]

> So it's possible to have a pipelined connection stash away a continuation for a response calculation and be woken up later on when it's ready.

Ahh, that's interesting. I think you still run into the issue where you have a case like this:

1. You get 10 pipelined requests from a single connection, each with a POST body to update some record in a Postgres table.

2. All 10 requests are independent and can be resolved at the same time, so you should make use of Postgres pipelining and send them all as you receive them.

3. When finishing the requests, you likely need information from the request object. Let's assume there's a lot of data in the bodies, to the point where you've reached your per-connection buffer limit. You either allocate here to unblock the read, or you block new reads until all in-flight requests are completed, hurting response latency. Allocating is the better choice at that point, but a heuristic decision engine aiming for peak performance is definitely nuanced, if not complicated.
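The allocate-vs-block tradeoff in step 3 can be sketched as a toy OCaml function. Everything here (the `body` type, `stash_body`, the sizes in the usage note) is hypothetical, just to make the decision concrete: while a body still fits in the shared per-connection buffer we keep a cheap view into it; once the buffer limit would be exceeded we spill to a fresh allocation so the read loop can keep accepting pipelined requests.

```ocaml
(* A stashed request body is either a view into the shared
   per-connection buffer, or a spilled copy on the heap. *)
type body =
  | In_buffer of { off : int; len : int }  (* view into shared buffer *)
  | Allocated of bytes                     (* spilled copy; unblocks reads *)

(* [stash_body] returns the body representation plus the new buffer
   usage. If the body no longer fits under [cap], copy it out. *)
let stash_body ~buf ~used ~cap ~off ~len =
  if used + len <= cap
  then (In_buffer { off; len }, used + len)
  else (Allocated (Bytes.sub buf off len), used)
```

For example, with `cap = 8`, stashing a 4-byte body leaves a view in place, while a subsequent 6-byte body (which would push usage to 10) gets copied out, so the connection can keep reading.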

It's a cool problem space though, so I'm always interested in learning how others attack it.

avsm an hour ago | parent [-]

It is a cool problem space! What I'm doing is using a single buffer for body handling (since you dispatch that away and then reuse it for chunked encoding), so it never takes unbounded stack space. This might be a bit different in HTTP/3, where you can have multiple body transmissions multiplexed; I still have to look into how that works (and it runs over UDP as well).
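For the curious, "reuse it for chunked encoding" refers to HTTP/1.1 chunked transfer coding at the wire level: each chunk is the payload length in hex, CRLF, the payload, CRLF, so the same scratch space that held the body can be reused to frame the response. `encode_chunk` below is a hypothetical helper, not the server's API:

```ocaml
(* Frame one HTTP/1.1 chunk: hex length, CRLF, payload, CRLF.
   A zero-length payload produces the stream terminator "0\r\n\r\n". *)
let encode_chunk payload =
  Printf.sprintf "%x\r\n%s\r\n" (String.length payload) payload
```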

What we never need to do in OxCaml is keep a giant list of body buffers on the stack; with effects we can fork the stack at any time, so the request object is shared naturally. The only way to free a stack is to return from a function, but you can have a tree of these stacks that share values from earlier in the call chain.