cogman10 4 days ago

That's what I noticed reading through this.

It looks like server affinity is accomplished with websockets. The HTTP batching mode simply sends all the requests at once and then waits for the responses.
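
Roughly, the batching half looks like this (a minimal sketch; the /batch endpoint and payload shape are my guesses at the protocol, not taken from it):

    type Call = { id: number; method: string; params?: unknown };

    // Queue every call, fire them in a single POST, and block on the
    // combined response -- nothing resolves until the slowest call in
    // the batch has finished.
    async function sendBatch(calls: Call[]): Promise<Map<number, unknown>> {
      const res = await fetch("/batch", {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(calls),
      });
      const results: { id: number; value: unknown }[] = await res.json();
      return new Map(results.map((r) => [r.id, r.value]));
    }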

I don't love this because it makes load balancing hard. If a bunch of chatty clients get a socket to the same server, that server is now burdened and can be tipped into overload.

Further, it makes scaling servers in/out really annoying. Persistent, long-lived connections are beasts to deal with, because now you have to answer "what do I do if multiple requests are in flight when this server needs to go away?"
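
To make the drain problem concrete, here's roughly the dance you end up writing at scale-in (a sketch on top of the "ws" package; the bookkeeping, close code, and handle() are my inventions):

    import { WebSocketServer, WebSocket } from "ws";

    const wss = new WebSocketServer({ port: 8080 });
    const inFlight = new Map<WebSocket, number>();

    wss.on("connection", (ws) => {
      inFlight.set(ws, 0);
      ws.on("message", async (raw) => {
        inFlight.set(ws, (inFlight.get(ws) ?? 0) + 1);
        ws.send(await handle(raw)); // handle() is a stand-in for real work
        inFlight.set(ws, (inFlight.get(ws) ?? 1) - 1);
      });
      ws.on("close", () => inFlight.delete(ws));
    });

    // On scale-in: refuse new connections, then wait until each socket
    // has nothing in flight before closing it with "going away".
    process.on("SIGTERM", () => {
      wss.close(); // ws stops accepting; existing sockets stay open
      const timer = setInterval(() => {
        for (const [ws, n] of inFlight) {
          if (n === 0) ws.close(1001, "server draining, reconnect elsewhere");
        }
        if (inFlight.size === 0) {
          clearInterval(timer);
          process.exit(0);
        }
      }, 100);
    });

    async function handle(raw: unknown): Promise<string> {
      return JSON.stringify({ ok: true });
    }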

One more thing I don't really love about this: it requires a timely client. It seems like it might be trivial to DDoS, since a client can simply send a stream of push events and never pull. The server is then stuck holding those responses for as long as the client stays connected. That seems bad.
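
If I were hosting this I'd at least want a per-connection budget on un-pulled responses, something like this sketch (again on the "ws" package; the cap and message shapes are invented):

    import { WebSocketServer } from "ws";

    const MAX_PENDING = 1000; // arbitrary per-connection budget
    const wss = new WebSocketServer({ port: 8081 });

    wss.on("connection", (ws) => {
      const pending: unknown[] = []; // responses the client never pulled

      ws.on("message", (raw) => {
        const msg = JSON.parse(raw.toString());
        if (msg.type === "push") {
          pending.push({ id: msg.id, ok: true }); // stand-in for real work
          if (pending.length > MAX_PENDING) {
            ws.close(1013, "pull your responses"); // 1013 = try again later
          }
        } else if (msg.type === "pull") {
          ws.send(JSON.stringify(pending.splice(0))); // flush and reset
        }
      });
    });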

fleventynine 4 days ago

Yeah, I think to make a system using this really scale you'd have to add support for this protocol in your load balancer / DDoS defenses.

dgl 4 days ago

This isn't really that different from GWT, which Google has been scaling for a long time. My knowledge is a little outdated, but more complex applications had a "UI" server component that talked to multiple "API" backend components, doing internal load balancing between them.

Architecturally I don't think it makes sense to support this in a load balancer; instead you want to pass a "cost", or outright routing decisions, back to your load balancing layer.
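
Something like this: each backend stamps its responses with a load number the balancer can read to steer new connections elsewhere (the header name and the metric are illustrative, not any standard):

    import { createServer } from "node:http";

    let inFlight = 0;

    const server = createServer((req, res) => {
      inFlight++;
      // Advertise a crude cost signal; a real system might report CPU,
      // queue depth, or open-socket count instead.
      res.setHeader("X-Backend-Cost", String(inFlight));
      res.on("finish", () => inFlight--);
      res.end("ok");
    });

    server.listen(8082);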

Also note the "batch-pipelining" example is just a Node.js client; this already supports clients other than browsers, so you could always add another layer of abstraction (the "fundamental theorem of software engineering").