He's also been hacking on a (closed-source) LLM inference server since the GPT-2 days: https://bellard.org/ts_server/