I imagine you have to start decoding many speculative completions in parallel to have true low latency.