I’d been wondering whether something like this would be possible for https://chatjimmy.ai/ . The underlying model is only Llama 3 8B, but I’m curious what coding harnesses would be like at 17k tok/s.