As already pointed out, most of that relies on C or GPU code to do the actual work, not slow Python code.
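A rough sketch of the difference, assuming CPython (where the built-in `sum` is implemented in C). The `python_sum` helper and the list size are made up for illustration and are not taken from anything discussed above. Both calls compute the same result; the only difference is whether the loop runs in interpreted Python or inside the interpreter's C code.

```python
import timeit


def python_sum(values):
    """Add the numbers one at a time in interpreted Python (hypothetical helper)."""
    total = 0
    for v in values:
        total += v
    return total


# Arbitrary test data chosen for illustration.
values = list(range(1_000_000))

# Same answer either way; only where the loop executes differs.
assert python_sum(values) == sum(values)

# Time 10 runs of each approach. Absolute numbers depend on the machine.
loop_time = timeit.timeit(lambda: python_sum(values), number=10)
builtin_time = timeit.timeit(lambda: sum(values), number=10)

print(f"Python loop:  {loop_time:.3f}s")
print(f"built-in sum: {builtin_time:.3f}s")
```

On a typical CPython install the built-in should come out noticeably faster, though the exact ratio varies. Libraries that hand the work to compiled C or GPU kernels follow the same idea on a larger scale.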