Here is a reference: https://www.sharpai.org/benchmark/
For specific tasks, a local model can reach a workable level.