Check out Osaurus, an MIT-licensed, native, Apple Silicon–only local LLM server: https://github.com/dinoki-ai/osaurus
Thank you.