Perhaps instead of writing your own LLM abstraction layer, you could use an existing lightweight one, such as @simonw's llm.