Maybe I misunderstand the project, but I feel it'd make sense to support some local inference, i.e. using arbitrary ComfyUI workflows?
I don't think I am understanding your reply.