Taalas seems to be pretty good. This is their demo: https://chatjimmy.ai/
This might be even more limited though. They can't physically fit a large model on a single chip.
Not yet. But it is definitely limited, since they can essentially only serve a single model.