| ▲ | ACCount37 3 hours ago | |
And? Even if I believed this to be a limitation, I could bolt an adapter onto an LLM to make it input and output non-text data. That's how a lot of bleeding-edge multimodal models work already. They can take and emit images, sound, actions and more. | ||