I wonder if it's possible to train to read text encoded as one colored pixel per letter, or even per token.
Given how people can learn languages, absolutely yes.