Yep, that's the point I was making: choosing fixed 4-byte code points doesn't significantly reduce the complexity of handling everything that Unicode does.
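
For anyone following along, here is a rough Python sketch of the kind of thing I mean. The helper name `utf32_units` and the variable names are made up for illustration, and it only covers one corner (combining marks and normalization), not everything Unicode does. Even with every code point stored in exactly 4 bytes, one visible character can still span several code points, and two strings that render identically can still compare unequal.

```python
import unicodedata

# Two ways to write the same visible character "é".
decomposed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = "\u00e9"     # precomposed U+00E9

def utf32_units(s: str) -> int:
    """Count fixed-width 4-byte units (hypothetical helper for this example)."""
    # "utf-32-le" emits no byte-order mark, so every 4 bytes is one code point.
    return len(s.encode("utf-32-le")) // 4

# Fixed width gives O(1) indexing by code point, but not by visible character.
units_decomposed = utf32_units(decomposed)  # 2 code points, 8 bytes
units_composed = utf32_units(composed)      # 1 code point, 4 bytes

# The two spellings are different code point sequences, so a plain
# comparison fails until both sides are normalized to the same form.
equal_raw = decomposed == composed
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

So the 4-byte representation removes the variable-width decoding step, but normalization, combining sequences, and everything built on top of them still have to be dealt with.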
Thanks for explaining!