> Java and JS […] both encode every codepoint outside the BMP as surrogate pairs in UTF-8
I can’t comment on Java, but I know JS reasonably well and can’t think of any place where it uses CESU-8 (which is what encoding each half of a surrogate pair as its own UTF-8 sequence amounts to).
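For anyone who wants to check this themselves, here is a quick sketch comparing what the standard `TextEncoder` produces with what CESU-8 would look like. The `cesu8` helper is hypothetical, written only for this comparison (as far as I know JS has no built-in that does this), and it assumes a runtime where `TextEncoder` is a global, such as a current browser or Node.

```javascript
// U+1F600 is outside the BMP, so a JS string stores it as two UTF-16 code units.
const text = "\u{1F600}";
const codeUnits = text.length; // surrogate pair D83D DE00 in memory

// The standard encoder works on code points and emits one 4-byte UTF-8 sequence.
const utf8 = Array.from(new TextEncoder().encode(text));

// Hypothetical helper: what CESU-8 would produce, encoding each UTF-16
// code unit (including each surrogate) as its own 1- to 3-byte sequence.
function cesu8(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    const u = str.charCodeAt(i);
    if (u < 0x80) {
      bytes.push(u);
    } else if (u < 0x800) {
      bytes.push(0xc0 | (u >> 6), 0x80 | (u & 0x3f));
    } else {
      bytes.push(0xe0 | (u >> 12), 0x80 | ((u >> 6) & 0x3f), 0x80 | (u & 0x3f));
    }
  }
  return bytes;
}

const hex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");

console.log(codeUnits);        // 2
console.log(hex(utf8));        // f0 9f 98 80
console.log(hex(cesu8(text))); // ed a0 bd ed b8 80
```

The surrogate pair only exists in the in-memory UTF-16 representation. Once the string goes through `TextEncoder`, you get the ordinary four-byte UTF-8 form, not the six-byte CESU-8 one.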