Indeed, LaTeXML (the software used by arXiv) converts LaTeX to a semantic XML document, which is then turned into HTML primarily using XSLT!
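
For anyone who hasn't seen the XML-to-HTML step in practice, here is a minimal sketch of the general idea using the XSLT 1.0 processor that ships with the JDK. The element names (`document`, `title`, `para`), the stylesheet, and the class name `XsltSketch` are all invented for illustration; this is not LaTeXML's actual schema or its stylesheets, just a toy showing how templates map semantic XML elements to HTML ones.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSketch {
    // Toy "semantic" XML. These element names are hypothetical,
    // NOT the schema LaTeXML actually emits.
    static final String XML =
        "<document><title>On Widgets</title>"
      + "<para>Widgets are useful.</para></document>";

    // Toy stylesheet: one template per semantic element,
    // each producing the corresponding HTML element.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/document\">"
      + "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match=\"title\">"
      + "<h1><xsl:value-of select=\".\"/></h1>"
      + "</xsl:template>"
      + "<xsl:template match=\"para\">"
      + "<p><xsl:value-of select=\".\"/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet, apply it to the XML, return the result as a string.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSLT));
    }
}
```

Running `main` prints an HTML fragment in which the `title` has become an `<h1>` and the `para` a `<p>`. The real pipeline is of course far larger, but the mechanism (pattern-matching templates over a semantic tree) is the same kind of thing.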