Some examples:
* Itanium has register windows.
* Itanium has register rotation, so that you can modulo-schedule a loop.
* Itanium has so many registers that a context switch involves spilling several KB of register state to memory.
* The main registers have "Not a Thing" (NaT) values so that things like speculative loads that would have trapped can be deferred instead. Handling this for register spills (or context switches!) appears to be "fun."
* It's a bi-endian architecture.
* The way you pack instructions in the EPIC encoding is... fun.
* The rules of how you can execute instructions mean that you kind of have branch delay slots, but not really.
* There are four floating-point environments because why not.
* Also, Itanium is predicated.
* The hints, oh god the hints. It feels like every time someone came up with an idea for a hint that might be useful to the processor, it was thrown in there. How is a compiler supposed to be able to generate all of these hints?
* It's an architecture that's complicated enough that you need to handwrite assembly to get good performance, but the assembly has enough arcane rules that handwriting assembly is unnecessarily difficult.
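
To give a flavor of the instruction packing point, here is a rough sketch of the bundle layout as I understand it: each 128-bit bundle holds a 5-bit template in the low bits followed by three 41-bit instruction slots. The function names `pack_bundle` and `unpack_bundle` are made up for illustration, and the sketch does not model what the template values mean (which execution unit each slot goes to, and where the stops fall), which is where most of the "fun" lives.

```python
# Minimal sketch of the IA-64 bundle layout, assuming:
#   bits 0-4     template
#   bits 5-45    slot 0
#   bits 46-86   slot 1
#   bits 87-127  slot 2
# Function names are hypothetical; template semantics are not modeled.

TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slot0, slot1, slot2):
    """Pack a template and three instruction slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for i, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError(f"slot {i} must fit in 41 bits")
        # Each slot sits directly above the template and the earlier slots.
        bundle |= slot << (TEMPLATE_BITS + SLOT_BITS * i)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, (slot0, slot1, slot2))."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + SLOT_BITS * i)) & SLOT_MASK
        for i in range(3)
    )
    return template, slots
```

Note that 5 + 3 × 41 = 128, so there is no spare bit anywhere: every constraint on which instructions can sit next to each other has to be squeezed into that 5-bit template.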