| ▲ | zamadatix 11 hours ago |
It's really only comparable to assembly-level usage in the SIMD-intrinsics style of cases. Portable SIMD, like std::simd, is no more "assembly level" than calling math functions from the standard library. Usually one only bothers with the intrinsic-level stuff for the use cases you're describing, e.g. video encoders/decoders needing hyper-optimized, per-architecture loops for the heavy lifting, where relying on the high-level SIMD abstractions can leave cycles on the table compared to directly targeting specific architectures. If you're just processing a lot of data in bulk with no real-time requirements, high-level portable SIMD is usually more than good enough.
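To make "portable SIMD" concrete, a minimal sketch (assuming nightly Rust's std::simd / portable_simd feature; the helper and the 8-lane width are illustrative):

    #![feature(portable_simd)]
    use std::simd::f32x8;

    // Scale a slice in place. The compiler maps f32x8 onto whatever
    // vector ISA the target actually has -- or onto scalar code.
    fn scale(data: &mut [f32], factor: f32) {
        let fv = f32x8::splat(factor);
        let mut chunks = data.chunks_exact_mut(8);
        for c in &mut chunks {
            let v = f32x8::from_slice(c) * fv;
            c.copy_from_slice(&v.to_array());
        }
        for x in chunks.into_remainder() {
            *x *= factor; // scalar tail for lengths not divisible by 8
        }
    }

No unsafe, no per-architecture branches; it reads like any other library call.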
|
| ▲ | taeric 11 hours ago |
My understanding was that the difficulty with the intrinsics is more in how restrictive they are about the data they take in. That is, if you are trying to be very controlling of which SIMD instructions get used, you have backed yourself into caring about the data types the CPU directly understands. To that end, even "calling math functions" is something a surprising number of developers don't do. Certainly not with the high-level data types that people often try to write their software in terms of. No?
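Roughly what that restriction looks like at the intrinsics level (a sketch assuming x86-64 with AVX; the std::arch calls are real, but the 8-lane helper is illustrative):

    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    // The intrinsics only speak the CPU's native vector type (__m256)
    // and raw f32 pointers -- domain objects have to be flattened down
    // to this before any of it applies.
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx")]
    unsafe fn scale8(src: &[f32; 8], factor: f32, dst: &mut [f32; 8]) {
        unsafe {
            let f = _mm256_set1_ps(factor);
            let v = _mm256_loadu_ps(src.as_ptr());
            _mm256_storeu_ps(dst.as_mut_ptr(), _mm256_mul_ps(v, f));
        }
    }
    // The caller is expected to check is_x86_feature_detected!("avx") first.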
| ▲ | zamadatix 9 hours ago |

More than that: many of the intrinsics can be unsafe in standard Rust. This situation got much better this year, but it's still not perfect. Portable SIMD has always been safe, because it's just a normal high-level interface.

The other half is that intrinsics are specific to the architecture. Not only do you need to make sure the CPU supports the type of operation you want to do, you also need to redo all of the work to e.g. compile for ARM on newer MacBooks (even if those chips support similar operations). This is not a problem with portable SIMD: the compiler figures out how to map the lanes to each target architecture. It will even take portable SIMD and compile it for a scalar target, so you don't have to maintain separate SIMD and non-SIMD paths.

By "calling math functions" I mean things like:

    let x = 5.0f64;
    let result = x.sqrt();

where most CPUs have a sqrt instruction, but the program will automatically compile with a (good) software substitution for targets that don't. Portable SIMD is very similar: the high-level call gets mapped to whatever the target best supports, automatically.

Neither SIMD nor these kinds of math functions work automatically with custom high-level data types. The only way to play, for those, is to write the object with custom methods that break it down to the basic types, so the compiler knows what you want the complex type's behavior to be. If you can't code that, there isn't much you can do with the object, regardless of SIMD. With intrinsics you have to go a step further beyond all that and directly tell the compiler which specific CPU instructions should be used at each step (and make sure that's done safely, for the remaining unsafe operations).
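The SIMD analogue of that sqrt example looks roughly like this (a minimal sketch, again assuming nightly Rust's std::simd; the 4-lane width is arbitrary):

    #![feature(portable_simd)]
    use std::simd::{f64x4, StdFloat};

    // Same idea as x.sqrt() on a scalar: one high-level call, no
    // per-architecture code, no unsafe. The compiler emits a hardware
    // sqrt where the target has one and a software routine elsewhere.
    fn sqrt4(v: f64x4) -> f64x4 {
        v.sqrt()
    }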
| ▲ | taeric 5 hours ago |

I knew what you meant. My point was more that most people are writing software at the level of "if (overlaps(a, b)) doSomething()". Yes, there will be plenty of math and intrinsics inside "overlaps" once you get through all of the accessors necessary to reach the raw numbers. But especially in heavily modeled spaces, the number one killer of getting to the SIMD is that the data just isn't in a friendly layout for it. Is that not the case?
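To make that layout point concrete (hypothetical types; array-of-structs vs. struct-of-arrays):

    // How modeled code usually stores things: one object per rectangle.
    // Vector lanes would have to be gathered field-by-field out of
    // scattered structs, so the layout fights the SIMD.
    struct Rect { x0: f32, y0: f32, x1: f32, y1: f32 }
    type Scene = Vec<Rect>;

    // What SIMD wants: each field contiguous, so an overlaps() kernel
    // can load eight x0 values (etc.) with a single vector load.
    struct Rects {
        x0: Vec<f32>, y0: Vec<f32>,
        x1: Vec<f32>, y1: Vec<f32>,
    }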