Remix.run Logo
TimorousBestie 6 hours ago

I don’t know much about raytracing but it’s probably tricky to orchestrate all those asin calls so that the input and output memory is aligned and contiguous. My uneducated intuition is that there’s little regularity as to which pixels will take which branches and will end up requiring which asin calls, but I might be wrong.

scottlamb 6 hours ago | parent [-]

I'd expect it to come down to data-oriented design: SoA (structure of arrays) rather than AoS (array of structures).

I skimmed the author's source code, and this is where I'd start: https://github.com/define-private-public/PSRayTracing/blob/8...

Instead of an `_objects`, I might try for a `_spheres`, `_boxes`, etc. (Or just `_lists` still using the virtual dispatch but for each list, rather than each object.) The `asin` seems to be used just for spheres. Within my `Spheres::closest_hit` (note plural), I'd work to SIMDify it. (I'd try to SIMDify the others too of course but apparently not with `asin`.) I think it's doable: https://github.com/define-private-public/PSRayTracing/blob/8...

I don't know much about ray tracers either (having only written a super-naive one back in college) but this is the general technique used to speed up games, I believe. Besides enabling SIMD, it's more cache-efficient and minimizes dispatch overhead.

edit: there's also stuff that you can hoist in this impl. Restructuring as SoA isn't strictly necessary to do that, but it might make it more obvious and natural. As an example, this `ray_dir.length_squared()` is the same for the whole list. You'd notice that when iterating over the spheres. https://github.com/define-private-public/PSRayTracing/blob/8...

def-pri-pub 3 hours ago | parent | next [-]

When I was working on this project, I was trying to restrict myself to the architecture of the original Ray Tracing in One Weekend book series. I am aware that things are not as SIMD friendly and that becomes a major bottle neck. While I am confident that an architectural change could yield a massive performance boost, it's something I don't want to spend my time on.

I think it's also more fun sometimes to take existing systems and to try to optimize them given whatever constraints exist. I've had to do that a lot in my day job already.

scottlamb an hour ago | parent [-]

I can relate to setting an arbitrary challenge for myself. fwiw, don't know where you draw the line of an architectural change, but I think that switching AoS -> SoA may actually be an approachably-sized mechanical refactor, and then taking advantage of it to SIMDify object lists can be done incrementally.

The value of course is contingent on there being a decent number of objects of a given type in the list rather than just a huge number of rays being sent to a small number of objects; I didn't evaluate that. If it's the other way around, the structure would be better flipped, and I don't know how reasonable that is with bounces (that maybe then aren't all being evaluated against the same objects?).

TimorousBestie 6 hours ago | parent | prev [-]

This tracks with my experience and seems reasonable, yes. I tend to SoA all the things, sometimes to my coworkers’ amusement/annoyance.