fleabitdev | 4 days ago
Happy to hear that they've introduced video encoders and decoders based on compute shaders. The only video codecs widely supported in hardware are H.264, H.265 and AV1, so cross-platform acceleration for other codecs will be very nice to have, even if it's less efficient than fixed-function hardware. The new ProRes encoder already looks useful for a project I'm working on.

> Only codecs specifically designed for parallelised decoding can be implemented in such a way, with more mainstream codecs not being planned for support.

It makes sense that most video codecs aren't amenable to compute shader decoding. You need tens of thousands of threads to keep a GPU busy, and you'll struggle to get that much parallelism when you have data dependencies between frames and between tiles in the same frame.

I wonder whether encoders might have more flexibility than decoders. Using compute shaders to encode something like VP9 (https://blogs.gnome.org/rbultje/2016/12/13/overview-of-the-v...) would be an interesting challenge.
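To make the encoder point a bit more concrete: motion search is the classic example of encode-side parallelism, because every macroblock can be searched independently against a fixed, already-reconstructed reference frame. A minimal CUDA sketch of that one stage (the 16x16 block size, the ±8 search window and the kernel shape are just my own illustration, not anything FFmpeg's Vulkan code actually does):

    #include <climits>
    #include <cstdint>
    #include <cstdlib>

    // One thread per 16x16 macroblock: exhaustive SAD search over a small
    // window in a fixed reference frame. Every macroblock is independent,
    // so this one stage parallelises across the whole frame.
    __global__ void sad_search(const uint8_t *cur, const uint8_t *ref,
                               int width, int height, int2 *best_mv)
    {
        const int mbs_x = width / 16, mbs_y = height / 16;
        const int mbx = blockIdx.x * blockDim.x + threadIdx.x;
        const int mby = blockIdx.y * blockDim.y + threadIdx.y;
        if (mbx >= mbs_x || mby >= mbs_y) return;

        const int x0 = mbx * 16, y0 = mby * 16;
        unsigned best = UINT_MAX;
        int2 mv = make_int2(0, 0);

        for (int dy = -8; dy <= 8; dy++) {        // +/-8 pixel search window
            for (int dx = -8; dx <= 8; dx++) {
                if (x0 + dx < 0 || y0 + dy < 0 ||
                    x0 + dx + 16 > width || y0 + dy + 16 > height)
                    continue;
                unsigned sad = 0;
                for (int y = 0; y < 16; y++)
                    for (int x = 0; x < 16; x++)
                        sad += abs(cur[(y0 + y) * width + (x0 + x)] -
                                   ref[(y0 + dy + y) * width + (x0 + dx + x)]);
                if (sad < best) { best = sad; mv = make_int2(dx, dy); }
            }
        }
        best_mv[mby * mbs_x + mbx] = mv;
    }

A 1080p frame gives you 120 x 67 ≈ 8,000 independent blocks from this stage alone, and a real encoder multiplies that by evaluating several block sizes and reference frames at once. The decode side has no equivalent, because a block can't be reconstructed until its prediction sources exist.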
happymellon | 4 days ago
> Happy to hear that they've introduced video encoders and decoders based on compute shaders.

This is great news. I remember being laughed at when I initially asked whether the Vulkan enc/dec were generic, because at the time it was all just standardising interfaces for the in-silicon acceleration.

Having these sorts of improvements available for legacy hardware is brilliant, and hopefully a first route that we can use to introduce new codecs and improve everyone's QOL.
gmueckl | 4 days ago
I haven't had even a cursory look at the state of the art in decoders for 10+ years, but my intuition says that decoding for display could profit a lot from GPU acceleration in the later parts of the process, once there is already pixel data of some sort involved. I imagine the initial decompression steps could stay on the CPU, with the decompressed but still (partially) encoded data streamed to the GPU for the final transformation steps and application to whatever I-frames and other base images there are. Steps like applying motion vectors or the iDCT look embarrassingly parallel at a pixel level to me (a rough sketch of what I mean is below). And when the resulting frame is already in a GPU texture, displaying it has fairly low overhead.

My question is: how wrong am I?
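To make the "applying motion vectors" step concrete, the shape of parallelism I have in mind is roughly one thread per output pixel. CUDA here is only a stand-in for a compute shader, and whole-pixel motion vectors, one vector per 16x16 block and frame dimensions divisible by 16 are simplifying assumptions (real codecs interpolate at sub-pixel positions):

    #include <cstdint>

    // One thread per output pixel: fetch the prediction from the reference
    // frame at (x, y) offset by the motion vector of the 16x16 block this
    // pixel belongs to, then add the decoded residual. No pixel depends on
    // any other pixel of the current frame, so one dispatch covers the frame.
    __global__ void motion_compensate(const uint8_t *ref,
                                      const int16_t *residual,
                                      const int2 *block_mv,  // one MV per 16x16 block
                                      uint8_t *out, int width, int height)
    {
        const int x = blockIdx.x * blockDim.x + threadIdx.x;
        const int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        const int2 mv = block_mv[(y / 16) * (width / 16) + (x / 16)];
        const int rx = min(max(x + mv.x, 0), width - 1);   // clamp at frame edge
        const int ry = min(max(y + mv.y, 0), height - 1);

        const int pixel = ref[ry * width + rx] + residual[y * width + x];
        out[y * width + x] = (uint8_t)min(max(pixel, 0), 255);
    }

What I'm less sure about is everything before this point, i.e. whether shipping the partially decoded data (motion vectors, residual coefficients) to the GPU costs more than it saves.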
mtillman | 4 days ago
Exciting! I am consistently blown away by the talent of the ffmpeg maintainers. This is fairly hard stuff in my opinion, and they do it for free.
dtf | 4 days ago
These release notes are very interesting! I recently spent a couple of weeks writing a ProRes decoder using WebGPU compute shaders, and it runs plenty fast enough (although I suspect Apple has some special hardware they make use of for their own implementation). I can imagine this path also working well for the new Android APV codec, if it ever becomes popular.

The ProRes bitstream spec was given to SMPTE [1], but I never managed to find any information on ProRes RAW, so it's exciting to see software and compute implementations here. Has this been reverse-engineered by the FFMPEG wizards? At first glance at the code, it does look fairly similar to regular ProRes.

[1] https://pub.smpte.org/doc/rdd36/20220909-pub/rdd36-2022.pdf
mappu | 3 days ago
NVENC/NVDEC could do part of the processing on the shader cores instead of the fixed-function hardware.