| ▲ | kstrauser 4 hours ago | |||||||
All that’s true, but those factors affect all the audio similarly. The article specifically talks about server-side ad insertion, so it’s not like the case where it somehow uses the device’s .mov codec to play the content and an MP3 codec to play the ad. All ffmpeg (most likely) knows is that it’s decoding one long stream, and doesn’t switch audio pipelines mid-stream when it thinks it might be playing an ad at that moment. Regarding the perceptual volume differences: while true, that’s also a solvable problem. Output volumes can be calculated using standard curves. In any case, TV broadcasters have had to figure all this out years ago. | ||||||||
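To make the "output volumes can be calculated" point concrete, here is a minimal sketch of level matching between program audio and an ad. It assumes plain RMS as the loudness measure, which is a simplification: broadcast practice uses weighted, gated measurements such as ITU-R BS.1770. The helper names (`rms_dbfs`, `gain_to_match`, `sine`) and the test tones are hypothetical and not taken from the article or from ffmpeg.

```python
import math

def rms_dbfs(samples):
    """Return the RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)

def gain_to_match(ad_samples, target_dbfs):
    """Linear gain that brings the ad's RMS level to target_dbfs."""
    diff_db = target_dbfs - rms_dbfs(ad_samples)
    return 10 ** (diff_db / 20)

def sine(amplitude, n=4800, freq=440.0, rate=48000):
    """Hypothetical test tone standing in for real audio."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

program = sine(0.1)   # quiet program audio
ad = sine(0.5)        # ad mastered much hotter

gain = gain_to_match(ad, rms_dbfs(program))
leveled_ad = [s * gain for s in ad]
```

A single static gain like this only matches average level. It does not account for how compression changes perceived loudness, which is the objection raised in the reply.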
| ▲ | radley 2 hours ago | parent [-] | |||||||
> those factors affect all the audio similarly... Output volumes can be calculated using standard curves... TV broadcasters have had to figure all this out years ago. Sorry, but all of that is obtuse. The fact that some digital audio can be perceived as much louder than other audio, even though it's all limited to the same digital range, proves they aren't similar at all. There is no such thing as a standard curve for compression. Source levels vary almost infinitely. Accurately separating and reducing sound after the fact, without turning the whole thing to mud, is considered an impossible technical challenge. Next, TV broadcasters worked on a predetermined schedule with predetermined advertising. This gave them time to inspect and approve ads in advance. Streaming ads are generally served just in time from third-party services to the streaming host. ffmpeg gets the output from the stream host, but the host has to combine content from multiple sources (entertainment + multiple ad servers) into that single stream. Currently, the sound level is completely at the whim of each ad server, as well as each ad producer. Meanwhile, the final output is at the whim of the streaming host: 24-hour news streaming sites probably have different audio standards than Apple TV+. Ultimately, AI could potentially be used to solve it, since it can generate (make up) new sounds as part of reverse compression. But it would still have to be done in advance by the third-party ad servers. | ||||||||
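The "same digital range, different loudness" claim can be illustrated with a small sketch. It assumes hard clipping as a crude stand-in for heavy compression or limiting, and uses RMS and crest factor as rough proxies for perceived loudness. The signals and function names are hypothetical.

```python
import math

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def rms(samples):
    """Root-mean-square level, a rough proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; lower means denser, 'louder' audio."""
    return 20 * math.log10(peak(samples) / rms(samples))

rate = 48000
# One second of a 440 Hz tone peaking at full scale.
dynamic = [math.sin(2 * math.pi * 440 * i / rate) for i in range(rate)]
# Same tone boosted 4x, then hard-clipped back into [-1.0, 1.0].
squashed = [max(-1.0, min(1.0, 4.0 * s)) for s in dynamic]

print(f"dynamic : peak={peak(dynamic):.3f} rms={rms(dynamic):.3f}")
print(f"squashed: peak={peak(squashed):.3f} rms={rms(squashed):.3f}")
```

Both signals hit the same peak, yet the clipped one carries noticeably more energy. A peak-based check alone would treat them as equally loud.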