On the surface, HDR Audio seems like an end-all, next-gen approach. It’s always cool to see new audio technologies and methodologies emerge, so I’d never want to say that there’s no need for HDR audio in the world. But it runs the risk of looking like a replacement for careful, deliberate mixing choices, and it somewhat disregards mixing as an art that requires a human touch. It reminds me of myself years ago as an amateur composer, struggling to make my mixes sound professional, then discovering a website marketing a master bus compressor that promised to “take your music to the next level”.
Never having used HDR Audio, I’m no authority on it. But the idea of mixing like that sounds incredibly unintuitive and difficult. (And setting all your various assets/events to correct real-world dB levels… I can’t even imagine monitoring a shotgun blast or explosion to set it to the correct SPL. What obscene levels are they setting their monitors to? Are they wearing ear protection while mixing? This is probably my ignorance speaking; I’m sure that’s not how they actually do it.) And as pointed out, users on that thread complain that it usually leads to constantly hunting for workarounds and fighting the system to get the mix right. So I guess, in the end, even with HDR audio in the world, there’s still no easy one-step process to making a mix work. Opinion: you can achieve a better mix by artful, deliberate choices than by mechanically mixing the “real world” across a 200 dB dynamic range and slapping a dynamic gain on the master bus.
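For anyone unfamiliar, my loose understanding of the HDR idea is: every sound is tagged with a real-world loudness, a sliding “window” tracks the loudest currently-active sound, and anything that falls below the window floor is attenuated or culled automatically. This is only a hypothetical sketch of that concept, not any actual engine’s implementation (the function name, window size, and sound list are all made up for illustration):

```python
# Hypothetical sketch of the HDR-audio "loudness window" idea, not any
# engine's real implementation. Each sound carries a real-world dB level;
# the window tops out at the loudest active sound, and anything more than
# `window_size` dB below that is culled automatically.

def hdr_mix(active_sounds, window_size=60.0):
    """active_sounds: list of (name, loudness_db).
    Returns playback gains in dB relative to the loudest sound,
    or None for sounds culled below the window floor."""
    if not active_sounds:
        return {}
    window_top = max(db for _, db in active_sounds)
    window_floor = window_top - window_size
    gains = {}
    for name, db in active_sounds:
        if db < window_floor:
            gains[name] = None  # inaudible within the current window: culled
        else:
            # Shift the whole window down so the loudest sound sits at 0 dBFS.
            gains[name] = db - window_top
    return gains

mix = hdr_mix([("explosion", 140.0), ("gunshot", 120.0), ("footsteps", 55.0)])
# explosion -> 0.0 dB, gunshot -> -20.0 dB, footsteps -> None (culled)
```

Note what this implies: the footsteps vanish entirely whenever something loud is active, with no say from the mixer. That automatic, global behavior is exactly the trade-off I’m getting at.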
The point I’m getting to is that FMOD’s dynamic mixing capabilities are awesome, and it would be unfortunate to miss out on them because you’re looking for something like HDR Audio. You can build a fantastic dynamic mixing system in FMOD, one you have full control over.
I did a VR shooter a few years ago and used this approach: I had a series of bus “stems” (LOUDEST, LOUD, MEDIUM, SOFT, AMBIENCE, MUSIC, LFE) which I used as submixes. Other, more specific busses were routed into those submixes: weapons, enemy attacks, and taking damage into LOUD; big explosions into LOUDEST; footsteps into SOFT; interactive objects, weapon handling, and enemy non-attack sounds into MEDIUM; and so on. Almost all of my mixing was done on those submix busses, using snapshots. Basically it was a hierarchy, with LOUDEST taking priority over LOUD and so on down the line, but the nice thing is that I could control each submix differently.

What if I wanted a massive explosion to go off right next to you, basically killing the ambience and the other soft and medium sounds for a moment, while still letting you hear the enemies in your face who are attacking? I might have the explosion event trigger a snapshot that ducks AMBIENCE and SOFT a bit, ducks MEDIUM a lot, but barely ducks LOUD at all. And I could take it further by ducking busses within those submixes for even finer control: during the explosion I still want to hear the enemies in the foreground, but I don’t want to clutter the space with the player’s weapon blasts, so I dip the WEAPON sub-bus too.

Then we enter a quiet area and I want the ambience to pull up, along with footsteps and quiet interactive objects, so while I’m in that area I use a mixer snapshot that pulls up only those busses (not all busses or the master bus, lest someone set off a shotgun in the quiet zone, either deafening the player or causing the kind of jarring drop in total volume a master bus limiter would, making the ambience feel weird). Effectively, this is all doing what “HDR audio” does, but selectively, with deliberation and more control.
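The explosion example above can be sketched in a few lines. This is just a toy model of the routing-and-snapshot logic, not FMOD’s actual API; the bus names come from my project, but the `Bus` and `Snapshot` classes here are hypothetical stand-ins for what FMOD Studio’s mixer and snapshots give you out of the box:

```python
# Toy model of the submix-and-snapshot approach described above.
# Bus names are from the post; the classes are hypothetical stand-ins
# for FMOD Studio's mixer hierarchy and snapshot system.

class Bus:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.fader_db = name, parent, 0.0

    def effective_db(self):
        # Gain accumulates down the routing chain, e.g. WEAPON -> LOUD -> master.
        return self.fader_db + (self.parent.effective_db() if self.parent else 0.0)

class Snapshot:
    """A named set of per-bus fader offsets, applied and removed as a unit."""
    def __init__(self, name, offsets_db):
        self.name, self.offsets_db = name, offsets_db

    def start(self):
        for bus, db in self.offsets_db.items():
            bus.fader_db += db

    def stop(self):
        for bus, db in self.offsets_db.items():
            bus.fader_db -= db

# The submix hierarchy from the post (a subset of it).
loud = Bus("LOUD")
medium, soft, ambience = Bus("MEDIUM"), Bus("SOFT"), Bus("AMBIENCE")
weapon = Bus("WEAPON", parent=loud)  # player weapons route into LOUD

# Snapshot triggered by the big explosion: duck AMBIENCE/SOFT a bit,
# MEDIUM a lot, barely touch LOUD, but dip the WEAPON sub-bus inside it.
explosion_duck = Snapshot("ExplosionDuck", {
    ambience: -6.0, soft: -6.0, medium: -18.0, loud: -2.0, weapon: -8.0,
})

explosion_duck.start()
print(weapon.effective_db())   # -10.0: its own -8 plus LOUD's -2
explosion_duck.stop()
print(weapon.effective_db())   # back to 0.0 once the snapshot releases
```

The key design point this illustrates: because each snapshot touches specific busses rather than the master, enemy attacks on LOUD stay nearly untouched while everything else gets out of the way, which a single master-bus gain rider can’t do.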