HDR Audio in FMOD

Does FMOD have any functionality similar to Wwise’s HDR Audio feature?

It would be valuable for the game I am working on to be able to add a real-life volume parameter to all sounds (like 75dBA for a toilet flush, 150dBA for a shotgun shot), and then have the audio engine scale the playback’s dynamic range based on what sounds are playing.

If it’s not facilitated by native FMOD tools, are there any plugins that could help? Or any workflows that you know of that could be useful for achieving a similar result?

FMOD does not have an HDR system built in, and it’s not really something that could be implemented via a plugin. Instead you would need to look into sidechaining and the compressor effect to control the mix. I see your post on Facebook: Game Audio Denizens is getting a lot of responses, though. As discussed there, HDR is a very bespoke solution and not always a good fit, which is why we haven’t invested time in developing one like AK have.

On the surface, HDR Audio seems like an end-all, next-gen approach. It’s always cool to see new audio technologies and methodologies emerge, so I’d never want to say that there’s no need for HDR audio in the world. But it seems to run the risk of looking like a replacement for careful, deliberate mixing choices, and it somewhat disregards mixing as an art that requires human touch. It reminds me of myself years ago as an amateur composer, having trouble with making mixes sound professional, then discovering a website marketing a master bus compressor that will “take your music to the next level”.

Never having used HDR Audio, I’m no authority on it. But the idea of mixing like that sounds incredibly unintuitive and difficult. (And setting all your various assets/events to correct real-world dBs… I can’t even imagine monitoring a shotgun blast or explosion to set it to the correct SPL. What obscene levels are they setting their monitors to??? Are they wearing ear protection while mixing? This is probably my ignorance speaking; I’m sure this is not how they do it.) And as pointed out, users on that thread complain that it usually leads to constantly finding workarounds and fighting against the system to get the mix right. So I guess, in the end, even with HDR audio in the world there’s still no easy one-step process to making a mix work. Opinion: You can achieve a better mix by artful and deliberate choices than by mechanically mixing the “real world” at a 200dB dynamic range and slapping a dynamic gain on the master bus. :slightly_smiling_face:

The point I’m getting to is FMOD’s dynamic mixing capabilities are awesome, and it would be unfortunate if you missed out on them because you’re looking for something like HDR Audio. You can create a fantastic dynamic system with FMOD which you can have full control over.

I did a VR shooter a few years ago and used this approach: I had a series of bus “stems” (LOUDEST, LOUD, MEDIUM, SOFT, AMBIENCE, MUSIC, LFE) which I used as submixes. Other, more specific busses were routed into those submixes (weapons, enemy attacks, and taking damage were routed into LOUD; big explosions into LOUDEST; footsteps into SOFT; interactive objects, weapon handling, and enemy non-attack sounds into MEDIUM; and so on). Almost all of my mixing was done on those submix busses, using snapshots. Basically it was a hierarchy, with LOUDEST taking priority over LOUD, and so on. But the nice thing is that I could control each submix differently.

What if I wanted to have a massive explosion go off right next to you, which would basically kill the ambience and other soft and medium sounds for a moment, but I still wanted to hear the enemies in my face who are attacking me? I might have the explosion event trigger a snapshot which would duck AMBIENCE and SOFT a bit, duck MEDIUM a lot, but not duck LOUD much at all. And I could take it further by ducking busses within those submixes for more control. During the explosion I still want to hear the enemies in the foreground, but I don’t want to clutter the space with player weapon blasts, so I dip the WEAPON sub-bus too.

Then we enter a quiet area and I want the ambience to pull up, along with footsteps and quiet interactive objects, so I use a mixer snapshot while I’m in that area that pulls only those busses up. I don’t pull up all busses or the master bus, lest someone set off a shotgun in this quiet zone, resulting in either deaf players or too jarring a drop in total volume, like a master bus limiter would cause, making the ambience feel weird. Effectively, this is all doing what “HDR audio” does but selectively, with deliberation and more control.
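To make the routing and ducking idea concrete, here is a rough sketch in plain Python. It is only a toy model of the bookkeeping, not FMOD's API and not how FMOD Studio actually evaluates snapshots. The dB offsets, the `PARENT` routing table, and the names `Snapshot`, `apply_snapshots`, and `effective_db` are all made up for illustration, and summing offsets from several active snapshots is an assumption of this sketch.

```python
from dataclasses import dataclass

# The submix "stems" named in the post.
STEMS = ["LOUDEST", "LOUD", "MEDIUM", "SOFT", "AMBIENCE", "MUSIC", "LFE"]

# Hypothetical routing of a more specific bus into a stem.
PARENT = {"WEAPON": "LOUD"}


@dataclass
class Snapshot:
    name: str
    offsets_db: dict  # bus name -> gain offset in dB (illustrative values)


# Explosion: duck AMBIENCE and SOFT a bit, MEDIUM a lot, LOUD barely,
# and dip the WEAPON sub-bus as well.
EXPLOSION = Snapshot("explosion", {
    "AMBIENCE": -6.0, "SOFT": -6.0, "MEDIUM": -18.0,
    "LOUD": -2.0, "WEAPON": -9.0,
})

# Quiet area: pull up only the quiet busses, never the master.
QUIET_AREA = Snapshot("quiet_area", {"AMBIENCE": 4.0, "SOFT": 4.0})


def apply_snapshots(active):
    """Sum the dB offsets of all active snapshots per bus."""
    gains = {bus: 0.0 for bus in STEMS + list(PARENT)}
    for snap in active:
        for bus, offset in snap.offsets_db.items():
            gains[bus] += offset
    return gains


def effective_db(bus, gains):
    """Gain heard on a bus: its own offset plus every parent's offset."""
    total = gains[bus]
    while bus in PARENT:
        bus = PARENT[bus]
        total += gains[bus]
    return total
```

With only `EXPLOSION` active, `effective_db("WEAPON", apply_snapshots([EXPLOSION]))` combines the WEAPON dip with the small LOUD duck, while LOUDEST stays untouched. The point is that each stem gets its own deliberate offset instead of one gain change on the master bus.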

I appreciate your reply. It’s nice to have a definitive answer so I’m not chasing things that don’t exist. It totally makes sense why it wouldn’t be a priority for the FMOD team. It also seems somewhat antithetical to the design ethos of FMOD, which I perceive as fairly streamlined and accessible.

I hope perhaps this conversation is searchable too. My motivation for reaching out on this subject is it is fairly challenging to search for information about so-called “HDR Audio”. Searching stuff like “FMOD HDR” gets you games that support HDR video and use the FMOD middleware.

I’m certainly motivated to stick with FMOD for its accessibility for indie developers. My current project could benefit from the HDR audio workflow, despite its downsides. It does hold benefits for realistic simulations or large open-world games that have such unpredictable audio interactions that you might want to deal with it in a programmatic way rather than trying to mix for every scenario.

The Game Audio Denizens post has been very helpful as a window into the wider use of HDR Audio, or maybe it would be more appropriate to call it something like ‘Variable Dynamic Range’ Audio. It is also interesting to see the disdain that some have for the approach. I think many see it as limiting their agency in controlling the mix, which is fair. I think it is more enticing if you are approaching it from the perspective of a Technical Audio Designer and are more comfortable with approaching mixing by algorithm.

No one was talking about FMOD in that post though, so I also wanted to bring the discussion here in case there were FMOD-specific insights.

The game audio equivalent of LANDR is a horror I’d rather not imagine. I agree that programmatic approaches to mixing (or mastering) are generally problematic and fail to deliver on their promise. I think though, for a technically-minded sound designer, the concept of HDR audio could be another tool for them to make a multitude of deliberate mixing choices in an algorithmic way. But like you said, you’re not dismissing it, just acknowledging that it treads on a historically problematic path.

The way I imagine using it is tagging all assets with the approximate dBA level (a positive value) the sound would have at the emission source. Let’s call it the “emission volume”. One could measure this at the time of recording samples with a dB meter, but there are plenty of charts online that give you approximate values for common items that should be close enough. This dBA value is just a parameter tagged to the asset, not related to the gain setting of the file. The file itself would be normalized in whatever way seems most useful… probably loudness normalization.

The audio engine would evaluate the samples that are currently playing back, also considering their distance from the listener to scale their emission volume parameter accordingly. It essentially gives a detailed hierarchy to the various sounds, with some semblance of relative volume differences between them. This information lets the audio engine decide what the dynamic range of playback will be without having to use audio effects like compression or dynamic gain. You could then have a playback range that configures the loudest sounds to play at your maximum threshold (say -10dB) and the softest sounds to play at your minimum (say -60dB), and culls sounds with an emission volume parameter that is too low relative to the loudest sound (maybe you configure it to cull sounds whose emission parameter minus distance adjustment is less than half of the loudest sound’s emission parameter minus distance adjustment).
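Here is a minimal Python sketch of that idea, just to pin down the arithmetic. It is not how Wwise or any shipping HDR system works. The inverse-distance falloff (about 6 dB per doubling of distance), the linear mapping between the cull threshold and the loudest sound, and the names `Voice`, `perceived_level`, and `hdr_mix` are all my own assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    emission_dba: float  # tagged "emission volume" at the source
    distance_m: float    # distance from the listener in metres


def perceived_level(voice, ref_distance_m=1.0):
    """Emission volume minus a distance adjustment.

    Assumes inverse-distance falloff relative to a 1 m reference,
    with no attenuation inside the reference distance.
    """
    d = max(voice.distance_m, ref_distance_m)
    return voice.emission_dba - 20.0 * math.log10(d / ref_distance_m)


def hdr_mix(voices, max_out_db=-10.0, min_out_db=-60.0, cull_ratio=0.5):
    """Map each playing voice to an output gain in dB, or cull it.

    The loudest voice plays at max_out_db. Voices below
    cull_ratio * loudest are dropped. Everything in between is
    interpolated linearly (an assumption of this sketch).
    """
    if not voices:
        return {}
    levels = {v.name: perceived_level(v) for v in voices}
    loudest = max(levels.values())
    cull_below = loudest * cull_ratio
    window = loudest - cull_below
    out = {}
    for name, level in levels.items():
        if window <= 0:
            # Degenerate window (non-positive levels): play at the maximum.
            out[name] = max_out_db
            continue
        if level < cull_below:
            continue  # too quiet relative to the loudest sound
        t = (level - cull_below) / window
        out[name] = min_out_db + t * (max_out_db - min_out_db)
    return out
```

For example, with a 150 dBA shotgun and a 75 dBA toilet flush both at 1 m, the shotgun lands at the -10dB ceiling and the flush sits right on the cull threshold at -60dB. Move the flush a little further away and it is culled entirely until the shotgun stops playing.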

So you wouldn’t need ear protection while mixing. :wink: The “real-world” values are just a parameter that establishes priority in a very granular way; you don’t actually play back at the real-world volume level.

I don’t think the objective is an easy one-step process to making a mix work. Just typing this out was difficult, and this is a simplified version. I think the idea is to have this crazy granular hierarchy system that might be more performant than using side-chains but also might have a better result than simple ducking.

I’ve only recently become aware that there are systems like this, so I can’t speak for their implementations. But I don’t see any reason this couldn’t be designed to work alongside classic mixing techniques if you set it up right. You could make all diegetic sounds use the emission volume parameter, while non-diegetic sounds (HUD, menu, powerups?, music) would ignore that system.

I’m not going to abandon FMOD over it, and I’m going to make full use of its dynamic mixing capabilities. Just want to make sure I’m not missing something because of a knowledge gap. I’m only starting to get an idea of where the industry is at with this HDR Audio concept.

Also I appreciate the insight into your mixing technique. Lots of thoughtful ideas there to consider.

While researching something else, I stumbled upon another thread about HDR audio. Linking here for posterity:
