Using FMOD with voice chat utilizing OnAudioFilterRead()

I’m trying to integrate the Normcore multiplayer SDK into our project. They offer low-latency voice chat by utilizing OnAudioFilterRead(), playing a dummy clip, and then using an audio effect to inject voice data. I’d like to use FMOD snapshots in our project with the voice chat system - is this possible given the above info? Or is OnAudioFilterRead() strictly for use with the Unity audio engine and therefore not accessible with FMOD? Any guidance would be greatly appreciated. Thank you!

You should be able to grab data from OnAudioFilterRead and send it through FMOD using a programmer instrument, thus allowing you to take advantage of snapshots. The process would be something like this:
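Roughly along these lines - this is an untested sketch, with "event:/Voice" standing in for whatever event you build around a programmer instrument:

    using System;
    using System.Runtime.InteropServices;
    using FMOD;
    using FMODUnity;
    using UnityEngine;

    public class VoiceToFMOD : MonoBehaviour
    {
        static readonly FMOD.Studio.EVENT_CALLBACK EventCallback = OnProgrammerEvent;

        int _outputSampleRate;

        void Start()
        {
            // Cache on the main thread; OnAudioFilterRead runs on the audio thread.
            _outputSampleRate = AudioSettings.outputSampleRate;
        }

        void OnAudioFilterRead(float[] data, int channels)
        {
            // 1. Copy the buffer Unity hands us into a user-created FMOD Sound.
            var info = new CREATESOUNDEXINFO
            {
                cbsize = Marshal.SizeOf(typeof(CREATESOUNDEXINFO)),
                length = (uint)(data.Length * sizeof(float)),
                format = SOUND_FORMAT.PCMFLOAT,
                numchannels = channels,
                defaultfrequency = _outputSampleRate
            };
            RuntimeManager.CoreSystem.createSound("", MODE.OPENUSER, ref info, out Sound sound);
            sound.@lock(0, info.length, out IntPtr ptr1, out IntPtr ptr2, out uint len1, out uint len2);
            Marshal.Copy(data, 0, ptr1, (int)(len1 / sizeof(float)));
            if (len2 > 0)
            {
                Marshal.Copy(data, (int)(len1 / sizeof(float)), ptr2, (int)(len2 / sizeof(float)));
            }
            sound.unlock(ptr1, ptr2, len1, len2);

            // 2. Hand the Sound to a programmer instrument via user data + a callback.
            var instance = RuntimeManager.CreateInstance("event:/Voice");
            GCHandle handle = GCHandle.Alloc(sound);
            instance.setUserData(GCHandle.ToIntPtr(handle));
            instance.setCallback(EventCallback);
            instance.start();
            instance.release();
        }

        [AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
        static RESULT OnProgrammerEvent(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
        {
            var instance = new FMOD.Studio.EventInstance(instancePtr);
            instance.getUserData(out IntPtr userData);
            var handle = GCHandle.FromIntPtr(userData);

            if (type == FMOD.Studio.EVENT_CALLBACK_TYPE.CREATE_PROGRAMMER_SOUND)
            {
                // 3. Point the programmer instrument at the user-created Sound.
                var props = (FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(
                    parameterPtr, typeof(FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES));
                props.sound = ((Sound)handle.Target).handle;
                props.subsoundIndex = -1;
                Marshal.StructureToPtr(props, parameterPtr, false);
            }
            else if (type == FMOD.Studio.EVENT_CALLBACK_TYPE.DESTROYED)
            {
                // Clean up the user sound and free the handle once the event instance is destroyed.
                ((Sound)handle.Target).release();
                handle.Free();
            }
            return RESULT.OK;
        }
    }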


@jeff_fmod Thank you very much for the reply! We’ll try this out and report back.

Hey Jeff, thanks for this. I was looking into this because I'm working on something similar, but I've now hit a wall because these examples weren't that clear to me, to be honest. Here's what I have so far inside OnAudioFilterRead:

        {

            uint lenbytes = (uint)(data.Length * channels * sizeof(float));
            Sound sound = new Sound();
            
            var result = sound.@lock(0, lenbytes, out var ptr1, out var ptr2, out var len1, out var len2);

            if (result != RESULT.OK)
            {
                Debug.LogError("Error! " + result);
                return;
            }

            var instance = FMODUnity.RuntimeManager.CreateInstance("event:/Music_Mixed/User_Voice");
            //What to actually pass in here, I have two potential IntPtrs and setUserData only accepts one 
            instance.setUserData()

Any help would be much appreciated.
Thanks!

If you find yourself needing to pass two values into setUserData, you can just create a new class that holds both values - this is what we do in the Timeline example by creating the TimelineInfo class to hold both the bar and marker values.
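For example (the names here are just placeholders, and the second field is simply "some other value"):

    // Hypothetical holder for everything the callback needs.
    class VoiceCallbackData
    {
        public FMOD.Sound Sound;
        public int SampleRate;
    }

    // When starting the event instance:
    var callbackData = new VoiceCallbackData { Sound = sound, SampleRate = sampleRate };
    GCHandle handle = GCHandle.Alloc(callbackData);
    instance.setUserData(GCHandle.ToIntPtr(handle));

    // Inside the event callback:
    instance.getUserData(out IntPtr userDataPtr);
    var data = (VoiceCallbackData)GCHandle.FromIntPtr(userDataPtr).Target;
    // ...use data.Sound / data.SampleRate, and Free() the handle when the event is DESTROYED.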

Hey Jeff, thanks for the answer! Figured out that part, but I'm actually having a hard time creating the sound object.

        void OnAudioFilterRead(float[] data, int channels)
        {
            if(data == null || data.Length == 0)
                return;
            var result = ConvertArrayToFMODSound(data, channels, out var sound);
            if (result != RESULT.OK)
            {
                Debug.LogError($"Error creating the sound! Got this from the FMOD {result}");
                return;
            }
            //Optimize this with array of event instances pool?
            var programmerSound = RuntimeManager.CreateInstance("event:/Music_Mixed/User_Voice");
            GCHandle soundHandle = GCHandle.Alloc(sound, GCHandleType.Pinned);
            programmerSound.setUserData(GCHandle.ToIntPtr(soundHandle));
            programmerSound.setCallback(audioCallback);
            programmerSound.start();
            programmerSound.release();
        }



        private RESULT ConvertArrayToFMODSound(float[] data, int channels, out Sound sound)
        {
            //Tried with data.length * channels as well, but same results
            uint lenBytes = (uint)(data.Length * sizeof(float));

            CREATESOUNDEXINFO soundInfo = new CREATESOUNDEXINFO();
            soundInfo.length = lenBytes;
            soundInfo.format = SOUND_FORMAT.PCMFLOAT;
            soundInfo.numchannels = channels;
            //What should be set in here?
            //soundInfo.defaultfrequency = ??
            
            RuntimeManager.CoreSystem.createSound("", MODE.OPENUSER, ref soundInfo, out sound);

            IntPtr ptr1, ptr2;
            uint len1, len2;
            sound.@lock(0, lenBytes, out ptr1, out ptr2, out len1, out len2);
            Marshal.Copy(data, 0, ptr1, (int)(len1 / sizeof(float)));
            if (len2 > 0)
            {
                Marshal.Copy(data, (int)(len1 / sizeof(float)), ptr2, (int)(len2 / sizeof(float)));
            }
            sound.unlock(ptr1, ptr2, len1, len2);
            Debug.Log(sound.hasHandle());
            var result = sound.setMode(MODE.LOOP_NORMAL);
            
            return result;
        }
    


    [MonoPInvokeCallback(typeof(EVENT_CALLBACK))]
    static RESULT AudioEventCallback(EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
    {
        EventInstance instance = new EventInstance(instancePtr);

        // Retrieve the user data
        instance.getUserData(out var pointer);

        // Get the Sound object back from the GCHandle
        GCHandle soundHandle = GCHandle.FromIntPtr(pointer);
        Sound sound = (Sound)soundHandle.Target;

        switch (type)
        {
            case EVENT_CALLBACK_TYPE.CREATE_PROGRAMMER_SOUND:
            {
                var parameter = (PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(parameterPtr, typeof(PROGRAMMER_SOUND_PROPERTIES));
                parameter.sound = sound.handle;
                parameter.subsoundIndex = -1;
                Marshal.StructureToPtr(parameter, parameterPtr, false);
                break;
            }
            case EVENT_CALLBACK_TYPE.DESTROY_PROGRAMMER_SOUND:
            {
                var parameter = (PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(parameterPtr, typeof(PROGRAMMER_SOUND_PROPERTIES));
                sound.release();
                sound = new Sound(parameter.sound);
                sound.release();
                break;
            }
            case EVENT_CALLBACK_TYPE.DESTROYED:
            {
                // Now the event has been destroyed, unpin the sound handle so it can be garbage collected
                soundHandle.Free();
                break;
            }
        }
        return RESULT.OK;
    }
        
    }

I always get the INVALID_PARAM error as the RESULT from ConvertArrayToFMODSound. Any help would be much appreciated!

I think this is because you are missing the cbsize field, which just needs to be set to the size of the CREATESOUNDEXINFO struct, e.g.

soundInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));

As for defaultfrequency, this would usually be either 44100 or 48000; in any case it ought to match the sample rate your audio card is using. In Unity this is stored in AudioSettings.outputSampleRate.
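For example (untested), cached once on the main thread and reused when filling out CREATESOUNDEXINFO:

    int systemSampleRate = AudioSettings.outputSampleRate;   // cache this on the main thread, e.g. in Start()

    var soundInfo = new FMOD.CREATESOUNDEXINFO
    {
        cbsize           = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO)),
        length           = (uint)(data.Length * sizeof(float)),
        format           = FMOD.SOUND_FORMAT.PCMFLOAT,
        numchannels      = channels,
        defaultfrequency = systemSampleRate
    };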

I agree - perhaps an instance pool, or storing the audio from OnAudioFilterRead in a ring buffer and periodically firing off instances, would help. But optimizations are for after it's working!
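If you do go down the ring buffer route later, a rough sketch (untested, all names are placeholders) could look like this - write from the audio thread in OnAudioFilterRead and drain from the main thread before firing off instances:

    using System;

    public class VoiceRingBuffer
    {
        readonly float[] _buffer;
        readonly object _gate = new object();
        int _writePos, _readPos, _count;

        public VoiceRingBuffer(int capacity) { _buffer = new float[capacity]; }

        // Called from OnAudioFilterRead (audio thread).
        public void Write(float[] data)
        {
            lock (_gate)
            {
                foreach (var sample in data)
                {
                    _buffer[_writePos] = sample;
                    _writePos = (_writePos + 1) % _buffer.Length;
                    if (_count < _buffer.Length) _count++;
                    else _readPos = (_readPos + 1) % _buffer.Length; // overwrite oldest
                }
            }
        }

        // Called from Update() on the main thread; returns how many samples were read.
        public int Read(float[] destination)
        {
            lock (_gate)
            {
                int toRead = Math.Min(_count, destination.Length);
                for (int i = 0; i < toRead; i++)
                {
                    destination[i] = _buffer[_readPos];
                    _readPos = (_readPos + 1) % _buffer.Length;
                }
                _count -= toRead;
                return toRead;
            }
        }
    }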

Thank you so much for the answer, we’re getting there! :grinning_face_with_smiling_eyes:

Now I'm getting this exception:

ArgumentException: start_index + length > array length

            // Changed it from data.Length to _systemSampleRate, still getting that exception
            uint lenBytes = (uint)(_systemSampleRate * channels * sizeof(float));

            CREATESOUNDEXINFO soundInfo = new CREATESOUNDEXINFO();
            soundInfo.length = lenBytes;
            soundInfo.format = SOUND_FORMAT.PCMFLOAT;
            soundInfo.numchannels = channels;
            soundInfo.defaultfrequency = _systemSampleRate;
            soundInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));
            
            RuntimeManager.CoreSystem.createSound("", MODE.OPENUSER, ref soundInfo, out sound);

            IntPtr ptr1, ptr2;
            uint len1, len2;
            sound.@lock(0, lenBytes, out ptr1, out ptr2, out len1, out len2);
           //Exception happens below
            Marshal.Copy(data, 0, ptr1, (int)(len1 / sizeof(float)));
            if (len2 > 0)
            {
                Marshal.Copy(data, (int)(len1 / sizeof(float)), ptr2, (int)(len2 / sizeof(float)));
            }

I have no idea what I’m doing wrong, if you could point it out it would be amazing. Thanks again!

You had it right before - it should be

uint lenBytes = (uint)(data.Length * sizeof(float));

Other things to check:

  • Make sure _systemSampleRate has a non-zero value
  • Put an error check on the CoreSystem.createSound line and return early if it failed (see the snippet below)
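Something like this, for example:

    var createResult = RuntimeManager.CoreSystem.createSound("", MODE.OPENUSER, ref soundInfo, out sound);
    if (createResult != RESULT.OK)
    {
        Debug.LogError($"createSound failed: {createResult}");
        return createResult;
    }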

Well, combining Normcore's functions with this did in fact make the sound go through FMOD! So thank you so much for that!
So there is another issue which I think is a bit trickier... spatial audio is completely gone when the RealTimeAvatarVoice component is attached, and that component is also responsible for the voice chat itself. I don't see any reason why those two would be mutually exclusive, so I'm sending both of the scripts now - maybe you can see some clear indication of what's breaking. When the script is not there, spatial audio works, but voice chat doesn't...

Greenlight 1 - Pastebin.com (this is the audio output script)
Greenlight 2 - Pastebin.com

Hey @jeff_fmod - just to clarify what @kantagara is saying: he got the scripts working to the point where it seems like we're getting input and output from the mics across app instances, but this is only working with ‘Disable Unity Audio’ unchecked. When it's unchecked, it breaks all the spatial FMOD events and snapshots, and everything is 2D. But when it's checked (as FMOD asks you to do in the setup process), the voice chat scripts do not work. We're confused about how to integrate the voice chat without breaking the FMOD spatial/snapshot setup. Thanks for your continued help.

Leaving Unity Audio enabled would really only be a problem on Xbox; it looks like you are using Android/Oculus, so it shouldn't cause any issues leaving it enabled, and it certainly shouldn't be breaking FMOD's spatialization.
When you say it's breaking all spatialization, do you mean that even events played on other objects in different locations are coming out 2D as well, or is it just the programmer event / mic audio that you can't get spatialization on?
In any case, I have not been able to reproduce the issue with spatialization on my end. To start with, in GreenLightAudioOutput on line 159:

programmerSound.set3DAttributes(position.To3DAttributes());

Can you please confirm that position.To3DAttributes() is returning the correct location of the gameObject and not just {0,0,0} or something like that?

@jeff_fmod I’m so sorry - I jumped in and made this more confusing. I’m wrong - it’s not the ‘Disable Unity Audio’ checkbox that’s breaking the spatialization. It’s only the script that @kantagara posted. We are using iOS.

@jeff_fmod actually the thing that breaks spatialization is GreenLightRealTimeAvatarVoice; when we remove that component, spatialization works as expected, but it's needed in order for realtime voice chat to work properly.

To answer your question, set3DAttributes uses appropriate values and it's not Vector3.zero, because I'm caching the position in the update loop. I debug-logged it and that confirmed it as well.

Thank you both for clarifying there - I believe I have reproduced the issue using the scripts you provided. Do you have listeners on your Player prefabs? If so, I have found that removing the listeners from new arrivals (guests?) has allowed spatialization to behave as expected. Perhaps you could make some kind of manager where you remove listeners from everything that isn't your player?
If that doesn't apply to your setup / doesn't fix your issue, I might need a repro to get a better understanding of how your scene is set up.
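Something along these lines might work as a starting point - untested, and it assumes your player prefabs carry an FMODUnity.StudioListener and that you know which avatar is the local player:

    using FMODUnity;
    using UnityEngine;

    public static class ListenerCleanup
    {
        // Call this whenever a player avatar spawns.
        public static void StripRemoteListeners(GameObject playerRoot, bool isLocalPlayer)
        {
            if (isLocalPlayer)
                return;

            // Remote players keep their audio sources/events but lose their listener,
            // so only the local player's listener drives spatialization.
            foreach (var listener in playerRoot.GetComponentsInChildren<StudioListener>())
            {
                Object.Destroy(listener);
            }
        }
    }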

We don't have more than one audio listener in the entire scene; it's attached to a single object that's not getting instantiated for every arriving client, and this in fact happens even when there's only one player. Are you talking about the audio listener only, or some other type of listener?

@jeff_fmod any luck with reproing the issue?

I have reproduced the issue on iOS as you've described, and I have found that it also reproduces with regular Unity Audio components (Audio Listener, Audio Source with 3D spatial blend), so this doesn't seem to be an issue with FMOD.
On my device spatial audio exists for a second or so, ceases when the VR Player arrives, and returns if I leave the app and re-enter - is this the case for you too?

Hey @jeff_fmod, thanks a bunch for getting that repro. Indeed, the case is the same for us; we'll get in touch with Normcore so they can fix the issue.
And of course, everything you said about how to get the audio working properly is indeed working, with a minor modification where I had to divide the default frequency by the number of channels - then everything started utilizing FMOD's audio properly (we could hear the reverb and everything). Pasting the code here so some other fellow developer might find it useful :slight_smile:

                // The audio data is first processed via Normcore's AudioOutputStream so we get the proper
                // audio data back - that was also one of the reasons I didn't get any sound from the data[]
                // array in OnAudioFilterRead.
                var result = ConvertArrayToFMODSound(_audioData, channels, out var sound);
                if (result != RESULT.OK)
                {
                    Debug.LogError($"Error creating the sound! Got this from FMOD: {result}");
                    return;
                }

                var programmerSound = RuntimeManager.CreateInstance("event:/Music_Mixed/User_Voice");
                programmerSound.set3DAttributes(position.To3DAttributes());
                GCHandle soundHandle = GCHandle.Alloc(sound, GCHandleType.Pinned);
                programmerSound.setUserData(GCHandle.ToIntPtr(soundHandle));
                programmerSound.setCallback(audioCallback);
                programmerSound.start();
                programmerSound.release();

        private RESULT ConvertArrayToFMODSound(float[] data, int channels, out Sound sound)
        {
            uint lenBytes = (uint)(data.Length * sizeof(float));

            // Bail out before creating the sound if the cached sample rate is invalid.
            if (_systemSampleRate == 0)
            {
                Debug.LogError("Umm, system sample rate is 0, that's odd");
                sound = new Sound();
                return RESULT.ERR_FORMAT;
            }

            CREATESOUNDEXINFO soundInfo = new CREATESOUNDEXINFO();
            soundInfo.length = lenBytes;
            soundInfo.format = SOUND_FORMAT.PCMFLOAT;
            soundInfo.numchannels = channels;
            soundInfo.defaultfrequency = _systemSampleRate / channels;
            soundInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));

            var res = RuntimeManager.CoreSystem.createSound("", MODE.OPENUSER, ref soundInfo, out sound);
            if (res != RESULT.OK)
            {
                Debug.LogError($"Result is not valid. It should be OK but it's {res}");
                return res;
            }

            IntPtr ptr1, ptr2;
            uint len1, len2;
            sound.@lock(0, lenBytes, out ptr1, out ptr2, out len1, out len2);
            Marshal.Copy(data, 0, ptr1, (int)(len1 / sizeof(float)));
            if (len2 > 0)
            {
                Marshal.Copy(data, (int)(len1 / sizeof(float)), ptr2, (int)(len2 / sizeof(float)));
            }
            sound.unlock(ptr1, ptr2, len1, len2);
            var result = sound.setMode(MODE.LOOP_NORMAL);

            return result;
        }
    


Hey @jeff_fmod - would you mind walking me through how to reproduce your test of this with Unity Audio? Thanks!