Latency with FMOD Record/Programmer instrument

Hi there,

I am making a game where I take the player’s mic input, send it to a programmer instrument for real-time processing, then play it back. I get a consistent 0.5 second latency. When I took out the programmer instrument portion, the latency remained, so I figure it probably occurs in the recording part. As someone who is not a programmer, I have found that reading forum threads about similar issues and the Core API reference page only helps up to a point. I would love some help!

Unity: 2022.3.441f1
FMOD Studio 2.02.21
Computer: MacBook Pro (2020, M1, 16GB Memory), Big Sur 11.7.2 (yes, I am aware Mac is not the best option for game development, and I am thinking about getting a PC)

Here’s the code for the FMOD recording/programmer instrument (it’s a merge of scripts):

using System;
using System.Collections.Generic;
using UnityEngine;
using System.Runtime.InteropServices;
using FMOD.Studio;

public class ProgrammingInstrument : MonoBehaviour
{


    

    /// <summary>
    /// Records into an FMOD.Sound, then assigns it to a programmer instrument
    /// </summary>

    FMOD.Studio.EVENT_CALLBACK dialogueCallback;
    public FMODUnity.EventReference EventName;

#if UNITY_EDITOR
    void Reset()
    {
        EventName = FMODUnity.EventReference.Find("event:/Record");
    }
#endif

    //Recording variables

    private uint samplesRecorded, samplesPlayed = 0;
    private int nativeRate, nativeChannels = 0;
    private uint recSoundLength = 0;

    //FMOD Sound variables
    private FMOD.CREATESOUNDEXINFO exInfo = new FMOD.CREATESOUNDEXINFO();
    public FMOD.Sound recSound;
    private FMOD.Channel channel;

    //Programmer instrument
    public IntPtr TempInstPtr;
    public IntPtr TempParaPtr;

    void Start()
    {
        //recording
        /*
            Determine latency in samples.
        */
        FMODUnity.RuntimeManager.CoreSystem.getRecordDriverInfo(0, out _, 0, out _, out nativeRate, out _, out nativeChannels, out _);

        /*
            Create user sound to record into, then start recording.
        */
        exInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));
        exInfo.numchannels = nativeChannels;
        exInfo.format = FMOD.SOUND_FORMAT.PCM16;
        exInfo.defaultfrequency = nativeRate;
        exInfo.length = (uint)(nativeRate * sizeof(short) * nativeChannels);

        //Create recSound as an FMOD Sound and record into it 
        FMODUnity.RuntimeManager.CoreSystem.createSound("Record", FMOD.MODE.LOOP_NORMAL | FMOD.MODE.OPENUSER, ref exInfo, out recSound);
        FMODUnity.RuntimeManager.CoreSystem.recordStart(0, recSound, true);
        recSound.getLength(out recSoundLength, FMOD.TIMEUNIT.PCM);

        //FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out FMOD.ChannelGroup mCG);
        //FMODUnity.RuntimeManager.CoreSystem.playSound(recSound, mCG, false, out channel);

        /// <summary>
        /// Create switch for programmer instrument
        /// </summary>


        // Explicitly create the delegate object and assign it to a member so it doesn't get freed
        // by the garbage collector while it's being used
        dialogueCallback = new FMOD.Studio.EVENT_CALLBACK(DialogueEventCallback);
        TempParaPtr = System.IntPtr.Zero;


        [AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
        FMOD.RESULT DialogueEventCallback(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
        {
            FMOD.Studio.EventInstance instance = new FMOD.Studio.EventInstance(instancePtr);

            // Retrieve the user data
            IntPtr stringPtr;
            instance.getUserData(out stringPtr);

            // Get the string object
            GCHandle stringHandle = GCHandle.FromIntPtr(stringPtr);
            String key = stringHandle.Target as String;

            TempInstPtr = instancePtr;
            TempParaPtr = parameterPtr;

            switch (type)
            {
                case FMOD.Studio.EVENT_CALLBACK_TYPE.CREATE_PROGRAMMER_SOUND:
                    {
                        var parameter = (FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(parameterPtr, typeof(FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES));
                      
                        // We now need to pass the Sound to the programmer instrument
                        parameter.sound = recSound.handle;
                        Marshal.StructureToPtr(parameter, parameterPtr, false);
                        Debug.Log("Programmer Instrument Created");
                        break;
                    }
                case FMOD.Studio.EVENT_CALLBACK_TYPE.DESTROY_PROGRAMMER_SOUND:
                    {
                        var parameter = (FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(parameterPtr, typeof(FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES));
                        var sound = new FMOD.Sound(parameter.sound);
                        sound.release();
                        Debug.Log("Destroy Programmer Instrument");
                        break;
                    }
                case FMOD.Studio.EVENT_CALLBACK_TYPE.DESTROYED:
                    {
                        // Now the event has been destroyed, unpin the string memory so it can be garbage collected
                        stringHandle.Free();
                        Debug.Log("destroyed");
                        break;
                    }
            }
            return FMOD.RESULT.OK;
        }
    }

    void PlayDialogue()
    {
        //create event instance to play
        var dialogueInstance = FMODUnity.RuntimeManager.CreateInstance(EventName);
  
        // Pin the key string in memory and pass a pointer through the user data
        GCHandle stringHandle2 = GCHandle.Alloc(recSound);
        IntPtr pointer = GCHandle.ToIntPtr(stringHandle2);
        dialogueInstance.setUserData(pointer);
        dialogueInstance.setCallback(dialogueCallback);
        dialogueInstance.start();
        dialogueInstance.release();
    }

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.P))
        {
            PlayDialogue();
        }
    }
}

Hi, panpan

Thank you for sharing the information and code for us to test.

Based on your description and the fact that the latency persists even without the programmer instrument, it’s very likely that the issue lies in the recording setup itself.

I took a look at your script and noticed a few things that might help reduce the latency:

1. Move DialogueEventCallback out of Start() and make it static

In your script the callback is declared as a local function inside Start(). It should instead be a static method of the class, marked with [AOT.MonoPInvokeCallback], so that FMOD can call it from native code whenever the event needs it while the game is running. Because a static method cannot read instance fields such as recSound, the callback gets the Sound from the event instance’s user data instead.

2. Add drift compensation to Update()

Recording and playback do not run at exactly the same rate, so the gap between the record position and the playback position slowly drifts. To keep the latency close to the target, Update() measures how far playback is behind recording and adjusts the playback speed slightly, slowing down when playback is catching up and speeding up when it is falling behind. You can think of this as continuously fine-tuning the timing so that everything stays in sync.
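As a rough sketch of the drift-compensation idea on its own, separate from Unity and FMOD, the snippet below picks a playback rate from a smoothed latency estimate. The class and method names (DriftCompensation, SmoothLatency, ChoosePlaybackRate) are made up for illustration and are not part of the FMOD API. The 0.97/0.03 smoothing weights and the 2% step are assumptions chosen to match the "2%" comments in the script; tune them for your project.

```csharp
using System;

public static class DriftCompensation
{
    // Smooth the measured latency (in samples) so that a single noisy
    // frame does not trigger a rate change.
    public static int SmoothLatency(int previousLatency, int measuredLatency)
    {
        return (int)((0.97f * previousLatency) + (0.03f * measuredLatency));
    }

    // Choose a playback rate in Hz. All latency values are in samples.
    public static int ChoosePlaybackRate(int nativeRate, int smoothedLatency,
                                         int targetLatency, int driftThreshold)
    {
        int step = nativeRate / 50; // 2% of the native rate

        if (smoothedLatency < targetLatency - driftThreshold)
        {
            // Playback is catching up to the record position: slow down.
            return nativeRate - step;
        }
        if (smoothedLatency > targetLatency + driftThreshold)
        {
            // Playback is falling behind the record position: speed up.
            return nativeRate + step;
        }
        // Within the tolerated window: play at the native rate.
        return nativeRate;
    }
}
```

For example, at 48000 Hz with a 50 ms target (2400 samples) and a 1 ms threshold (48 samples), a smoothed latency of 2000 samples gives a playback rate of 47040 Hz, and a smoothed latency of 3000 samples gives 48960 Hz.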

For more information, have a look at our Record example.

I have also modified your script and I will share it below for you as a reference:

using System;
using System.Collections.Generic;
using UnityEngine;
using System.Runtime.InteropServices;
using FMOD.Studio;

public class ProgrammingInstrument : MonoBehaviour
{
    /// <summary>
    /// Records into an FMOD.Sound, then assigns it to a programmer instrument
    /// </summary>

    FMOD.Studio.EVENT_CALLBACK dialogueCallback;
    public FMODUnity.EventReference EventName;

#if UNITY_EDITOR
    void Reset()
    {
        EventName = FMODUnity.EventReference.Find("event:/Record");
    }
#endif

    //Recording variables

    private uint samplesRecorded, samplesPlayed = 0;
    private int nativeRate, nativeChannels = 0;
    private uint recSoundLength = 0;

    //FMOD Sound variables
    private FMOD.CREATESOUNDEXINFO exInfo = new FMOD.CREATESOUNDEXINFO();
    public FMOD.Sound recSound;
    private FMOD.Channel channel;

    //Programmer instrument
    public IntPtr TempInstPtr;
    public IntPtr TempParaPtr;

    //RecordSetting
    private uint LATENCY_MS = 50;
    private uint DRIFT_MS = 1;
    uint lastPlayPos = 0;
    uint lastRecordPos = 0;
    private uint driftThreshold = 0;
    private uint desiredLatency = 0;
    private uint adjustLatency = 0;
    private int actualLatency = 0;

    void Start()
    {
        //recording
        /*
            Determine latency in samples.
        */
        FMODUnity.RuntimeManager.CoreSystem.getRecordDriverInfo(0, out _, 0, out _, out nativeRate, out _, out nativeChannels, out _);
        driftThreshold = (uint)(nativeRate * DRIFT_MS) / 1000;
        desiredLatency = (uint)(nativeRate * LATENCY_MS) / 1000;
        adjustLatency = desiredLatency;
        actualLatency = (int)desiredLatency;

        /*
            Create user sound to record into, then start recording.
        */
        exInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));
        exInfo.numchannels = nativeChannels;
        exInfo.format = FMOD.SOUND_FORMAT.PCM16;
        exInfo.defaultfrequency = nativeRate;
        exInfo.length = (uint)(nativeRate * sizeof(short) * nativeChannels);

        //Create recSound as an FMOD Sound and record into it 
        FMODUnity.RuntimeManager.CoreSystem.createSound("Record", FMOD.MODE.LOOP_NORMAL | FMOD.MODE.OPENUSER, ref exInfo, out recSound);
        FMODUnity.RuntimeManager.CoreSystem.recordStart(0, recSound, true);
        recSound.getLength(out recSoundLength, FMOD.TIMEUNIT.PCM);

        //FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out FMOD.ChannelGroup mCG);
        //FMODUnity.RuntimeManager.CoreSystem.playSound(recSound, mCG, false, out channel);

        /// <summary>
        /// Create switch for programmer instrument
        /// </summary>


        // Explicitly create the delegate object and assign it to a member so it doesn't get freed
        // by the garbage collector while it's being used
        dialogueCallback = new FMOD.Studio.EVENT_CALLBACK(DialogueEventCallback);
        TempParaPtr = System.IntPtr.Zero;
    }

    [AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
    static FMOD.RESULT DialogueEventCallback(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
    {
        FMOD.Studio.EventInstance instance = new FMOD.Studio.EventInstance(instancePtr);

        // Retrieve the user data
        IntPtr stringPtr;
        instance.getUserData(out stringPtr);

        // Get the FMOD.Sound that PlayDialogue() passed through the user data
        GCHandle stringHandle = GCHandle.FromIntPtr(stringPtr);
        FMOD.Sound programmerSound = (FMOD.Sound)stringHandle.Target;

        switch (type)
        {
            case FMOD.Studio.EVENT_CALLBACK_TYPE.CREATE_PROGRAMMER_SOUND:
                {
                    var parameter = (FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES)Marshal.PtrToStructure(parameterPtr, typeof(FMOD.Studio.PROGRAMMER_SOUND_PROPERTIES));

                    // We now need to pass the Sound to the programmer instrument
                    parameter.sound = programmerSound.handle;
                    Marshal.StructureToPtr(parameter, parameterPtr, false);
                    Debug.Log("Programmer Instrument Created");
                    break;
                }
            case FMOD.Studio.EVENT_CALLBACK_TYPE.DESTROY_PROGRAMMER_SOUND:
                {
                    // recSound is shared with the recorder and outlives this event instance,
                    // so it is deliberately not released here.
                    Debug.Log("Destroy Programmer Instrument");
                    break;
                }
            case FMOD.Studio.EVENT_CALLBACK_TYPE.DESTROYED:
                {
                    // Now the event has been destroyed, free the GCHandle so the boxed Sound can be garbage collected
                    stringHandle.Free();
                    Debug.Log("destroyed");
                    break;
                }
        }
        return FMOD.RESULT.OK;
    }

    void PlayDialogue()
    {
        //create event instance to play
        var dialogueInstance = FMODUnity.RuntimeManager.CreateInstance(EventName);

        // Allocate a GCHandle for the Sound and pass a pointer to it through the user data
        GCHandle stringHandle2 = GCHandle.Alloc(recSound);
        IntPtr pointer = GCHandle.ToIntPtr(stringHandle2);
        dialogueInstance.setUserData(pointer);
        dialogueInstance.setCallback(dialogueCallback);
        dialogueInstance.start();
        dialogueInstance.release();
    }

    // Smallest non-zero record delta seen so far (the driver's granularity).
    // Kept as a field so that it persists between frames.
    private uint minRecordDelta = uint.MaxValue;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.P))
        {
            PlayDialogue();
        }

        /*
            Determine how much has been recorded since we last checked
        */
        uint recordPos = 0;
        FMODUnity.RuntimeManager.CoreSystem.getRecordPosition(0, out recordPos);

        uint recordDelta = (recordPos >= lastRecordPos) ? (recordPos - lastRecordPos) : (recordPos + recSoundLength - lastRecordPos);
        lastRecordPos = recordPos;
        samplesRecorded += recordDelta;

        if (recordDelta != 0 && (recordDelta < minRecordDelta))
        {
            minRecordDelta = recordDelta; // Smallest driver granularity seen so far
            adjustLatency = (recordDelta <= desiredLatency) ? desiredLatency : recordDelta; // Adjust our latency if driver granularity is high
        }

        /*
            Delay playback until our desired latency is reached.
        */
        if (!channel.hasHandle() && samplesRecorded >= adjustLatency)
        {
            FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out FMOD.ChannelGroup mCG);
            FMODUnity.RuntimeManager.CoreSystem.playSound(recSound, mCG, false, out channel);
        }

        /*
            Determine how much has been played since we last checked.
        */
        if (channel.hasHandle())
        {
            uint playPos = 0;
            channel.getPosition(out playPos, FMOD.TIMEUNIT.PCM);

            uint playDelta = (playPos >= lastPlayPos) ? (playPos - lastPlayPos) : (playPos + recSoundLength - lastPlayPos);
            lastPlayPos = playPos;
            samplesPlayed += playDelta;

            // Compensate for any drift.
            int latency = (int)(samplesRecorded - samplesPlayed);
            actualLatency = (int)((0.97f * actualLatency) + (0.03f * latency));

            int playbackRate = nativeRate;
            if (actualLatency < (int)(adjustLatency - driftThreshold))
            {
                // Playback position is catching up to the record position, slow playback down by 2%
                playbackRate = nativeRate - (nativeRate / 50);
            }
            else if (actualLatency > (int)(adjustLatency + driftThreshold))
            {
                // Playback is falling behind the record position, speed playback up by 2%
                playbackRate = nativeRate + (nativeRate / 50);
            }

            channel.setFrequency((float)playbackRate);
        }
    }
}

Let me know if the issue persists, and please don’t hesitate to ask if you have any questions.

Hello! Thank you so much for the response. We have found a solution to help with the latency but will for sure test out your response to make it even more stable!

The fix was to change the length of the FMOD Sound we record into, which is set by this line:

exInfo.length = (uint)(nativeRate * sizeof(short) * nativeChannels);

We reckoned that the Sound acts as a circular buffer and that its length was too long (the expression above works out to one second of audio), which delayed the playback and caused the latency. The latency was significantly lower after we shortened the length.

Thank you again for your help! Will report back if any question should arise.

Glad to hear you found a solution and thank you for sharing the solution here!

Yes, exInfo.length is the size, in bytes, of the circular buffer that FMOD records into. By reducing this length, you make the buffer smaller, which in turn reduces the time it takes for the audio to fill the buffer and play back, hence lowering the latency.
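To make the arithmetic concrete, here is a small sketch of how the byte length relates to the duration of a PCM16 buffer. The class and method names (RecordBufferSizing, BufferBytes, BufferMilliseconds) are hypothetical helpers, not FMOD API, and the 100 ms figure in the example is only an illustrative value, not a recommended setting.

```csharp
using System;

public static class RecordBufferSizing
{
    // Bytes needed to hold `milliseconds` of PCM16 audio
    // (sizeof(short) = 2 bytes per sample, per channel).
    public static uint BufferBytes(int sampleRate, int channels, int milliseconds)
    {
        long samplesPerChannel = (long)sampleRate * milliseconds / 1000;
        return (uint)(samplesPerChannel * sizeof(short) * channels);
    }

    // Duration in milliseconds that a PCM16 buffer of `lengthBytes` holds.
    public static double BufferMilliseconds(uint lengthBytes, int sampleRate, int channels)
    {
        return lengthBytes * 1000.0 / (sampleRate * sizeof(short) * channels);
    }
}
```

With a 48000 Hz mono input, the original expression nativeRate * sizeof(short) * nativeChannels is 96000 bytes, which BufferMilliseconds reports as 1000 ms. BufferBytes(48000, 1, 100) gives 9600 bytes for a 100 ms buffer. Whatever length you choose, it presumably still needs to be comfortably longer than the latency you are aiming for, so that the record position does not wrap around onto audio that has not been played yet.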