FMOD Voice Recording access to real-time playable bytes/data (C++)

CreativePS · November 15, 2024, 5:52pm

Hello,

I’ve been using FMOD recently and have already looked at the record.cpp example.

However, to capture the audio data, it advises calling Sound::lock and Sound::unlock from outside the recording, polling the buffer yourself, even though recordStart is non-blocking.

How exactly would I fetch the bytes in a synchronized way, so that there is no popping or artifacts? I’m aware that FMOD is internally using circular buffers as well. Is there no callback I could use? I’ve tried FMOD::System::createStream, pointing recordStart at the resulting sound and using a PCM read callback, which then complains about:

“[ERR] SystemI::recordStart : Invalid sound, must be an FMOD::Sound with positive length created as FMOD_CREATESAMPLE.”, even when I do fulfill everything that FMOD complains about.

Could anyone help? Also, I’m using C++ to do this.

Thank you!

li_fmod · November 20, 2024, 5:26am

Hi,

Thank you for sharing the information.

It seems that lock and unlock are unable to access the audio buffer. You could consider using a custom DSP to capture the audio buffer directly, adding it to your recording channel with ChannelControl::addDSP.

I will share an example script below as a reference:

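FMOD_DSP_DESCRIPTION tap{};
tap.pluginsdkversion = FMOD_PLUGIN_SDK_VERSION;
tap.numinputbuffers = 1;
tap.numoutputbuffers = 1;
tap.numparameters = 0;

tap.process = [](FMOD_DSP_STATE *dsp_state, unsigned int numsamples, const FMOD_DSP_BUFFER_ARRAY *inbufferarray,
                 FMOD_DSP_BUFFER_ARRAY *outbufferarray, FMOD_BOOL inputsidle, FMOD_DSP_PROCESS_OPERATION op) -> FMOD_RESULT {
    if (op == FMOD_DSP_PROCESS_QUERY) {
        // Tell the mixer that the output uses the same format as the input
        if (inbufferarray && outbufferarray) {
            outbufferarray->buffernumchannels[0] = inbufferarray->buffernumchannels[0];
            outbufferarray->speakermode = inbufferarray->speakermode;
        }
        return FMOD_OK;
    }

    // FMOD_DSP_PROCESS_PERFORM: buffers hold interleaved floats, so the value count is frames * channels
    const unsigned int count = numsamples * inbufferarray->buffernumchannels[0];

    if (inputsidle) {
        // If input is idle, optionally clear the output buffer to silence (memset needs <cstring>)
        memset(outbufferarray->buffers[0], 0, count * sizeof(float));
        return FMOD_OK;
    }

    // Process audio samples (this is also where they can be copied out for capture)
    for (unsigned int i = 0; i < count; ++i) {
        outbufferarray->buffers[0][i] = inbufferarray->buffers[0][i] * 0.5f; // Reduce amplitude to prevent clipping
    }

    return FMOD_OK;
};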

    FMOD::DSP *myDSP;
    result = system->createDSP(&tap, &myDSP);
    ERRCHECK(result);

    result = myRecordingChannel->addDSP(0, myDSP);  // Add DSP to the channel
    ERRCHECK(result);
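
Since the DSP process callback runs on FMOD's mixer thread, one way to get the samples out without popping is to copy them into a ring buffer inside the callback and drain it from another thread. The following is only a minimal sketch of that idea: SampleRing is a hypothetical helper, not an FMOD type, and it assumes exactly one writer (the callback) and one reader.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of FMOD): single-producer/single-consumer
// ring buffer of float samples. One thread may call push, another may call pop.
class SampleRing {
public:
    explicit SampleRing(std::size_t capacity) : buf_(capacity + 1) {
        assert(capacity > 0);  // one slot stays empty to tell "full" from "empty"
    }

    // Producer side (e.g. the DSP callback): copies in as many samples as fit
    // and returns that number. Never blocks; excess samples are dropped.
    std::size_t push(const float *src, std::size_t count) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t room = (tail + buf_.size() - head - 1) % buf_.size();
        const std::size_t n = count < room ? count : room;
        for (std::size_t i = 0; i < n; ++i)
            buf_[(head + i) % buf_.size()] = src[i];
        head_.store((head + n) % buf_.size(), std::memory_order_release);
        return n;
    }

    // Consumer side (e.g. a network thread): copies out up to `count` samples
    // and returns how many were available.
    std::size_t pop(float *dst, std::size_t count) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t avail = (head + buf_.size() - tail) % buf_.size();
        const std::size_t n = count < avail ? count : avail;
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = buf_[(tail + i) % buf_.size()];
        tail_.store((tail + n) % buf_.size(), std::memory_order_release);
        return n;
    }

private:
    std::vector<float> buf_;
    std::atomic<std::size_t> head_{0};  // next write index
    std::atomic<std::size_t> tail_{0};  // next read index
};
```

Inside the process callback, the call would look something like ring.push(inbufferarray->buffers[0], numsamples * channels), with the ring reachable through a global or the DSP's user data; that wiring is left out here.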

Unfortunately, FMOD does not support using a PCM read callback for recording, since recordStart requires a sound object created with FMOD_CREATESAMPLE. That is incompatible with createStream, which is meant for streaming playback.

Hope this helps, let me know if you have any questions.

CreativePS · November 20, 2024, 11:49am

Hi, what I essentially want is to capture a real-time buffer that I can pass to sendto() on a simple UDP socket, fire-and-forget, so that the audio (my voice) can be replayed in real time on the other side. That would allow simple voice chat functionality with FMOD.

CreativePS · November 20, 2024, 11:50am

Oh, and thank you for your response sir.

li_fmod · November 22, 2024, 12:09am

Unfortunately, FMOD does not provide built-in networking support, so you’ll need to handle the transmission yourself.

You could consider packaging the audio samples from inbufferarray->buffers[0] into a custom structure like this:

    struct Packet
    {
        char data[32];
        int datalen;
    };

Then, use your own networking API (e.g., UDP sockets) to send the data over the network. On the receiver end, unpack the data and play it back using FMOD.
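
As a rough illustration of the packing and unpacking step, the sketch below converts float samples to 16-bit PCM and serializes them behind a sequence number. The wire layout, VoicePacket, packVoice and unpackVoice are all assumptions made for this example and do not come from FMOD or from the thread; the sequence number is there so the receiver can detect lost or reordered datagrams.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical wire format (an assumption, not an FMOD type):
//   [uint32 sequence][uint16 sample count][int16 samples...]
// Fields are copied in host byte order for brevity; a real protocol
// would fix the byte order.
struct VoicePacket {
    std::uint32_t sequence = 0;
    std::vector<std::int16_t> samples;
};

const std::size_t kHeaderSize = sizeof(std::uint32_t) + sizeof(std::uint16_t);

// Converts a float sample in [-1, 1] to 16-bit PCM, clamping out-of-range input.
inline std::int16_t toPcm16(float s) {
    const float clamped = std::max(-1.0f, std::min(1.0f, s));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Builds the datagram payload that would be handed to sendto().
std::vector<unsigned char> packVoice(std::uint32_t sequence, const float *samples, std::size_t count) {
    assert(count <= 0xFFFF);  // the count field is 16 bits wide
    const std::uint16_t n = static_cast<std::uint16_t>(count);
    std::vector<unsigned char> out(kHeaderSize + count * sizeof(std::int16_t));
    std::memcpy(out.data(), &sequence, sizeof(sequence));
    std::memcpy(out.data() + sizeof(sequence), &n, sizeof(n));
    for (std::size_t i = 0; i < count; ++i) {
        const std::int16_t pcm = toPcm16(samples[i]);
        std::memcpy(out.data() + kHeaderSize + i * sizeof(pcm), &pcm, sizeof(pcm));
    }
    return out;
}

// Parses a received payload; returns false if it is truncated or malformed.
bool unpackVoice(const unsigned char *data, std::size_t size, VoicePacket &packet) {
    if (size < kHeaderSize) return false;
    std::uint16_t n = 0;
    std::memcpy(&packet.sequence, data, sizeof(packet.sequence));
    std::memcpy(&n, data + sizeof(packet.sequence), sizeof(n));
    if (size != kHeaderSize + n * sizeof(std::int16_t)) return false;
    packet.samples.resize(n);
    if (n > 0) std::memcpy(packet.samples.data(), data + kHeaderSize, n * sizeof(std::int16_t));
    return true;
}
```

On the receiving side, the unpacked samples still have to be fed to FMOD for playback, which this sketch does not cover.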

If you are looking for more information, there was a discussion related to this topic that is worth reading through:

FmodUserStd · December 22, 2024, 11:13pm

Could you give a concrete example of how to unpack the data and play it back using FMOD? Do I have to do it through the DSP again? Otherwise, I have everything else already working and good to go.

li_fmod · December 29, 2024, 10:11pm

Hi, sorry for the late response.

I’m not fully familiar with the specific details of your implementation, so I can provide a high-level explanation based on general FMOD practices.

To play back real-time audio data captured via FMOD, you would typically handle it through a custom DSP. Here’s a high-level example of how you might set this up:

1. Create a custom DSP: Define your DSP to capture incoming audio data. Use the FMOD_DSP_DESCRIPTION structure to describe your DSP and define a callback for processing audio data (FMOD_DSP_READ_CALLBACK).
2. Register and add the DSP to your audio system: After defining your DSP, register it with System::registerDSP and then add it to your desired audio channel using Channel::addDSP.
3. Data handling in the callback: In your DSP read callback, use the lock and unlock functions to access the captured audio buffer. Here you process or modify the audio data as needed.
4. Playback: To play back the audio data, you can directly write it to an output channel within the DSP process callback.

CreativePS · December 30, 2024, 7:47pm

Hi,

Does that mean I have to use the playSound callback? What I am doing right now is using recordStart to record into an FMOD sound object, then playing that sound and attaching the DSP to the resulting channel. Then, to avoid the user hearing himself speak, I mute the output in the callback by clearing the output buffers.

It seems wrong. I don’t want to rely on the output device at all for this; it should work even if the user doesn’t have an output device. How could I do it properly?

I also want to use Opus for encoding so I can send the samples through a server and whatnot, but first and foremost I want to solve exactly what I have stated above…

Thank you, much respect for all help!

li_fmod · January 5, 2025, 11:47pm

Hi,

Thank you for the detailed explanation.

Could you clarify what you mean by ‘playSound callback’? Are you referring to a specific FMOD callback, such as FMOD_STUDIO_EVENT_CALLBACK_SOUND_PLAYED or something else?

Your implementation sounds reasonable to me, as it works independently of the audio output device. You can verify this by setting FMOD’s output mode to FMOD_OUTPUTTYPE_NOSOUND.

Could you please elaborate on how you are using the output device directly? It seems your current approach relies on the FMOD mixer, which is independent of the output device and should function regardless of the selected output type, such as FMOD_OUTPUTTYPE_NOSOUND (no sound) or FMOD_OUTPUTTYPE_WAVWRITER (output to file).

Please note that you’ll need to handle Opus encoding yourself, as FMOD provides access only to the raw PCM buffer.

CreativePS · February 3, 2025, 11:55pm

Hi, so I am now able to capture and playback PCM audio samples.

However, I want to ask: how would I deal with the jitter buffer? I have been trying to use an external thread for the jitter buffer, making sure that once it’s filled with at least 3 samples, it starts popping them off at a rate of exactly one every 20 ms.
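
For reference, the jitter buffer described here (prefill with 3 frames, then release one per 20 ms tick) could be sketched roughly as below. JitterBuffer is a hypothetical class, not an FMOD API. It assumes each packet carries a sequence number, ignores sequence wrap-around, is not thread-safe on its own, and would need one instance per remote speaker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical jitter buffer for ONE sender. Frames are keyed by sequence
// number, playout starts once `prefill` frames are queued, and exactly one
// frame slot is consumed per call to pop().
class JitterBuffer {
public:
    JitterBuffer(std::size_t prefill, std::size_t frameSamples)
        : prefill_(prefill), frameSamples_(frameSamples) {
        assert(prefill > 0 && frameSamples > 0);
    }

    // Called when a packet arrives. Frames older than the playout position
    // are dropped; a duplicate sequence number keeps the first copy.
    void push(std::uint32_t sequence, std::vector<std::int16_t> frame) {
        if (started_ && sequence < next_) return;  // too late to play
        frames_.emplace(sequence, std::move(frame));
    }

    // Called once per frame period (e.g. every 20 ms). Returns the next frame
    // in sequence order, or a frame of silence while prefilling or when the
    // expected frame is missing.
    std::vector<std::int16_t> pop() {
        if (!started_) {
            if (frames_.size() < prefill_) return silence();
            started_ = true;
            next_ = frames_.begin()->first;  // lowest sequence seen so far
        }
        auto it = frames_.find(next_);
        ++next_;  // advance even on loss so playout keeps pace
        if (it == frames_.end()) return silence();
        std::vector<std::int16_t> frame = std::move(it->second);
        frames_.erase(it);
        return frame;
    }

private:
    std::vector<std::int16_t> silence() const {
        return std::vector<std::int16_t>(frameSamples_);  // zero-filled
    }

    std::size_t prefill_;
    std::size_t frameSamples_;
    bool started_ = false;
    std::uint32_t next_ = 0;  // sequence number expected by the next pop()
    std::map<std::uint32_t, std::vector<std::int16_t>> frames_;
};
```

This sketch never re-enters the prefill state after an underrun and does nothing to conceal lost frames; both are simplifications.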

However, I also have a problem when multiple users join the conversation…
The sounds start to become robotic, and I am playing back every single sample with FMOD. Am I supposed to use streams or something?

What could be going on?

Connor_FMOD · February 10, 2025, 6:23am

Hi,

I would suggest looking over our video playback example: Unity Integration | Scripting Examples Video Playback. It is in C#, but the idea of reducing and increasing the pitch to account for delay may help.
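
The pitch idea mentioned above can be illustrated with a small helper that nudges the playback rate depending on how much audio is currently buffered. This is only a sketch under assumptions: driftCorrectedRate is a hypothetical function, and the 2% step and the thresholds are illustrative values, not ones taken from the linked example.

```cpp
#include <cassert>

// Hypothetical helper: chooses a playback frequency from the amount of
// buffered audio. Playback slows slightly when the buffer is running dry
// and speeds up slightly when too much audio has piled up.
inline float driftCorrectedRate(float nativeRate, int bufferedMs, int targetMs, int thresholdMs) {
    assert(nativeRate > 0.0f && thresholdMs >= 0);
    if (bufferedMs < targetMs - thresholdMs) return nativeRate * 0.98f;  // running dry: slow down
    if (bufferedMs > targetMs + thresholdMs) return nativeRate * 1.02f;  // falling behind: speed up
    return nativeRate;                                                   // within tolerance
}
```

The returned value could then be applied with Channel::setFrequency on the playback channel, so that small timing differences are absorbed gradually instead of by dropping or repeating frames.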

Would it be possible to get a recording of when the audio becomes robotic? Could it possibly be linked to the networking of the conversation?

Would it be possible to get access to the code?
