FMOD Voice Recording access to real-time playable bytes/data (C++)

Hello,

I’ve been using FMOD recently and have already looked at the record.cpp example.

However, in order to capture the audio data, it advises using FMOD::Sound::lock and FMOD::Sound::unlock in a sort of external way, even though recordStart is non-blocking.

How exactly would I fetch the bytes in a synchronized way, so that there is no popping or artifacts? I’m aware that FMOD is internally using circular buffers as well. Is there no callback I could use? I’ve tried FMOD::System::createStream, pointing recordStart at the resulting sound and using pcmreadcallback, which then complains about:

“[ERR] SystemI::recordStart : Invalid sound, must be an FMOD::Sound with positive length created as FMOD_CREATESAMPLE.”, even when I do fulfill everything that FMOD complains about.

Could anyone help? Also, I’m using C++ to do this.

Thank you!

Hi,

Thank you for sharing the information.

It seems that lock and unlock are unable to access the audio buffer. You could consider using a custom DSP to capture the audio buffer directly, and add it to your recording channel with ChannelControl::addDSP.

I will share an example script below as a reference:

    // Requires <cstring> for memset
    FMOD_DSP_DESCRIPTION tap{};
    tap.pluginsdkversion = FMOD_PLUGIN_SDK_VERSION;
    tap.numinputbuffers = 1;
    tap.numoutputbuffers = 1;
    tap.numparameters = 0;

    tap.process = [](FMOD_DSP_STATE *dsp_state, unsigned int numsamples, const FMOD_DSP_BUFFER_ARRAY *inbufferarray,
                     FMOD_DSP_BUFFER_ARRAY *outbufferarray, FMOD_BOOL inputsidle, FMOD_DSP_PROCESS_OPERATION op) -> FMOD_RESULT {
        if (op == FMOD_DSP_PROCESS_PERFORM) {
            // Buffers are interleaved, so the number of floats is samples * channels
            // (this assumes the output format matches the input format)
            const unsigned int count = numsamples * inbufferarray->buffernumchannels[0];

            if (inputsidle) {
                // If input is idle, optionally clear the output buffer to silence
                memset(outbufferarray->buffers[0], 0, count * sizeof(float));
                return FMOD_OK;
            }

            // Process audio samples; this is also the place to copy the input elsewhere
            for (unsigned int i = 0; i < count; ++i) {
                outbufferarray->buffers[0][i] = inbufferarray->buffers[0][i] * 0.5f; // Reduce amplitude to prevent clipping
            }
        }

        return FMOD_OK;
    };

    FMOD::DSP *myDSP = nullptr;
    result = system->createDSP(&tap, &myDSP);
    ERRCHECK(result);

    result = myRecordingChannel->addDSP(0, myDSP);  // Add DSP to the head of the channel's DSP chain
    ERRCHECK(result);

Unfortunately, FMOD does not support using pcmreadcallback for recording, since recordStart requires a sound object created with FMOD_CREATESAMPLE, which is incompatible with createStream, which is meant for streaming playback.
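If you would rather stay with the polling approach from record.cpp, the usual way to avoid pops is to keep your own read cursor and consume exactly the samples recorded since the previous poll, including the case where the record cursor has wrapped. The sketch below shows only that arithmetic. It assumes the record cursor comes from System::getRecordPosition and the buffer length (in PCM samples) from Sound::getLength; the names RecordSpan, pendingSpan and advance are hypothetical helpers, not FMOD API.

```cpp
#include <cstdint>

// Hypothetical helper: the region recorded since the last poll, in samples.
struct RecordSpan {
    std::uint32_t offset;  // first unread sample in the circular record buffer
    std::uint32_t length;  // number of unread samples (may run past the end and wrap)
};

// lastPos:   where the previous poll stopped reading
// recordPos: current record cursor (assumed to come from System::getRecordPosition)
// soundLen:  length of the record sound in samples (must be > 0)
inline RecordSpan pendingSpan(std::uint32_t lastPos, std::uint32_t recordPos, std::uint32_t soundLen)
{
    const std::uint32_t length = (recordPos >= lastPos)
        ? recordPos - lastPos
        : recordPos + soundLen - lastPos;  // record cursor wrapped around
    return RecordSpan{lastPos, length};
}

// Move the read cursor forward after consuming `consumed` samples.
inline std::uint32_t advance(std::uint32_t lastPos, std::uint32_t consumed, std::uint32_t soundLen)
{
    return (lastPos + consumed) % soundLen;
}
```

The offset and length are in samples, so they need to be multiplied by the size of one sample frame in bytes before being passed to Sound::lock. Sound::lock hands back two pointer/length pairs, so a span that wraps past the end of the buffer can still be read with a single lock/unlock pair.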

Hope this helps, let me know if you have any questions.

Hi, what I essentially want is to capture a real-time buffer that I can pass to sendto() on a simple UDP socket, fire-and-forget, so that the audio (my voice) can be replayed in real time on the other side. That would allow simple voice chat functionality with FMOD.

Oh, and thank you for your response sir.

Unfortunately, FMOD does not provide built-in networking support, so you’ll need to handle the transmission yourself.

You could consider packaging the audio samples from inbufferarray->buffers[0] into a custom structure like this:

    struct Packet
    {
        char data[32];
        int datalen;
    };

Then, use your own networking API (e.g., UDP sockets) to send the data over the network. On the receiver end, unpack the data and play it back using FMOD.
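As a rough illustration of the packaging step, here is a minimal sketch that converts float samples to 16-bit PCM and serializes them with a sequence number into a byte buffer suitable for sendto(). The layout, the 960-sample frame size (20 ms of mono audio at 48 kHz) and the names toPcm16 and buildPacket are all assumptions made for the example; a real protocol would pick its own format.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kFrameSamples = 960;  // assumption: 20 ms of mono audio at 48 kHz

// Convert one float sample to signed 16-bit PCM, clamping input outside [-1, 1].
inline std::int16_t toPcm16(float s)
{
    const float clamped = std::max(-1.0f, std::min(1.0f, s));
    return static_cast<std::int16_t>(clamped * 32767.0f);
}

// Serialize into an explicit little-endian layout so both ends agree regardless
// of CPU byte order: [sequence:4 bytes][count:2 bytes][samples:2 bytes each].
inline std::vector<std::uint8_t> buildPacket(std::uint32_t seq, const float* samples, std::size_t count)
{
    count = std::min(count, kFrameSamples);  // never exceed one frame per packet
    std::vector<std::uint8_t> out;
    out.reserve(6 + count * 2);
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(seq >> shift));
    out.push_back(static_cast<std::uint8_t>(count & 0xFF));
    out.push_back(static_cast<std::uint8_t>(count >> 8));
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint16_t v = static_cast<std::uint16_t>(toPcm16(samples[i]));
        out.push_back(static_cast<std::uint8_t>(v & 0xFF));
        out.push_back(static_cast<std::uint8_t>(v >> 8));
    }
    return out;
}
```

The sequence number lets the receiver detect lost or reordered UDP packets. Note that the DSP callback runs on FMOD's mixer thread, so it is safer to hand samples to a separate network thread than to call sendto() from inside the callback.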

If you are looking for more information, there was a discussion related to this topic that is worth reading through:

Could you exactly make an example of how to unpack the data and play it back using FMOD? Do I have to do it through the DSP again? Otherwise, I have everything else already working and good to go.

Hi, sorry for the late response.

I’m not fully familiar with the specific details of your implementation, so I can provide a high-level explanation based on general FMOD practices.

To play back real-time audio data captured via FMOD, you would typically handle it through a custom DSP. Here’s a high-level example of how you might set this up:

  1. Create a custom DSP: Define your DSP to capture incoming audio data. Use the FMOD_DSP_DESCRIPTION structure to describe your DSP and define a callback for processing audio data (FMOD_DSP_READ_CALLBACK).
  2. Register and add the DSP to your audio system: After defining your DSP, register it with System::registerDSP and then add it to your desired audio channel using Channel::addDSP.
  3. Data Handling in Callback: In your DSP callback, read the captured audio from the input buffer that FMOD passes in (Sound::lock and Sound::unlock are only needed when reading the recording sound directly). Here you process or modify the audio data as needed.
  4. Playback: To play back the audio data, you can directly write it to an output channel within the DSP process callback.
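For the receiving side, one possible setup (an assumption here, not something confirmed in this thread) is a user-created stream: System::createSound with FMOD_OPENUSER | FMOD_CREATESTREAM | FMOD_LOOP_NORMAL and a pcmreadcallback set in FMOD_CREATESOUNDEXINFO, where the callback pulls samples from a queue that the network thread fills. The FMOD-independent part of that is a single-producer/single-consumer ring buffer, sketched below. SampleRing is a hypothetical name, and the sketch assumes 16-bit mono PCM.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-producer / single-consumer ring of 16-bit samples.
// Producer: the network thread (push). Consumer: the audio callback (pop).
class SampleRing {
public:
    explicit SampleRing(std::size_t capacity) : buf_(capacity + 1), head_(0), tail_(0) {}

    // Stores as many samples as fit; returns how many were stored.
    std::size_t push(const std::int16_t* src, std::size_t n)
    {
        std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        std::size_t written = 0;
        while (written < n) {
            const std::size_t next = (head + 1) % buf_.size();
            if (next == tail) break;  // ring is full, drop the remainder
            buf_[head] = src[written++];
            head = next;
        }
        head_.store(head, std::memory_order_release);
        return written;
    }

    // Always fills n samples, padding with silence on underrun.
    // Returns how many samples actually came from the ring.
    std::size_t pop(std::int16_t* dst, std::size_t n)
    {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        std::size_t read = 0;
        while (read < n && tail != head) {
            dst[read++] = buf_[tail];
            tail = (tail + 1) % buf_.size();
        }
        tail_.store(tail, std::memory_order_release);
        for (std::size_t i = read; i < n; ++i) dst[i] = 0;  // silence on underrun
        return read;
    }

private:
    std::vector<std::int16_t> buf_;
    std::atomic<std::size_t> head_;  // next write index
    std::atomic<std::size_t> tail_;  // next read index
};
```

Inside a pcmreadcallback the requested length is given in bytes, so for 16-bit mono audio the callback would call pop with datalen / sizeof(std::int16_t) samples. Padding with silence keeps playback continuous when packets arrive late, at the cost of a short gap.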

Hi,

does that mean I have to use the playSound callback? Because what I am doing right now is using recordStart to record into an FMOD sound object, then playing that sound and obtaining the channel so I can attach the DSP to it. Then, in order to avoid the user hearing himself speak, I mute the output in the callback by clearing the out buffers.

It seems wrong. I don’t want to rely on the output device at all for this, this should work even if the user doesn’t have an output device. How could I do it properly?

I also want to use Opus for encoding so I can send the samples through a server and whatnot, but first and foremost I want to solve exactly what I have stated above…

Thank you, much respect for all help!

Hi,

Thank you for the detailed explanation.

Could you clarify what you mean by ‘playSound callback’? Are you referring to a specific FMOD callback, such as FMOD_STUDIO_EVENT_CALLBACK_SOUND_PLAYED or something else?

Your implementation sounds reasonable to me, as it works independently of the audio output device. You can verify this by setting FMOD’s output mode to FMOD_OUTPUTTYPE_NOSOUND.

Could you please elaborate on how you are using the output device directly? It seems your current approach relies on the FMOD mixer, which is independent of the output device and should function regardless of the selected output type, such as FMOD_OUTPUTTYPE_NOSOUND (no sound) or FMOD_OUTPUTTYPE_WAVWRITER (output to file).

Please note that you’ll need to handle Opus encoding yourself, as FMOD provides access only to the raw PCM buffer.
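One practical detail when adding Opus: the encoder expects fixed-size frames (for example 20 ms, which is 960 samples per channel at 48 kHz), while the DSP callback delivers whatever block size the mixer uses. A small accumulator can bridge the two. The sketch below is a minimal version, assuming mono float samples; FrameAccumulator is a hypothetical name, and the handler is where a call such as libopus's opus_encode_float would go.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Collects arbitrarily sized blocks and emits fixed-size frames.
class FrameAccumulator {
public:
    using FrameHandler = std::function<void(const float*, std::size_t)>;

    // frameSamples must be > 0 (e.g. 960 for 20 ms of mono audio at 48 kHz).
    FrameAccumulator(std::size_t frameSamples, FrameHandler onFrame)
        : frameSamples_(frameSamples), onFrame_(std::move(onFrame))
    {
        pending_.reserve(frameSamples);
    }

    // Append a block; the handler fires once for every completed frame.
    void write(const float* samples, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i) {
            pending_.push_back(samples[i]);
            if (pending_.size() == frameSamples_) {
                onFrame_(pending_.data(), pending_.size());
                pending_.clear();  // keeps capacity, so no reallocation per frame
            }
        }
    }

    // Samples waiting for the next frame to fill up.
    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t frameSamples_;
    FrameHandler onFrame_;
    std::vector<float> pending_;
};
```

Encoding is comparatively expensive, so it is probably better to run the accumulator and encoder on the network thread, fed from a queue, rather than directly inside the DSP callback.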