Abnormal sound when using FMOD_OPENUSER

I want to build a proof of concept that uses FMOD as a component rather than as a full engine, with the engine's operation controlled by my own code.
I followed the user_created_sound.cpp sample and used FMOD_OPENUSER to create a user-defined sound as input, but the output always contains a faint “whisper” noise. I captured the PCM output through a DSP and saved it; the resulting waveform is shown in the figure. Spikes appear in the waveform without warning and do not correspond to any frequency content in the signal:

If you zoom in further, there are abnormal spikes at the two sample points just before 14.240 seconds. The frame length I use is 40 ms per frame, and 14.240 seconds is an exact integer multiple of 40 ms. (I wanted to post a second screenshot, but new users can only upload one.)

Inspecting the demo's threads shows two FMOD threads: “FMOD stream thread” and “FMOD mixer thread”.
If I create the stream from an audio file instead of using FMOD_OPENUSER, the same two threads appear, plus a third, “FMOD file thread”. In that case playback sounds normal.

The following is my demo code:

#include <cstring>
#include <fstream>
#include <iostream>

#include <fmod.hpp>
#include <fmod_errors.h>

#ifdef _WIN64
    #ifndef _DEBUG
        #pragma comment(lib, "fmod_vc.lib")
    #else
        #pragma comment(lib, "fmodL_vc.lib")
    #endif // _DEBUG
#endif // _WIN64

static std::ifstream* soundFile = new std::ifstream();

int main()
{
    const int kMaxChannelCount = 32;
    FMOD::System* system = nullptr;
    FMOD::Sound* sound = nullptr;
    FMOD::Channel* channel = nullptr;
    FMOD_RESULT nResult = FMOD_OK;
    nResult = FMOD::System_Create(&system);
    nResult = system->init(kMaxChannelCount, FMOD_INIT_NORMAL, nullptr);

    soundFile->open("LISA-16k.pcm", std::ios::binary);

    FMOD_CREATESOUNDEXINFO tFmodCreateSoundExInfo;
    std::memset(&tFmodCreateSoundExInfo, 0, sizeof(tFmodCreateSoundExInfo));
    tFmodCreateSoundExInfo.cbsize = sizeof(FMOD_CREATESOUNDEXINFO);
    tFmodCreateSoundExInfo.numchannels = 1;
    tFmodCreateSoundExInfo.defaultfrequency = 16000;
    tFmodCreateSoundExInfo.decodebuffersize = 640;
    tFmodCreateSoundExInfo.length = -1;
    tFmodCreateSoundExInfo.format = FMOD_SOUND_FORMAT_PCM16;
    tFmodCreateSoundExInfo.pcmreadcallback = [](FMOD_SOUND* /*pSound*/, void* pData, unsigned int nDataLen) -> FMOD_RESULT {
        // Called by FMOD whenever it needs nDataLen more bytes of PCM data.
        std::cout << "read data " << nDataLen << std::endl;
        soundFile->read(static_cast<char*>(pData), nDataLen);
        // On a short read (end of file), output a block of silence instead.
        if (soundFile->gcount() != static_cast<std::streamsize>(nDataLen)) {
            std::memset(pData, 0, nDataLen);
        }
        return FMOD_OK;
    };
    tFmodCreateSoundExInfo.userdata = nullptr;

    FMOD::ChannelGroup* channelGroup = nullptr;
    FMOD::Sound* pSound = nullptr;

    // Note: User streams created with FMOD_OPENUSER produce abnormal noise.
    //       With FMOD_LOOP_NORMAL set, the result sounds different again.
    system->createStream(nullptr, FMOD_OPENUSER /*| FMOD_LOOP_NORMAL*/, &tFmodCreateSoundExInfo, &pSound);
    // Note: Streaming from a file like this plays back normally.
    //system->createStream("good.mp3", FMOD_LOOP_NORMAL, nullptr, &pSound);

    system->createChannelGroup("test", &channelGroup);
    system->playSound(pSound, channelGroup, true, &channel);

    channel->setPaused(false);
    system->update();

    while (true) {
        bool isPlaying = false;
        channel->isPlaying(&isPlaying);
        if (!isPlaying) {
            std::cout << "play end" << std::endl;
            break;
        }
    }
    std::cout << "play completed" << std::endl;

    return 0;
}
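
One detail of the read callback above: on a short read it zeroes the whole block, so any valid samples read at the very end of the file are discarded. This is unlikely to be related to the noise at 14.240 seconds, but a variant that keeps the valid bytes and zero-fills only the remainder could look like the sketch below. The names fillPcmBlock and shortReadExample are hypothetical helpers for illustration, not part of the FMOD API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads up to `len` bytes from `in` into `data` and
// zero-fills only the part that could not be read. Returns the bytes read.
inline std::size_t fillPcmBlock(std::istream& in, void* data, unsigned int len)
{
    assert(data != nullptr);
    char* out = static_cast<char*>(data);
    in.read(out, static_cast<std::streamsize>(len));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    if (got < len) {
        std::memset(out + got, 0, len - got);  // pad the tail with silence
    }
    return got;
}

// Usage example: a 6-byte source feeding a 4-byte block twice.
inline bool shortReadExample()
{
    std::istringstream src(std::string("abcdef"));
    const unsigned int kBlockSize = 4;
    char block[kBlockSize];
    const std::size_t first = fillPcmBlock(src, block, kBlockSize);   // full block
    const std::size_t second = fillPcmBlock(src, block, kBlockSize);  // short read
    return first == 4 && second == 2 &&
           block[0] == 'e' && block[1] == 'f' &&
           block[2] == 0 && block[3] == 0;
}
```

Inside the pcmreadcallback, the call would take the place of the read and memset pair, for example fillPcmBlock(*soundFile, pData, nDataLen).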

I think this is because your decodebuffersize is too small. Small stream buffer sizes can cause buffer starvation, leading to stuttering. More information on this issue can be found in the Stream | Streaming Issues section of our API documentation.

I think the reason this isn’t occurring with your “good.mp3” file is that the system default buffer size of 400ms is being used, which is much larger than the 640 samples (~14ms) buffer size you are explicitly setting in your exinfo. A buffer size of 19200 should give you the same 400ms buffer size that “good.mp3” has.
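
For reference, the conversion between a buffer size in samples and its duration can be checked with a few lines of arithmetic. The sketch below uses a hypothetical helper, samplesToMs. The figures quoted above (640 samples ≈ 14 ms, 19200 samples = 400 ms) correspond to a 48 kHz sample rate; the demo sets defaultfrequency to 16000, so if decodebuffersize is counted at the sound's own rate (an assumption worth confirming in the API documentation), 640 samples would last 40 ms.

```cpp
#include <cassert>

// Hypothetical helper: duration in milliseconds of `samples` PCM frames
// played at `sampleRateHz`.
inline double samplesToMs(unsigned int samples, unsigned int sampleRateHz)
{
    assert(sampleRateHz > 0);
    return 1000.0 * samples / sampleRateHz;
}

// samplesToMs(640, 48000)   is about 13.3 ms
// samplesToMs(19200, 48000) is 400 ms
// samplesToMs(640, 16000)   is 40 ms
```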

Can you please try setting a larger decodebuffersize and let me know if that eliminates the signal discontinuities?


Thanks for the guidance!
I have tried a larger decodebuffersize, but I want to use this in real-time scenarios. Ideally, the amount of data that has to accumulate before processing should match the block size delivered by the upstream source.
I use FMOD_OPENUSER, FMOD_OUTPUTTYPE_NOSOUND_NRT, and system->update() so that processing can be driven externally.

I can see how a buffer size of 400ms would be too much latency for you. 14ms is definitely too small because it is only slightly larger than our mixer update speed of 10ms, so you are likely to get stream stuttering with a buffer of this size.
Generally a latency of 50ms is sufficient for real-time purposes, so perhaps a buffer size of 1920 would suit your needs better?
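
One way to reconcile a latency target with the wish to keep the buffer aligned to the upstream block size is to round the target up to a whole number of upstream frames. The sketch below is only an illustration under that assumption; bufferSamplesForLatency is a hypothetical helper, and the 640-sample frame comes from the 40 ms at 16 kHz framing described earlier in the thread. Note that 1920 samples is 40 ms at 48 kHz but 120 ms at 16 kHz.

```cpp
#include <cassert>

// Hypothetical helper: smallest multiple of `frameSamples` that covers at
// least `targetMs` milliseconds at `sampleRateHz`.
inline unsigned int bufferSamplesForLatency(unsigned int targetMs,
                                            unsigned int sampleRateHz,
                                            unsigned int frameSamples)
{
    assert(frameSamples > 0);
    // Samples needed for the target latency, rounded up.
    const unsigned long long needed =
        (static_cast<unsigned long long>(targetMs) * sampleRateHz + 999ULL) / 1000ULL;
    // Whole upstream frames needed to cover that many samples (at least one).
    unsigned long long frames = (needed + frameSamples - 1ULL) / frameSamples;
    if (frames == 0) {
        frames = 1;
    }
    return static_cast<unsigned int>(frames * frameSamples);
}

// bufferSamplesForLatency(50, 16000, 640) gives 1280 samples (two 40 ms frames).
```

The result could then be assigned to decodebuffersize in the exinfo, trading one extra upstream frame of latency for a buffer that stays a whole multiple of the incoming block size.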