Micro-stuttering when seeking with FFmpeg

Hi, I’m new to FMOD, so this question may be dumb.

The sound is created with mode FMOD_OPENUSER | FMOD_LOOP_NORMAL | FMOD_CREATESTREAM and uses a callback to fill data, just like the dranger tutorial but with the SDL parts replaced by their FMOD counterparts. It works fine, but when the video position changes, the sound repeats the previous buffer data until new data is decoded, which causes stuttering.
If I make the decode buffer bigger, the stuttering is gone, but this breaks the sync, because the sync timestamp is updated from the audio packets. A workaround is to use the system clock instead of the audio clock, but I want to know the root cause.

I’ll post code snippets if needed.

Thanks in advance.

Hi,

Could you link the tutorial you are referring to?

What version of FMOD are you using?

Thanks for the reply

Tutorial: ffmpeg tutorial
FMod: 2.02

Code Snippets:

Create sound:

    inline void FMI_CreateSound(std::wstring&& soundName, void *userdata
        , std::function<void(FMOD_CREATESOUNDEXINFO& exinfo)>callBack) {
        memset(&exinfo, 0, sizeof(FMOD_CREATESOUNDEXINFO));
        exinfo.cbsize = sizeof(FMOD_CREATESOUNDEXINFO);       /* Required. */

        callBack(exinfo);
        exinfo.userdata = userdata;

        FMOD::Sound* sound;

        result = system->createSound(0, mode, &exinfo, &sound);
        if (result == FMOD_OK) {
            this->soundMap.emplace(soundName, SoundMapItem{sound});
        }
    }

Actual creator:

    this->cFMI.FMI_CreateSound(std::forward<std::wstring>(_itos((long)ppFFMpeg)), ppFFMpeg, [](FMOD_CREATESOUNDEXINFO& exinfo) {
        exinfo.numchannels = TARGET_CHANNEL_NUMBER;
        // match SWR output
        exinfo.defaultfrequency = TARGET_SAMPLE_RATE;

        // if size == SDL_AUDIO_BUFFER_SIZE, audio micro-stutters when seeking
        // if size >> SDL_AUDIO_BUFFER_SIZE, audio lags when syncing to the audio clock
        exinfo.decodebuffersize = SDL_AUDIO_BUFFER_SIZE;      /* In PCM samples. */
        exinfo.length = SDL_AUDIO_BUFFER_SIZE;                /* In bytes. */
        exinfo.format = FMOD_SOUND_FORMAT_PCM16;
        exinfo.pcmreadcallback =
            [](FMOD_SOUND* sound, void* data, unsigned int datalen) {
                auto pSound = (FMOD::Sound*)sound;

                void* lockPtr_1 = nullptr;
                void* lockPtr_2 = nullptr;

                // Sound::lock reports the locked lengths through unsigned int*
                unsigned int prtLength_1 = 0;
                unsigned int prtLength_2 = 0;

                pSound->lock(0, datalen
                    , &lockPtr_1, &lockPtr_2
                    , &prtLength_1, &prtLength_2);

                void* userdata = nullptr;
                pSound->getUserData((void**)&userdata);

                // Fill the buffer FMOD handed us with decoded PCM
                AudioCallback(userdata, (Uint8*)data, datalen);

                pSound->unlock(lockPtr_1, lockPtr_2
                    , prtLength_1, prtLength_2);

                return FMOD_OK;
            };
        exinfo.pcmsetposcallback = nullptr;                   /* User callback for seeking. */
        });
    this->cFMI.FMI_PlaySound(std::forward<std::wstring>(_itos((long)ppFFMpeg)), false);

Callback:

	static void AudioCallback(void* userdata, Uint8* stream, int len) {
		// No mutex needed here as audio is paused when deleting pFFMpeg
		auto CallBackCore = [&](FFMpeg** ppFFMpeg, Setter setter, Mixer mixer) {
			if (ppFFMpeg == nullptr) {
				setter(stream, 0, len);

				return;
			}

			auto pFFMpeg = *ppFFMpeg;

			if (pFFMpeg == nullptr) {
				setter(stream, 0, len);

				return;
			}

			pFFMpeg->audio_fillData(stream, len, setter, mixer);

			return;
		};

		auto ppFFMpeg = (FFMpeg**)userdata;
		CallBackCore(ppFFMpeg, memset, [](void* dst, const void* src, size_t len, int volume) {
			memcpy(dst, src, len);
			});
	};

pFFMpeg->audio_fillData(stream, len, setter, mixer) is almost the same as in the tutorial, but it adds a mutex and abstracts the mixer and memory setter into callbacks so that it is compatible with both SDL and FMOD.

Hi,

Thanks for the code snippets and link.

The root cause of the stuttering is that the ring buffer FMOD uses to output sound is not getting any new data, so FMOD continues to play whatever is still in the buffer. Because the buffer is very small, the repetition sounds very harsh.

While increasing the decode buffer size seems to solve the issue, it only delays the time until you experience a stutter.
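
The behaviour described above can be reproduced with a toy model. This is only a sketch of a generic ring buffer whose reader never stops, not FMOD's actual implementation; `ToyRing` and `playWithStalledProducer` are hypothetical names. Once the producer stops writing, the reader wraps around and replays the stale samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy ring buffer: the reader always advances and wraps around.
// It has no notion of "empty", so it never waits for the producer.
class ToyRing {
public:
    explicit ToyRing(std::size_t size)
        : data_(size == 0 ? 1 : size, 0), readPos_(0), writePos_(0) {}

    // Producer: overwrite the next slot with a fresh sample.
    void write(std::int16_t sample) {
        data_[writePos_] = sample;
        writePos_ = (writePos_ + 1) % data_.size();
    }

    // Consumer (the "mixer"): return whatever is stored at the read cursor.
    std::int16_t read() {
        const std::int16_t s = data_[readPos_];
        readPos_ = (readPos_ + 1) % data_.size();
        return s;
    }

private:
    std::vector<std::int16_t> data_;
    std::size_t readPos_;
    std::size_t writePos_;
};

// Fill the ring once with samples 1..ringSize, then read `total` samples
// while the producer is stalled (as happens while a seek is decoding).
inline std::vector<std::int16_t> playWithStalledProducer(std::size_t ringSize,
                                                         std::size_t total) {
    ToyRing ring(ringSize);
    for (std::size_t i = 0; i < ringSize; ++i) {
        ring.write(static_cast<std::int16_t>(i + 1));
    }
    std::vector<std::int16_t> out;
    for (std::size_t i = 0; i < total; ++i) {
        out.push_back(ring.read());
    }
    return out;
}
```

Calling `playWithStalledProducer(4, 10)` yields the four stored samples repeated: 1, 2, 3, 4, 1, 2, 3, 4, 1, 2. A larger ring lengthens each repetition but does not remove it.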

We recently created a Unity example using FMOD as the audio output which can be found here: Is Unity's built-in audio engine FMOD Studio (FMOD 5)? - #10 by jeff_fmod.

Hope this helps!

Thanks for the reply.
So is there a way to get the current playback position within the buffer? I could use this to estimate the audio clock offset for sync. I tried the following method, but the whole program hangs when stepping into Channel::getPosition.

    inline void FMI_SetPos(std::wstring&& soundName, size_t pos = 0) {
        FMI_SetSound(std::forward<std::wstring>(soundName), [&](IT it)->void {
#ifdef  _DEBUG
            // Channel::getPosition writes the position through an unsigned int*
            unsigned int position = 0;
            result = it->second.channel->getPosition(&position, FMOD_TIMEUNIT_MS);
#endif //  _DEBUG
            result = it->second.channel->setPosition(pos, FMOD_TIMEUNIT_MS);
            });
    }

FMI_SetSound just checks whether the given sound name exists and, if so, passes its iterator to the callback. The iterator points to this struct, which records the sound and channel info.

    struct SoundMapItem    {   
        FMOD::Sound* pSound;
        FMOD::Channel* channel = 0;       
    };    

Hi,

In the Unity example above, bytesRead in the Update function will give you the current position in the buffer. Whether the same applies to you depends on how you are reading through the buffer.

Update:

    private void Update()
    {
        /*
         * Need to wait before playing to provide adequate space between read and write positions
         */
        if (!mChannel.hasHandle() && mTotalSamplesWritten >= mAdjustedLatency)
        {
            FMOD.ChannelGroup mMasterChannelGroup;
            FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out mMasterChannelGroup);
            FMODUnity.RuntimeManager.CoreSystem.playSound(mSound, mMasterChannelGroup, false, out mChannel);
        }

        if (mBuffer.Count > 0 && mChannel.hasHandle())
        {
            uint readPosition;
            mChannel.getPosition(out readPosition, FMOD.TIMEUNIT.PCMBYTES);

            /*
             * Account for wrapping
             */
            uint bytesRead = readPosition - mLastReadPosition; // <- current position
            if (readPosition < mLastReadPosition)
            {
                bytesRead += mExinfo.length;
            }

            if (bytesRead > 0 && mBuffer.Count >= bytesRead)
            {
                /*
                 * Fill previously read data with fresh samples
                 */
                IntPtr ptr1, ptr2;
                uint len1, len2;
                var res = mSound.@lock(mLastReadPosition, bytesRead, out ptr1, out ptr2, out len1, out len2);
                if (res != FMOD.RESULT.OK) Debug.LogError(res);

                // Though exinfo.format is float, data retrieved from Sound::lock is in bytes, therefore we only copy (len1+len2)/sizeof(float) full float values across
                int sampleLen1 = (int)(len1 / sizeof(float));
                int sampleLen2 = (int)(len2 / sizeof(float));
                int samplesRead = sampleLen1 + sampleLen2;
                float[] tmpBuffer = new float[samplesRead];

                mBuffer.CopyTo(0, tmpBuffer, 0, tmpBuffer.Length);
                mBuffer.RemoveRange(0, tmpBuffer.Length);

                if (len1 > 0)
                {
                    Marshal.Copy(tmpBuffer, 0, ptr1, sampleLen1);
                }
                if (len2 > 0)
                {
                    Marshal.Copy(tmpBuffer, sampleLen1, ptr2, sampleLen2);
                }

                res = mSound.unlock(ptr1, ptr2, len1, len2);
                if (res != FMOD.RESULT.OK) Debug.LogError(res);
                mLastReadPosition = readPosition;
                mTotalSamplesRead += (uint)samplesRead;
            }
        }
    }

Is it possible to break-all during the hang and try to find where the code is hanging?
Or is it possible to get a call-stack dump to see if it's hanging in our code?

Hi, sorry for the delay. The hang, and the position always being returned as zero, were due to the buffer size being too small (1024 bytes).
I tried offsetting the timestamp, but it did not work. I found that this is because FMOD tries to read the buffer endlessly, while SDL won’t ask for new data before the current buffer is finished.
So there are two options:

  1. make FMOD pause when playback finishes
  2. read and play audio synchronously, like the Unity example

However, if I remove the FMOD_CREATESTREAM flag, FMOD treats the entire buffer as one whole song and never calls the callback again. Is there a flag or a method to make FMOD retrieve new data only when the buffer ends?

Unfortunately, as I mentioned, FMOD uses a ring buffer, so there isn’t an end to it.

Have you tried this method?

Yes, but this requires a bit more work to rewrite several functions; that’s why I asked.
Thank you for your reply!
