Hello!
I am writing to report a problem I ran into when implementing voice recording on iOS via a capture DSP. Here are the details of the issue and how I worked around it, in case someone else runs into the same problem.
FMOD Version: 2.03.13
Platform: iPhone SE2, iOS 26.5 (Deployed via Unity IL2CPP)
API Used: FMOD Core API (Bypassing FMOD Studio and Unity RuntimeManager)
Description
When recording on iOS, the first recording session delivers acceptable audio data. However, if the FMOD::System object remains alive between sessions, any subsequent call to System::recordStart results in digital distortion and an additive echo loop that is baked directly into the raw captured PCM bytes. This issue does not occur on Android, even though the high-level code is identical: the same Unity project is built for both iOS and Android.
Context & Implementation Detail
The recording pipeline captures raw microphone data dynamically via a DSP_READ_CALLBACK mapped to a custom DSP on a recording ChannelGroup. The raw PCM float data is pulled directly from this stream.
The corruption and echo are present in the raw bytes immediately upon exiting the DSP callback, before any external encoders or processing layers touch the data.
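For readers unfamiliar with this kind of capture, here is a minimal sketch of what the body of such a read callback might look like. It is not the code from my project. To keep it compilable without the FMOD headers, the function mirrors the parameter list of FMOD's DSP read callback but replaces the FMOD_DSP_STATE pointer with a direct reference to a hypothetical CaptureBuffer, and returns 0 in place of FMOD_OK. Silencing the output is also an assumption (it stops the microphone from being monitored on the speaker).

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical holder for the captured samples (not an FMOD type).
struct CaptureBuffer {
    std::vector<float> samples;  // interleaved float PCM, appended block by block
    int channels = 0;            // channel count seen in the most recent block
};

// Mirrors the parameter list of FMOD's DSP read callback, with the
// FMOD_DSP_STATE pointer replaced by a reference to the capture buffer.
// Returns 0, standing in for FMOD_OK.
int captureRead(CaptureBuffer& capture,
                const float* inbuffer, float* outbuffer,
                unsigned int length, int inchannels, int* outchannels)
{
    // length counts sample frames, so the block holds length * inchannels floats.
    const std::size_t count =
        static_cast<std::size_t>(length) * static_cast<std::size_t>(inchannels);

    // Copy the block exactly as delivered; nothing downstream has touched it yet.
    capture.samples.insert(capture.samples.end(), inbuffer, inbuffer + count);
    capture.channels = inchannels;

    // Assumption: the output is silenced so the microphone is not monitored.
    std::memset(outbuffer, 0, count * sizeof(float));
    *outchannels = inchannels;
    return 0;
}
```

Appending to a std::vector can allocate on the mixer thread, so in a real callback it would be safer to reserve capacity up front or write into a preallocated ring buffer.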
Only the Core API is used in this pipeline, nothing else. Here is what was added to the UnityAppController.mm’s startUnity method:
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord withOptions:AVAudioSessionCategoryOptionDefaultToSpeaker error:nil];
[[AVAudioSession sharedInstance] setActive:YES error:nil];
Removing the DefaultToSpeaker option did not resolve the issue.
Steps to Reproduce
- Initialize the FMOD::System Core context on an iOS device.
- Call System::recordStart using a designated recording driver.
- Read raw PCM floats via the custom DSP callback. Result on Run 1: Audio bytes are clean.
- Stop the recording session using System::recordStop.
- Start a second recording session using System::recordStart under the same active FMOD::System context.
- Read the raw PCM floats again. Result on Run 2+: The raw bytes are immediately corrupted with distortion and a compounding echo loop.
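To put rough numbers on "clean" versus "corrupted" when comparing Run 1 against Run 2+, a small helper like the following can be run over the captured floats. This is only a sketch: the PcmStats and measurePcm names are hypothetical, and the helper only reports level and clipping. It does not detect the echo itself, so it complements listening to the capture rather than replacing it.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of one captured run.
struct PcmStats {
    float peak = 0.0f;             // largest absolute sample value
    double clippedFraction = 0.0;  // share of samples with |x| >= 1.0
};

// Scans float PCM (nominal range -1.0 to 1.0) and reports peak and clipping.
PcmStats measurePcm(const std::vector<float>& samples)
{
    PcmStats stats;
    std::size_t clipped = 0;
    for (float s : samples) {
        const float magnitude = std::fabs(s);
        if (magnitude > stats.peak) stats.peak = magnitude;
        if (magnitude >= 1.0f) ++clipped;
    }
    if (!samples.empty()) {
        stats.clippedFraction =
            static_cast<double>(clipped) / static_cast<double>(samples.size());
    }
    return stats;
}
```

Logging these values once per run makes it easier to show in a report that the first session and the later sessions differ before any encoder is involved.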
Attempted Fixes That Failed to Resolve the Issue
- Using a large, non-looping (loop: false) user sound buffer instead of a looping buffer.
- Switching the target recording device index from Driver 0 (Core Audio input) to Driver 1 (Core Audio input (Voice)).
- Calling System::mixerSuspend() upon stopping and System::mixerResume() before restarting. This actually caused the distortion to appear on the very first run, which suggests that some residual state is not being cleaned up correctly.
Workaround
The only way to get clean raw bytes on subsequent recordings is to completely destroy and rebuild the core system context between recording sessions:
- Call System::release() on the active FMOD::System instance.
- Instantiate a brand-new system object via FMOD::System_Create().
- Re-initialize the Core API context from scratch before calling System::recordStart again.
To control the creation and release of the FMOD::System object within Unity, the RuntimeManager was turned off and its instantiation was disabled by editing the source code of the FMOD Unity integration.