Android Cracking Audio Noise Issue

Hello FMOD team,

Have you come across any cracking noise issues on Android devices?
If so, could you let us know if this is a widespread problem across the Android ecosystem or if it’s specific to certain models, periods, or manufacturers?
We’ve been facing this challenge for a while now.

We use Samsung devices for testing, including older models like the Galaxy S7 Edge (Android 8.0) and S10 Plus (Android 12).
Initially, our new devices didn’t have this problem, but the noise would come and go with Samsung software updates.
For example, our Galaxy S10 Plus experienced severe noise after an update, which then disappeared with the next update, only to return less severely with another update.
Our Galaxy S23 Ultra also encounters this noise, though it’s very subtle and infrequent, about once an hour.

Despite trying different audio APIs (OpenSL and AAudio), adjusting buffer sizes, and using FMOD_CREATESAMPLE, the issue persists.
Interestingly, we found that the noise doesn’t occur at all when using Bluetooth earphones on any of our devices.
This led us to speculate that Samsung might primarily be testing with Bluetooth earphones, neglecting wired ones.

You might be wondering whether this issue arises from stress testing or overload, but as demonstrated at the 0:35 mark in this YouTube video (https://youtu.be/si-iB-oUMtc?si=vGx7PMkxBMSYPIkF&t=35), even light casual games from other developers experience similar issues on the same devices.
(You might need to use earphones and listen closely to catch the noise in the video.)
The video, uploaded by the creator of ‘Like A Dino!’, demonstrates the same problem we’re facing, indicating they experienced the same issue with their own devices.

We’ve confirmed that ‘Like A Dino!’ was developed using Unity, featuring only one background music track.
Similar issues appear in other Unity games such as Angry Birds 2 and a game developed by another fellow developer.
In all these instances, the cracking noise occurs during simple background music playback without any complex audio effects.

Could you please share your experiences with Android devices?

Thanks a lot!

PS. I’ve never encountered any issues with streaming apps like YouTube and YouTube Music, which makes this problem even more perplexing to me.

I’m not from FMOD, but is there any chance this is a resurfacing of the Wi-Fi issue?

If it’s just Samsung, they could be disabling interrupts as discussed in 1217.
I’d be curious if the same issue occurs in oboe tester as well.

Bluetooth does use larger buffers to accommodate its higher latency, so perhaps that is related. If that’s the case, though, I’d expect the crackles to be reported as xruns and the buffer sizes to grow automatically to accommodate. On that train of thought, I’d also be curious whether the issue happens with FMOD_OUTPUTTYPE_AUDIOTRACK: it’s higher latency, but I’d expect its behaviour to align more closely with the behaviour in YouTube etc.

Thank you for your insight and the links!

You asked if this is the same issue resurfacing, but it’s not. As I mentioned above, the noise occurs even with OpenSL, and your links are about AAudio. In fact, I discussed this same issue here (Pop noise when playing with AAudio) four years ago, and it was resolved with a Samsung software update a few months later, as I noted here (The result of Oboe Tester 2). However, after that, a new cracking noise issue appeared with another Samsung update. With each software update, the noise has either gotten worse or better.

Remembering the AAudio noise issue, I always test with Wi-Fi off or in Airplane mode. I also test with a full battery or with a power cable connected to ensure the device runs at full performance. You might be right about Bluetooth’s larger buffer size, but I think such noise shouldn’t occur on the Galaxy S10, a premium model that supports Pro Audio mode, released long after AAudio’s first release. As mentioned earlier, it didn’t have any issues initially before problematic updates.

For our app, I believe we tested with a buffer size of 1024, but I might be wrong. I’m sure we tested with 512, though, and having to go up to 1024 would not be an acceptable solution in this day and age.
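
To put the 512 versus 1024 comparison in numbers, here is a minimal sketch of the arithmetic. It assumes the sizes are counted in sample frames, as with the bufferlength and numbuffers arguments of System::setDSPBufferSize, and that the mixer runs at 48000 Hz with 4 buffers; none of these values are confirmed in this thread, and mixerLatencyMs is a hypothetical helper name.

```cpp
#include <cassert>

// Approximate worst-case mixer latency in milliseconds.
// frames:     length of one buffer in sample frames (e.g. 512 or 1024)
// numBuffers: how many such buffers are queued (4 is an assumption)
// sampleRate: mixer output rate in Hz (48000 is an assumption)
double mixerLatencyMs(unsigned int frames, unsigned int numBuffers, unsigned int sampleRate)
{
    assert(sampleRate > 0);
    return 1000.0 * static_cast<double>(frames) * static_cast<double>(numBuffers)
           / static_cast<double>(sampleRate);
}
```

Under these assumptions, mixerLatencyMs(512, 4, 48000) is about 42.7 ms and mixerLatencyMs(1024, 4, 48000) is about 85.3 ms, so doubling the buffer length doubles the latency a rhythm game has to hide.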

I want to check this again with Oboe, but I can’t right now due to ongoing projects. Testing the noise issue takes too much time and effort, and we don’t have enough time at the moment. The noise happens so randomly—sometimes very subtly, other times not at all, and then suddenly very severely. We have to test for at least an hour straight, concentrating on even small pops. Also, the same test needs to be performed on different devices for at least a week to see if it’s consistent across all devices.

Recently, I talked to a colleague about this frustrating noise issue and became curious if other audio experts like the FMOD team have had or heard of similar issues with Android devices. Thank you very much for sharing; I really appreciate it.

It would be greatly appreciated if the FMOD team could share their experiences or opinions on this as well.
I’m not asking for a solution; I’m just curious if we’re alone in the Android world.

Thanks a lot!

I would also be curious to see what kind of results you get in the Oboe test app. It doesn’t appear to be correlated to the workload of the application itself, and I expect you’ve ruled out all of the other major CPU users by enabling airplane mode and disabling Wi-Fi.

The YouTube video you linked appears to be a stream issue. Only occurring with longer tracks like background music, and discontinuities in the waveform appear to be jumping to a value of 0, suggesting buffer underrun.
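
For anyone who wants to check a recording for this pattern without relying on their ears, a rough sketch follows. It scans decoded PCM samples for a sudden jump from an audible level to a run of exact zeros. The function name findDropouts and both thresholds are assumptions chosen for illustration, not values taken from FMOD or from the analysis above, and a capture made through a microphone or a lossy codec will rarely contain exact zeros.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the start index of every run of at least minZeroRun exact-zero
// samples that directly follows a sample louder than jumpThreshold.
// Both default thresholds are illustrative assumptions.
std::vector<std::size_t> findDropouts(const std::vector<float>& samples,
                                      float jumpThreshold = 0.05f,
                                      std::size_t minZeroRun = 32)
{
    std::vector<std::size_t> starts;
    std::size_t i = 1;
    while (i < samples.size()) {
        if (samples[i] == 0.0f && std::fabs(samples[i - 1]) > jumpThreshold) {
            // Measure how long the silence lasts.
            std::size_t end = i;
            while (end < samples.size() && samples[end] == 0.0f) {
                ++end;
            }
            if (end - i >= minZeroRun) {
                starts.push_back(i);
            }
            i = end; // continue scanning after the zero run
        } else {
            ++i;
        }
    }
    return starts;
}
```

Dividing each returned index by the sample rate gives the time of the suspected underrun, which can then be compared against the moments where a pop is heard.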

That said, if you are still getting the exact same problem with FMOD_CREATESAMPLE and larger buffer size then there must be more to it in your case. You could try leaving it as a stream and increasing the stream buffer size with System::setStreamBufferSize, just in case you haven’t tried that specific configuration yet.

I agree it shouldn’t, but if the issue seems to come and go with OS updates then it could be an issue with the OS.

Outside of the stream/performance/latency issues already discussed I haven’t had any situations where there was a random inexplicable bout of crackling after hours of running an application. From what you’ve described it sounds like high device workload and streaming issues can be ruled out. The other options are the OS, which sounds like it could be a factor, or the hardware itself. If you can manage to get a screen recording of it reproducing that would rule out hardware. I will chat to the Dev team about setting up a soak test on a Samsung device. If there is a problem with FMOD we should get some pops. If there are no pops, everything would be pointing towards an issue with the OS itself.

Wow, thank you so much for the detailed response!

I revisited all the threads about the old AAudio noise issue that I posted on the FMOD forum here.
I realized the GitHub page linked above was from your team, written after talking with me about the AAudio noise, haha.
After reading them all again, it’s clear that this issue is different from the old one.
As mentioned, the main differences are that it happens even with OpenSL and with Wi-Fi off.

I really want to test this with Oboe again, but I can only do that sometime later as mentioned earlier.

In the past, I used the Galaxy S10 Plus as my main device specifically to catch this issue 24/7, making it easier to spot the random noise at any time.
Currently, I’m using a Galaxy S23 Ultra, and with this device, the noise is very rare and subtle, happening only about once an hour.
It’s easy to miss the exact moment, and I think your team might face the same challenge.
That’s why I said it takes so much energy and time to test it.

Anyway, thank you very much for your interest and insights.

Oh, regarding your comment about the YouTube video:
“The YouTube video you linked appears to be a stream issue. Only occurring with longer tracks like background music, and discontinuities in the waveform appear to be jumping to a value of 0, suggesting buffer underrun.”

I thought so too, but the same thing happened with many other games.
Angry Birds 2 is another example.
Whenever I encountered such an issue, I checked the game engine, and they all were made using Unity, which, as you know, uses the old FMOD engine.
On the other hand, I never faced any issue with other apps like YouTube, YouTube Music, Google Play Music, or the MX Player on any Android devices so far.
I’m not saying it’s Unity or FMOD’s fault, but that was my observation.

In my old thread regarding the Wi-Fi-related noise issue, I mentioned:
“no problem with games so far personally, because Unity rarely supports AAudio yet, thus most Unity games still use OpenSL ES.”

This indicates that Unity games didn’t have a noise issue on the same device in the past.
I’m not sure whether Unity games have since switched their audio API to AAudio.
If they have, do you think a stream issue could be the cause now, even though it wasn’t in the past?

Also, are FMOD_CREATESAMPLE sounds of 5 to 20 seconds considered long tracks?
One of our old music games had the same noise issue, and it loaded user sound files of those lengths into memory as FMOD_CREATESAMPLE.

I just remembered something that makes noise testing even harder. I don’t know if this is specific to Samsung, but their audio output mechanisms seem very strange.

After revisiting the old threads I posted here, I remembered that screen recording stopped the noise, and it immediately returned once the screen recording was finished. Because of this, I had to use a separate camera to record the noise for you at that time.

Another recent discovery is that screen recording increases the AAudio output volume. For reference, Samsung’s AAudio volume level is lower than their OpenSL output, and we discussed this on the forum a long time ago. The issue has never been fixed, even with their latest models, so we gave up on it. Interestingly, we noticed the volume gets as loud as OpenSL immediately after the screen recording starts, and this effect persists until the app is closed.

This is so complicated…

I think this would just be because music streaming apps can get away with copping the latency and massive buffer size a lot more than games can. With music streaming, you don’t have visuals to compare to, and with video streaming, there is no real time feedback so it doesn’t matter how long they buffer the audio and video for.

I think stream stuttering is the most likely cause of the issue for anyone using Unity’s audio system, mostly because it only affects longer audio files and you can’t configure stream settings in any detail in Unity to rectify such issues (though I could be wrong; there may be some equivalent of System::setStreamBufferSize that I am unaware of).

In FMOD Studio, we have a default duration of 10s before assets are automatically imported as streams. Using that as a benchmark, I would say 10-20s duration audio files should probably be played as streams.

Thank you for your insight.

I agree that those streaming apps can have as large a buffer size as they want, which could be the main reason they haven’t experienced such noise issues, rather than some special audio technique they’re using.

As you know, Unity hides many direct controls to simplify things for non-tech-savvy developers, offering instead some simple but obscure control options. That’s why we prefer implementing our own version for low-level access and controls, and this is also the reason we prefer using the FMOD Engine provided here.

Unity offers the DSP Buffer Size setting with options like “Best latency,” “Good latency,” and “Best performance” in the Audio section. I think this setting affects the stream buffer size too, as is typical of Unity.

However, I need clarification on one point. You mentioned,

In FMOD Studio, we have a default duration of 10s before assets are automatically imported as streams. Using that as a benchmark, I would say 10-20s duration audio files should probably be played as streams.

This confused me.
If an audio file exceeds a certain length, will the sound instance be created as FMOD_CREATESTREAM, even if I created the sound with the FMOD_CREATESAMPLE flag? (FMOD Core API)

Thanks again!

One more question!

In your manual on the System::setStreamBufferSize() page, it says:

Stream may still stutter if the codec uses a large amount of CPU time, which impacts the smaller, internal ‘decode’ buffer. The decode buffer size is changeable via FMOD_CREATESOUNDEXINFO.

Does this include scenarios where multiple MP3 sounds, like 50 for instance, are played simultaneously? If so, what’s the recommended decode buffer size? It seems the default size is 400.

If you are creating a sound using the FMOD Core API, FMOD_CREATESAMPLE will load the file into memory, and FMOD_CREATESTREAM will stream the file, regardless of duration.
I just meant that FMOD Studio API uses 10s as the threshold for streaming assets, so 10s would be a consistent threshold for playing streams using the FMOD Core API as well.
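
As a small illustration of that rule of thumb, the sketch below picks a load mode from the file duration. LoadMode and chooseLoadMode are hypothetical names that do not exist in FMOD; in a real project the two results would map to passing FMOD_CREATESAMPLE or FMOD_CREATESTREAM when creating the sound. Treating exactly 10 s as a sample rather than a stream is an assumption.

```cpp
// Hypothetical stand-in for the two Core API load flags discussed above.
enum class LoadMode { Sample, Stream };

// Chooses how to load a sound from its duration.
// thresholdSeconds defaults to the 10 s FMOD Studio import threshold
// mentioned above; the Core API itself applies no such threshold.
LoadMode chooseLoadMode(double durationSeconds, double thresholdSeconds = 10.0)
{
    return durationSeconds > thresholdSeconds ? LoadMode::Stream : LoadMode::Sample;
}
```

Short one-shot sounds such as drum hits would come out as LoadMode::Sample, while a multi-minute music track would come out as LoadMode::Stream. The threshold is only a convention, so it can be raised when memory allows and streaming is suspected of causing stutter.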

I should also mention that most platforms, including Android, can only play fewer than 10 streams simultaneously before you can expect to hear streaming issues, so 50 probably isn’t a good example. In any case, each stream has its own decode buffer, and a large number of playing streams will affect CPU time, which will then impact the decode buffer. The only general statement I can make on determining an appropriate buffer size is “Long enough that you don’t notice stuttering, short enough that you don’t notice latency”.

Thanks, I learned some new things from your response.

But I’m a bit confused by this part of the page you linked:

Because streaming assets are loaded as they play and must be buffered before they can begin playback, an instrument that plays a streaming asset may exhibit latency when triggered. It is therefore best to use non-streaming assets in instruments that require low latency.

Your manual on System::setStreamBufferSize() says:

Does not affect latency of playback. All streams are pre-buffered (unless opened with FMOD_OPENONLY), so they will always start immediately.

So I’ve always confidently increased the stream buffer size, but now I’m a bit uncertain because the page seems to conflict with this. Could you clarify this for me?

Also, regarding the decode buffer size, you mentioned:

Long enough that you don’t notice stuttering, short enough that you don’t notice latency.

However, your manual on FMOD_CREATESOUNDEXINFO doesn’t mention latency at all, while the sections on System::setStreamBufferSize() and System::setDSPBufferSize() do. Adding a clear explanation about the decode buffer size in FMOD_CREATESOUNDEXINFO would help reduce confusion.

Your team may have omitted that part because the correlation between buffers and latency is common knowledge, or because it’s explained elsewhere. I’m bringing this up after reading the explanation on System::setStreamBufferSize(), which might confuse others as well.

Thanks!

It is confusing, but both are technically correct.

In the first case, playing a streaming asset through an FMOD Studio Instrument, the sound will be loaded with the FMOD_NONBLOCKING flag internally, which means it will poll until the stream has finished buffering, preventing immediate playback and introducing latency.

In the second case, playing a stream with the Core API, you call System::createStream, which will start buffering the stream synchronously, blocking the calling thread until buffering completes. This means that when you go to play the sound with System::playSound it can start playing back immediately, thus there is no latency.

This is not explained well at all, so I will create a task to update the docs. Thank you for bringing this to our attention!

Also, a very good point: perhaps we should keep this explanation in a centralized place and have System::setStreamBufferSize(), System::setDSPBufferSize() and FMOD_CREATESOUNDEXINFO link to it, rather than having scattered pieces of information throughout our docs. I have created a separate task to address this issue.

Please let me know if you have any other thoughts or concerns!

Thanks!
The first question is very clear now, and I’m happy to have learned something new that I wouldn’t have known if I hadn’t asked :)

But I still need some clarification on the decodebuffersize of FMOD_CREATESOUNDEXINFO.

I think increasing the decode buffer size can affect latency when using the FMOD Studio API.
However, with the FMOD Core API, as you mentioned, stream sounds are pre-buffered and played immediately.
So, increasing the decode buffer size shouldn’t affect latency and should only help prevent stuttering.

It seems like sticking with higher stream buffer sizes and decode buffer sizes would always be ideal unless we set them to excessively high values that consume too many resources.
Am I right about this?

I had a thought pop up today and wanted to clarify something.

The explanation:

Does not affect latency of playback. All streams are pre-buffered (unless opened with FMOD_OPENONLY), so they will always start immediately,

seems to apply only to the initial playback.

I think both the stream buffer size set via System::setStreamBufferSize() and the decode buffer size set via decodebuffersize might increase latency when playing after seek or stop operations, although they won’t be a problem for the first playback.

Is this correct?

If that’s the case, I think even with the Core API, FMOD_CREATESTREAM sounds can’t be used in certain scenarios, like the example game below:

  1. A user can load their custom mp3 file as their background music.
  2. The user can record their live drum performance while listening to that music, with each user input recorded as timestamps.
  3. After recording, the user can replay their performance and even freely navigate like an audio player.

To play the drum sounds at low latency, they need to be loaded as FMOD_CREATESAMPLE.
But the background music is long, so it will be loaded as FMOD_CREATESTREAM.
Also, as you advised, we may need to increase the stream buffer size for the long music, and maybe even the decode buffer.

Big stream and decode buffer sizes won’t matter at initial playback.
But when the user navigates or plays back their recorded data, the background music might start a bit late due to buffering, causing off-timing with the drum performance.
This kind of music game could become problematic, I think.

When using the Core API, a large stream buffer size and decode buffer size won’t affect the latency of a stream, and if there’s a possibility it is causing stuttering, then yes you are better off making them large.

Sort of: the time taken by successive calls to System::playSound does increase in proportion to the decode buffer size, and in inverse proportion to the stream buffer size. The time taken is not the same as the actual buffer duration, though, since data can load faster than real time. Seeking, on the other hand, flushes the existing buffers and then does a slow seek to the requested point in the file, so it is essentially restarting the stream as in the stopped stream case, and taking additional time to get to the correct position. Here are the results of some testing for your interest:

Time taken for subsequent calls to playSound

Buffer size (ms)   Decode buffer   Stream buffer
400                1 ms            27 ms
800                29 ms           < 1 ms
1600               57 ms           < 1 ms
3200               120 ms          < 1 ms

Time taken to call setPosition(5000, FMOD_TIMEUNIT_MS)

Buffer size (ms)   Decode buffer   Stream buffer
400                30 ms           28 ms
800                60 ms           28 ms
1600               88 ms           < 1 ms
3200               120 ms          < 1 ms
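
For readers who want to collect similar numbers on their own devices, one possible approach is to wrap the blocking call in a wall-clock timer. This is only a sketch of the general idea and not necessarily how the figures above were measured; measureMs is a hypothetical helper name.

```cpp
#include <chrono>

// Runs the given callable once and returns the elapsed wall-clock time
// in milliseconds, using a monotonic clock.
template <typename Fn>
double measureMs(Fn&& call)
{
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

With an initialized FMOD::System and a playing stream, the callable could be a lambda such as [&] { channel->setPosition(5000, FMOD_TIMEUNIT_MS); } or [&] { system->playSound(sound, nullptr, false, &channel); }. Averaging over many runs is advisable, since a single measurement can be skewed by thread scheduling.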

Unless you are playing back the drum samples in a separate thread to your background music, or you loaded the stream with FMOD_NONBLOCKING, your background music shouldn’t fall out of sync with your drum timestamps, because calls to Channel::setPosition and System::playSound are blocking operations. If there are any associated visuals running in a separate thread, there could be 120 ms of latency between the visuals and audio in the worst-case scenario of a 3.2 sec decode buffer size, but otherwise the audio should stay in sync.

Thank you so much for your detailed and insightful answer! I apologize for the delayed response – I’ve been engrossed in a new project and only recently had the chance to revisit this.

Your explanation was very helpful. If I may, I’d like to summarize my understanding to ensure I’ve grasped the key points correctly:

  1. Restarting or seeking flushes the buffers, which introduces latency, unlike the initial playback.
  2. As long as the stream sound isn’t loaded with FMOD_NONBLOCKING and setPosition() and playSound() are called on the same thread, all sounds should remain in sync with each other, even if there’s overall latency.
  3. Overall latency can be reduced by loading sounds as FMOD_CREATESAMPLE where possible and by reducing the decode buffer size.

Have I understood these points correctly?

I do have one area of confusion I’d appreciate your clarification on. Earlier, you mentioned the principle of

“Long enough that you don’t notice stuttering, short enough that you don’t notice latency”

in relation to buffer sizes. This aligns with common programming wisdom, and I initially assumed it would apply to all buffer types, including the stream buffer size.

However, your explanation suggests that larger stream buffer sizes can actually reduce latency, which seems counterintuitive at first glance. The test results you provided do support this, but I’m curious about the underlying logic.

Could you explain why stream buffers work differently from DSP and decode buffers?
Is it analogous to having larger RAM or CPUs with bigger caches in terms of performance benefits?

Thank you again for your patience and expertise. I’m looking forward to understanding this concept more deeply.

That is correct.

They will stay in sync, in that their clocks will update at the same rate, but if you want them to start at the exact same time you should consider using ChannelControl::setDelay.
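
To make the setDelay suggestion concrete: ChannelControl::setDelay takes absolute DSP clock values measured in samples, so a lead time has to be converted before it is passed in. The sketch below shows only that conversion. startClockAfter is a hypothetical helper, and the parent clock and sample rate are assumed to come from ChannelControl::getDSPClock and System::getSoftwareFormat respectively.

```cpp
#include <cassert>

// Converts "start leadTimeMs from now" into an absolute DSP clock value.
// parentClock: current parent clock in samples (assumed to come from
//              ChannelControl::getDSPClock)
// sampleRate:  mixer rate in Hz (assumed to come from
//              System::getSoftwareFormat)
unsigned long long startClockAfter(unsigned long long parentClock,
                                   unsigned int leadTimeMs,
                                   unsigned int sampleRate)
{
    assert(sampleRate > 0);
    return parentClock
           + static_cast<unsigned long long>(leadTimeMs) * sampleRate / 1000ULL;
}
```

Passing the same result as the start clock for both the music channel and the drum channel, while both are still paused, should make them begin on the same sample. The lead time needs to be long enough to cover the blocking playSound or setPosition call.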

Also correct.

I dug through the calls and found the source of the stream’s time variation: there is an atomic flag for checking the stream’s state during the setPosition call. With smaller buffer sizes, other threads check this same flag more frequently, which makes the atomic read more likely to be blocked and so increased the average time to call setPosition in my test. At larger buffer sizes, the flag is checked less frequently, so our check of the stream’s state is less likely to be blocked.
So, setting the stream’s buffer size does not appear to affect time taken on subsequent calls to Channel::setPosition.

Thanks so much for clearing things up! It’s really helped me understand the whole picture.

I have one last question about your final statement:

So, setting the stream’s buffer size does not appear to affect time taken on subsequent calls to Channel::setPosition.

At first, this seemed to contradict your earlier explanations, but I’m wondering if you meant either:

  1. Setting a larger stream buffer size doesn’t affect the time taken for subsequent Channel::setPosition calls, or
  2. The stream buffer size doesn’t directly affect Channel::setPosition call times, but it does have an indirect effect through the atomic flag mechanism you described.

Oh, and does this also apply to playSound calls for restarting sounds?

This is a more precise explanation, thank you!

Yes, it does: playSound relies on many of the same internal processes as setPosition. It is roughly equivalent to calling setPosition(0).

Thank you!
