Increasing the Real Channels from 32 → 64 seems to have fixed most of these issues.
What I still don’t completely understand is why some sounds play something like 30 seconds after they should have played.
EDIT:
The 64-channel setting only showed up in the build after I added the Windows platform specifically; setting the Default platform to 64 didn’t work. Guessing it’s the same issue as here: Real Voice Channels always limited at 32.