I’m currently trying to convert a speech recognition solution to work with FMOD audio, rather than the built-in Unity Audio. I’m stumped at a point in the script where the Unity method AudioClip.GetData is used to retrieve sample data. Is there a way to extract the same data from FMOD sounds?
I’m aware of a method called ReadData in FMOD, but I was, unfortunately, unable to apply it to get the desired values.
I’m a bit new to working this deeply with FMOD, so any help is appreciated.
There is a way to get the sample data from an audio source, but it takes a bit of setup: we need to create a Sound to read the data from. Here is an example:
GetSampleData(string filePath)
// Requires: using System.Runtime.InteropServices; using UnityEngine;
/// <summary>
/// Returns the sample data of an audio source as a byte array, using its file path
/// </summary>
/// <param name="filePath">Full path to the audio file, including the file name</param>
/// <returns>The raw sample bytes, or null if any FMOD call fails</returns>
private byte[] GetSampleData(string filePath)
{
    // Very useful tool for debugging FMOD function calls
    FMOD.RESULT result;
    // Sound variable to retrieve the sample data from
    FMOD.Sound sound;

    // Creating the sound using the file path of the audio source
    // Make sure to create the sound using MODE.CREATESAMPLE | MODE.OPENONLY so the sample data can be retrieved
    result = FMODUnity.RuntimeManager.CoreSystem.createSound(filePath, FMOD.MODE.CREATESAMPLE | FMOD.MODE.OPENONLY, out sound);
    // Debug the results of the FMOD function call to make sure it got called properly
    if (result != FMOD.RESULT.OK)
    {
        Debug.Log("Failed to create sound with the result of: " + result);
        return null;
    }

    // Retrieving the length of the sound in PCM bytes (samples * channels * bytes per sample)
    // so the buffers are large enough to hold the whole sound
    result = sound.getLength(out uint length, FMOD.TIMEUNIT.PCMBYTES);
    if (result != FMOD.RESULT.OK)
    {
        Debug.Log("Failed to retrieve the length of the sound with result: " + result);
        sound.release();
        return null;
    }

    // Buffer which the sample data will be copied to
    // This pointer must be released at the end of the function!
    System.IntPtr buffer = Marshal.AllocHGlobal((int)length);

    // Retrieving the sample data to the pointer using the full length of the sound
    result = sound.readData(buffer, length, out uint read);
    if (result != FMOD.RESULT.OK)
    {
        Debug.Log("Failed to retrieve data from sound: " + result);
        // Release the native resources before bailing out
        Marshal.FreeHGlobal(buffer);
        sound.release();
        return null;
    }

    // Creating the return array, which holds the sample data in a readable variable type
    // Sized to the number of bytes that were actually read
    byte[] byteArray = new byte[(int)read];
    // Copying the data from our pointer to our usable array
    Marshal.Copy(buffer, byteArray, 0, (int)read);

    // Releasing the pointer and the sound
    Marshal.FreeHGlobal(buffer);
    sound.release();

    // Returning the array populated with the sample data to be used
    return byteArray;
}
Hopefully, calling Sound::readData this way will get you the values you are looking for!
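Since AudioClip.GetData hands back floats in the range -1 to 1, the raw bytes usually need one more conversion step before a speech recognition script can consume them. The following is only a minimal sketch: it assumes the byte array contains interleaved 16-bit little-endian PCM (check Sound::getFormat for the real format), and the class and method names are hypothetical, not part of FMOD or Unity.

```csharp
using System;

public static class PcmConversion
{
    // Converts interleaved 16-bit little-endian PCM bytes into floats in [-1, 1).
    // Assumption: the input really is PCM16; any trailing odd byte is ignored.
    public static float[] Pcm16ToFloat(byte[] pcmBytes)
    {
        int sampleCount = pcmBytes.Length / 2;
        float[] samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            // Combine low byte and high byte into one signed 16-bit sample
            short value = (short)(pcmBytes[2 * i] | (pcmBytes[2 * i + 1] << 8));
            // Scale to the float range that AudioClip.GetData uses
            samples[i] = value / 32768f;
        }
        return samples;
    }
}
```

For example, float[] samples = PcmConversion.Pcm16ToFloat(GetSampleData(path)); would give one float per sample, with the channels still interleaved.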
Hello,
I have the exact same issue as the original poster. I want to make use of the audio clip data in Unity, but with FMOD.
I want to use the audio clip’s data to drive correct mouth synchronization in a dialogue system.
I have tried your approach, which seems like the right one.
First of all, from which location do I get the file path? I’ve tried several file paths based on my folder structure already, but with no luck. What’s the correct path structure? Can you give me an example?
Also, I don’t seem to understand why your example requires a file path. Why not just use the event path instead to retrieve the audioclip?
The file path I am using is "C:\Users\XXX\Music\Epic SciFi Music.wav", but this was purely for the example. There are different ways to create a sound, which are documented under FMOD API | Core API Reference.
In the example, the DSP is added to the MasterChannelGroup, so that is the only channel group needed, but there are a number of ways to retrieve a channel. Could you elaborate on what you mean by your voice channel? Could you share a screenshot of it in your FMOD Studio project?
Hi,
Thanks for your quick reply. Much appreciated.
Oh, I had no idea that I needed the full path including the filename. Just wondering: won’t this break the code once the game is built?
Also, I can’t really see the benefit of this approach, because I would then have to store hundreds of dialogue paths somewhere, and they are basically already stored in the FMOD project.
About the DSP: maybe I did it wrong. I’m still trying to get my head around it. I want to filter out all other audio and keep only the Voice for the facial animations, because I don’t want background noise to interfere.
Maybe I need a different approach?
Thanks for your reply, but it didn’t really help me.
Everything works perfectly on the master channel group, but I only want the “Voice” channel.
I’ve tried to get the channel group via the bus, based on the path to the voice channel. The Voice bus exists; I know that for a fact because I debugged it.
However, it won’t let me add a DSP to that channel group for some reason. addDSP requires an index, but I’m confused about which index it expects and what it’s for. I’ve tried several indexes, such as [-3,-2,-1,0,1,2,30], all with the same outcome.
Apart from the Start function, I’ve also tried adding the DSP at runtime while the NPC is talking, but it prints the same error.
The issue with this method is that the channelGroup may not be ready when you try to assign the DSP to it. To resolve this we need three commands:
bus.lockChannelGroup() ensures that the channelGroup is created and stays in memory; it will have to be unlocked later. (FMOD API | Studio API Reference)
FMODUnity.RuntimeManager.StudioSystem.flushCommands() forces all pending commands to be processed by the Studio System before continuing. (FMOD API | Studio API Reference)
FMODUnity.RuntimeManager.StudioSystem.flushSampleLoading() forces all sample loading and unloading to complete so our calls can be run. (FMOD API | Studio API Reference)
if (FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out currentMAster) == FMOD.RESULT.OK)
{
    // Make sure the Voice bus has a live channel group before continuing
    FMODUnity.RuntimeManager.StudioSystem.getBus("bus:/In-World/Voice", out bus);
    bus.lockChannelGroup();
    FMODUnity.RuntimeManager.StudioSystem.flushCommands();
    FMODUnity.RuntimeManager.StudioSystem.flushSampleLoading();
    bus.getChannelGroup(out ChannelGroup group);

    if (FMODUnity.RuntimeManager.CoreSystem.createDSP(ref desc, out mCaptureDSP) == FMOD.RESULT.OK)
    {
        if (!currentMAster.hasHandle())
        {
            Debug.Log("No master");
            return;
        }

        FMOD.RESULT result = currentMAster.addDSP(0, mCaptureDSP);
        Debug.Log(result.ToString());
        if (result != FMOD.RESULT.OK)
        {
            Debug.LogWarningFormat("FMOD: Unable to add mCaptureDSP to the master channel group");
        }
        else
            success = true;
    }
    else
    {
        Debug.LogWarningFormat("FMOD: Unable to create a DSP: mCaptureDSP");
    }
}
else
{
    Debug.LogWarningFormat("FMOD: Unable to get the master channel group");
}
Thanks for your reply! Unfortunately, it’s still not working. I’m still getting the master channel’s audio just like before, with no errors.
I’m wondering why you are assigning “group”, if it isn’t going to be used. I did try to replace “currentMaster” with “group” but then the debug warning error shows up again. (Recent screenshot)
Is it possible to get the Sound from a bank instead of reading it from disk?
Unfortunately, it is not possible to get the sound from the bank. However, you can get the sound from the event using a callback: FMOD Engine | Studio API Reference - FMOD_STUDIO_EVENT_CALLBACK_SOUND_PLAYED. Let me know if this will work.
// Retrieving the length of the sound in bytes, i.e. PCM samples * channels * data width (e.g. 16-bit = 2 bytes)
result = sound.getLength(out uint dataLength, FMOD.TIMEUNIT.PCMBYTES);
// Creating a pointer to newly allocated memory
// This pointer must be released at the end of use
IntPtr data = Marshal.AllocHGlobal((int)dataLength * sizeof(byte));
// Retrieving the sample data to the pointer using the full length of the sound
result = sound.readData(data, dataLength, out uint read);
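To make the relationship in the first comment concrete, here is a small sketch of the arithmetic behind a PCM byte length, and of why a length in milliseconds is far too small to size a sample buffer. The helper names are hypothetical and the code does not call FMOD at all; it only assumes uncompressed PCM with a whole number of bytes per sample.

```csharp
using System;

public static class PcmSizing
{
    // Bytes needed for a given number of PCM sample frames:
    // samples * channels * bytes per sample
    public static long PcmBytes(long pcmSamples, int channels, int bitsPerSample)
    {
        return pcmSamples * channels * (bitsPerSample / 8);
    }

    // Converts a duration in milliseconds to the equivalent PCM byte count
    public static long MillisecondsToPcmBytes(long milliseconds, int sampleRate, int channels, int bitsPerSample)
    {
        long pcmSamples = milliseconds * sampleRate / 1000;
        return PcmBytes(pcmSamples, channels, bitsPerSample);
    }
}
```

One second of 48 kHz stereo 16-bit audio is 1000 when measured in milliseconds but 192000 when measured in PCM bytes, so a buffer sized from the millisecond length would hold only a tiny fraction of the sound.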
I tried getting the audio data from the callback, but without luck.
[AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
public static unsafe FMOD.RESULT DialogueEventCallback(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
{
    switch (type)
    {
        case FMOD.Studio.EVENT_CALLBACK_TYPE.SOUND_PLAYED:
            // 1. Create sound with constructor
            FMOD.Sound sound = new FMOD.Sound(parameterPtr);
            FMOD.RESULT result = sound.getLength(out uint dataLength, FMOD.TIMEUNIT.PCMBYTES);
            IntPtr data = Marshal.AllocHGlobal((int)dataLength * sizeof(byte));
            result = sound.readData(data, dataLength, out uint read);

            // 2. Create sound with createSound()
            FMOD.Sound sound2;
            FMOD.CREATESOUNDEXINFO exinfo = new FMOD.CREATESOUNDEXINFO();
            exinfo.cbsize = sizeof(FMOD.CREATESOUNDEXINFO);
            exinfo.length = dataLength;
            FMODUnity.RuntimeManager.CoreSystem.createSound(parameterPtr, FMOD.MODE.CREATESAMPLE | FMOD.MODE.OPENONLY | FMOD.MODE.OPENMEMORY_POINT, ref exinfo, out sound2);
            break;
    }
    return FMOD.RESULT.OK;
}
When I create the sound with the constructor from parameterPtr, readData() returns ERR_UNSUPPORTED.
When I create the sound with createSound(), it throws errors and sound2 is null.
[FMOD] CodecOggVorbis::openInternal : failed to open as ogg
[FMOD] CodecMOD::openInternal : 'M.K.' etc ID check failed [????]
[FMOD] CodecS3M::openInternal : 'SCRM' ID check failed [
[FMOD] CodecXM::openInternal : 'Extended Module: ' ID check failed [`□??]
[FMOD] CodecIT::openInternal : 'IMPM' etc ID check failed [`□??]
[FMOD] CodecMIDI::openInternal : 'HThd' ID check failed [`□??]
[FMOD] CodecMPEG::openInternal : failed to open as mpeg
Would it be possible to elaborate on the expected behaviour?
The issue may be the use of a deprecated function. Could you try something like this:
case FMOD.Studio.EVENT_CALLBACK_TYPE.SOUND_PLAYED:
    // Retrieve the sound
    FMOD.Sound sound = new FMOD.Sound(parameterPtr);
    // Retrieve the length in bytes
    FMOD.RESULT result = sound.getLength(out uint length, FMOD.TIMEUNIT.PCMBYTES);
    if (result == FMOD.RESULT.OK)
    {
        // Create the data array, sized to the length in bytes
        byte[] data = new byte[length];
        result = sound.readData(data);
        if (result == FMOD.RESULT.OK)
        {
            Debug.Log("Read the data");
        }
        else
        {
            Debug.Log($"Failed to read data with: {result}");
        }
        return FMOD.RESULT.OK;
    }
The issue here is combining too many modes together. If you just use FMOD.MODE.OPENONLY you should be able to successfully create the sound:
FMODUnity.RuntimeManager.CoreSystem.createSound(data, FMOD.MODE.OPENONLY, ref exinfo, out sound2);
Hello,
I have the same version as in your screenshot, and Unity 6000.5.
The expected behaviour is to get all the audio data from the played sound and use it as the source for haptic feedback in our library. This already works from a file, but it would be better to get it from the event callback, both for better compatibility with PS5 behaviour and so that we can use the bank event instead of a file name.
I rewrote the code according to your advice, but I am still not able to get the audio data.
[AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
public static unsafe FMOD.RESULT DialogueEventCallback(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
{
    switch (type)
    {
        case FMOD.Studio.EVENT_CALLBACK_TYPE.SOUND_PLAYED:
            // 1. Get audio data from the Sound created from parameterPtr, using readData
            FMOD.Sound sound = new FMOD.Sound(parameterPtr);
            FMOD.RESULT result = sound.getLength(out uint dataLength, FMOD.TIMEUNIT.PCMBYTES);
            if (result == FMOD.RESULT.OK)
            {
                //IntPtr data = Marshal.AllocHGlobal((int)dataLength * sizeof(byte));
                //result = sound.readData(data, dataLength, out uint read);
                // Create the data array, sized to the length in bytes
                byte[] data = new byte[dataLength];
                result = sound.readData(data);
                Debug.Log($"readData result:{result}");
            }

            // 2. Get audio data from a Sound created by createSound(), using readData
            result = sound.getFormat(out FMOD.SOUND_TYPE soundType, out FMOD.SOUND_FORMAT soundFormat, out int channels, out int bits);
            result = sound.getDefaults(out float frequency, out int priority);
            FMOD.Sound sound2;
            FMOD.CREATESOUNDEXINFO exinfo = new FMOD.CREATESOUNDEXINFO();
            exinfo.cbsize = sizeof(FMOD.CREATESOUNDEXINFO);
            exinfo.length = dataLength;
            exinfo.format = soundFormat;
            exinfo.suggestedsoundtype = soundType;
            exinfo.numchannels = channels;
            exinfo.defaultfrequency = (int)frequency;
            result = FMODUnity.RuntimeManager.CoreSystem.createSound(parameterPtr, FMOD.MODE.OPENONLY, ref exinfo, out sound2);
            Debug.Log($"createSound result:{result}");

            // 3. Get audio data from the Sound created from parameterPtr, accessing the data directly with lock/unlock
            IntPtr dataZero = IntPtr.Zero;
            IntPtr dataNative = Marshal.AllocHGlobal((int)dataLength * sizeof(byte));
            result = sound.@lock(0, dataLength, out dataNative, out dataZero, out uint read, out uint length2);
            Debug.Log($"lock result:{result} read:{read}");
            result = sound.unlock(dataNative, dataZero, read, length2);
            break;
    }
    return FMOD.RESULT.OK;
}
sound.readData(data) returns ERR_UNSUPPORTED
createSound() returns ERR_INTERNAL
sound.@lock() returns ERR_INVALID_PARAM
Info about sound from callback:
soundFormat - BITSTREAM
soundType - FSB
channels - 4
frequency - 48000
If I read audio data from file (wav/ogg) the info is:
soundFormat - PCM16/PCM16
soundType - WAV/OGGVORBIS
I tested both WAV and OGG files as the source for the bank in FMOD Studio.
Sorry, I am still a bit confused. You are creating a new sound from the sound that is being passed back from the callback? Does the original sound retrieved from FMOD.Sound sound = new FMOD.Sound(parameterPtr); not have the data you need?
Would something like this work?
[AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
public static FMOD.RESULT DialogueEventCallback(FMOD.Studio.EVENT_CALLBACK_TYPE type, IntPtr instancePtr, IntPtr parameterPtr)
{
    switch (type)
    {
        case FMOD.Studio.EVENT_CALLBACK_TYPE.SOUND_PLAYED:
            // 1. Get the Sound from parameterPtr and retrieve its length
            FMOD.Sound sound = new FMOD.Sound(parameterPtr);
            FMOD.RESULT result = sound.getLength(out uint dataLength, FMOD.TIMEUNIT.MS); // <- Changed to the length of the audio in MS
            if (result != FMOD.RESULT.OK)
            {
                Debug.Log($"Failed to get length with result: {result}");
                return result;
            }

            // 3. Get audio data from the Sound created from parameterPtr, accessing the data directly with lock/unlock
            // lock() hands back pointers into FMOD's own memory, so nothing needs to be allocated here
            result = sound.@lock(0, dataLength, out IntPtr dataNative, out IntPtr dataZero, out uint read, out uint length2);
            if (result != FMOD.RESULT.OK)
            {
                Debug.Log($"Failed to lock sound with result: {result}");
                return result;
            }
            else
                Debug.Log($"lock result:{result} read:{read}");

            result = sound.unlock(dataNative, dataZero, read, length2);
            if (result != FMOD.RESULT.OK)
            {
                Debug.Log($"Failed to unlock sound with result: {result}");
                return result;
            }
            break;
    }
    return FMOD.RESULT.OK;
}
The Sound created from parameterPtr doesn’t give me the data, so I tried a different approach.
The code you posted runs without error, so the problem was in dataLength.
When I change the TIMEUNIT to RAWBYTES, I am still able to read all the bytes.
That makes sense, because for the BITSTREAM FSB format TIMEUNIT.PCMBYTES returns a different length.
Now the question is: what format is the data in?
I need the data as source for WAVEFORMATEX.
Hi, since Connor is away, I will be continuing the investigation.
Thank you for sharing the information.
From the information you provided, it seems like you are requesting PCM16, so the format of the data would be PCM16, which is FMOD’s standard uncompressed format for working with audio data.
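If the data really is PCM16, the fields of a WAVEFORMATEX can be derived from the channel count and sample rate alone. Below is a minimal sketch of that derivation; the struct and its Create helper are hypothetical illustrations that mirror the Win32 field layout, not part of FMOD, and they assume plain uncompressed PCM.

```csharp
using System;

// Hypothetical managed mirror of the Win32 WAVEFORMATEX fields
public struct WaveFormatPcm
{
    public ushort FormatTag;       // wFormatTag: 1 = WAVE_FORMAT_PCM
    public ushort Channels;        // nChannels
    public uint SamplesPerSec;     // nSamplesPerSec
    public uint AvgBytesPerSec;    // nAvgBytesPerSec = nSamplesPerSec * nBlockAlign
    public ushort BlockAlign;      // nBlockAlign = nChannels * wBitsPerSample / 8
    public ushort BitsPerSample;   // wBitsPerSample
    public ushort Size;            // cbSize: 0 for plain PCM

    public static WaveFormatPcm Create(int channels, int sampleRate, int bitsPerSample)
    {
        // One block is a single sample frame across all channels
        ushort blockAlign = (ushort)(channels * bitsPerSample / 8);
        return new WaveFormatPcm
        {
            FormatTag = 1,
            Channels = (ushort)channels,
            SamplesPerSec = (uint)sampleRate,
            AvgBytesPerSec = (uint)(sampleRate * blockAlign),
            BlockAlign = blockAlign,
            BitsPerSample = (ushort)bitsPerSample,
            Size = 0
        };
    }
}
```

With the values reported from the callback (4 channels at 48000 Hz) and 16 bits per sample, WaveFormatPcm.Create(4, 48000, 16) gives a block align of 8 and 384000 average bytes per second. For layouts with more than two channels it may be worth checking whether the consuming API expects WAVEFORMATEXTENSIBLE instead.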
Hope this helps, let me know if you have any questions.