Microphone and Pitch

Hello, I’m having a hard time getting the same value from FMOD that I get from Unity.
The idea is to record the microphone and get the pitch. In Unity I was using FftSharp:

int micPosition = Microphone.GetPosition(m_SelectedMicrophone);

if (micPosition > 0)
{
    float[] samples = new float[sampleSize];

    int startPosition = micPosition - samples.Length;
    if (startPosition < 0)
        startPosition = 0;

    m_AudioSource.clip.GetData(samples, startPosition);

    double[] fftMagnitude = FftSharp.Transform.FFTmagnitude(samples.Select(s => (double)s).ToArray());

    double maxMagnitude = fftMagnitude.Max();
    int maxIndex = fftMagnitude.ToList().IndexOf(maxMagnitude);

    double frequency = (double)maxIndex * AudioSettings.outputSampleRate / sampleSize;
    Debug.Log(frequency);

    if (frequency > 80 && frequency < 1100)
        m_LastPitch = frequency;
    else
        m_LastPitch = 0;
}

But adding the FFT DSP to the channel that records the microphone in FMOD, then asking for the dominant frequency, doesn’t give the same result (something like 3800 vs 1000).
Here is the code I use for microphone recording (it is just the FMOD sample):

    void Start()
    {
        fmodSystem = FMODUnity.RuntimeManager.CoreSystem;

        /*
            Determine latency in samples.
        */
        string name = "";
        FMOD.SPEAKERMODE speakerMode;
        FMOD.DRIVER_STATE driverState;

        FMODUnity.RuntimeManager.CoreSystem.getRecordDriverInfo(id, out name, 30, out _, out nativeRate, out speakerMode, out nativeChannels, out driverState);
            
        driftThreshold = (uint)(nativeRate * DRIFT_MS) / 1000;
        desiredLatency = (uint)(nativeRate * LATENCY_MS) / 1000;
        adjustLatency = desiredLatency;
        actualLatency = (int)desiredLatency;

        /*
            Create user sound to record into, then start recording.
        */
        exInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));
        exInfo.numchannels = nativeChannels;
        exInfo.format = FMOD.SOUND_FORMAT.PCM16;
        exInfo.defaultfrequency = nativeRate;
        exInfo.length = (uint)(nativeRate * sizeof(short) * nativeChannels);

        FMODUnity.RuntimeManager.CoreSystem.createSound("", FMOD.MODE.LOOP_NORMAL | FMOD.MODE.OPENUSER, ref exInfo, out recSound);

        FMODUnity.RuntimeManager.CoreSystem.recordStart(id, recSound, true);

        recSound.getLength(out recSoundLength, FMOD.TIMEUNIT.PCM);
        
    }

I even tried to analyze the sound object directly, since I also wanted to get the pitch from a file that isn’t playing. But nothing works; or rather, I can’t make anything work.

Any input?

Thanks and have a nice day!

Apologies for the delayed response.

Unless I’m looking at the wrong FftSharp library, it seems like your code snippets don’t match the current FftSharp API: GitHub - swharden/FftSharp: A .NET Standard library for computing the Fast Fourier Transform (FFT) of real or complex data

If I’ve got the wrong library, could I get you to link the library you’re using? Could I also get you to post an extended code snippet, preferably a simple class, that you’re using to test both FFTs so I can more easily reproduce the behavior you’re talking about on my end?

Hello, no problem :)
I didn’t want to paste the whole file, which is mostly the FMOD sample, so a link feels less intrusive:
https://smalldev.tools/share-bin/w46tDcyr

To be fair, the main problem is that the values I get with this code are far from human voice frequencies.

On the other hand, the “Unity version” is doing pretty well. But yes, I have to use FftSharp 1.10 (maybe I could use a slightly higher version) because of Unity’s C# support.

In the end I guess FMOD’s DOMINANT_FREQUENCY should work and I’m just doing something wrong. Maybe it’s the fact that I’m not adding the DSP each time? I think I saw in a post on this forum that this was necessary, but it wasn’t confirmed by anyone official.

Or do you think I should try to get the byte buffer from the recording?

Have a nice day!

After some more testing, I haven’t been able to pinpoint any issues with the FMOD FFT DSP - dominant frequencies reported by the DSP line up with the input on my end.

The behavior you’re observing may come down to a difference in FFT settings (e.g. the window), or to how the sample data is handled or converted before being passed to FftSharp. I’d recommend testing a number of different waves with known pitches as inputs on your end to figure out exactly what is going on.
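
As a rough sketch of that kind of test, something like the following could work. It is not FMOD or FftSharp code: the `PitchCheck` class, its method names, and the plain (slow) DFT are placeholders made up for illustration, so that the check has no dependencies. It generates a sine wave of known pitch, applies a Hann window, and maps the strongest bin back to a frequency.

```csharp
using System;

// Hypothetical helper for illustration only; not part of FMOD, Unity, or FftSharp.
public static class PitchCheck
{
    // Generate n samples of a sine wave at frequencyHz, sampled at sampleRate.
    public static double[] Sine(double frequencyHz, int sampleRate, int n)
    {
        var samples = new double[n];
        for (int i = 0; i < n; i++)
            samples[i] = Math.Sin(2.0 * Math.PI * frequencyHz * i / sampleRate);
        return samples;
    }

    // Multiply the samples by a Hann window to reduce spectral leakage.
    public static double[] ApplyHann(double[] samples)
    {
        int n = samples.Length;
        var windowed = new double[n];
        for (int i = 0; i < n; i++)
            windowed[i] = samples[i] * 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / (n - 1)));
        return windowed;
    }

    // Plain O(n^2) DFT: returns the frequency (Hz) of the strongest bin in 1..n/2.
    public static double DominantFrequency(double[] samples, int sampleRate)
    {
        int n = samples.Length;
        int bestBin = 0;
        double bestMagnitude = 0.0;
        for (int k = 1; k <= n / 2; k++) // skip bin 0 (DC offset)
        {
            double re = 0.0, im = 0.0;
            for (int i = 0; i < n; i++)
            {
                double angle = 2.0 * Math.PI * k * i / n;
                re += samples[i] * Math.Cos(angle);
                im -= samples[i] * Math.Sin(angle);
            }
            double magnitude = Math.Sqrt(re * re + im * im);
            if (magnitude > bestMagnitude)
            {
                bestMagnitude = magnitude;
                bestBin = k;
            }
        }
        // sampleRate must be the rate the samples were actually captured at.
        return (double)bestBin * sampleRate / n;
    }
}
```

For a 440 Hz sine at 48000 Hz with 2048 samples, the bin width is 48000 / 2048 ≈ 23.4 Hz, so `PitchCheck.DominantFrequency(PitchCheck.ApplyHann(PitchCheck.Sine(440.0, 48000, 2048)), 48000)` should land within one bin of 440 Hz. Since the result is bin × sample rate / size, passing a sample rate other than the one the audio was recorded at scales the reported frequency by the ratio of the two rates, which is worth ruling out when two FFT paths disagree.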

As an aside, I believe (potentially depending on your Unity version) the latest versions of FftSharp should be supported by Unity, since the library can be built against .NET Standard 2.0.