Custom DSP performance

I’m using a custom DSP to draw the real-time waveform of a playing MP3 in a UWP application.
The read callback is currently defined as:

    private unsafe RESULT dspReadCBDo(ref DSP_STATE dsp_state, IntPtr inbuffer, IntPtr outbuffer,
        uint length, int inchannels, ref int outchannels)
    {
        float* bufferIN = (float*)inbuffer;
        float* bufferOUT = (float*)outbuffer;
        for (int i = 0; i < length; i++)
        {
            for (int c = 0; c < outchannels; c++)
            {
                var val = bufferIN[i * inchannels + c];
                bufferOUT[i * outchannels + c] = val;  // pass the sample through to the output
                bufferWave[c][i] = val;                // keep a per-channel copy for drawing
            }
        }
        return FMOD.RESULT.OK;
    }

bufferWave is a float[][] that I use later in the code to draw the wave.
Is this the best way to do it, or can I get better performance? If I just set outbuffer = inbuffer, no data reaches the output buffer and no sound is played. Does every single sample have to be processed for the DSP to work? Is it possible to just read the signal and store it in the bufferWave array without assigning every float to outbuffer?

You can just do memcpy(outbuffer, inbuffer, length * inchannels * sizeof(float)); if the input and output channel counts are the same, which they usually are (unless you specifically change the output format of the DSP).
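As a rough illustration of that suggestion, here is a minimal C++ sketch of the copy. The function name passthrough_copy is hypothetical and stands in for the body of a read callback; it is not part of FMOD. It assumes interleaved float samples and that length counts sample frames, as in the callback above. In unsafe C# code, Buffer.MemoryCopy can play a similar role to memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not an FMOD function): copies the input block straight
// to the output block. Buffers hold interleaved floats; `length` is the number
// of sample frames, so the block holds length * inchannels floats.
// Returns false when the channel counts differ, because a flat copy would
// then misalign the channels and a per-sample loop is needed instead.
bool passthrough_copy(const float* inbuffer, float* outbuffer,
                      unsigned int length, int inchannels, int outchannels)
{
    assert(inbuffer != nullptr && outbuffer != nullptr);
    if (inchannels != outchannels) {
        return false;
    }
    std::memcpy(outbuffer, inbuffer,
                static_cast<std::size_t>(length) * inchannels * sizeof(float));
    return true;
}
```

The same kind of single copy can fill the drawing buffer too, provided that buffer is also stored interleaved rather than as one array per channel.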

Greetings, I’m in the process of drawing the real-time wave as well. With this technique, how do you access the output buffer in the main thread?

Just write to a shared memory buffer in the DSP callback, and read from the same buffer in the main thread. If you’re only displaying the buffer on the screen, it won’t matter if you display the same buffer more than once, or if you miss some.

To avoid tearing, you would probably lock the code that reads and writes with a critical section/mutex. Because you’d only be protecting one memcpy, it should be fast enough. (An alternative solution is to double buffer.)
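The locked shared buffer could be sketched in C++ roughly as follows. The class SharedWaveBuffer and its method names are hypothetical, invented for illustration; the sketch assumes the DSP callback calls write() and the main (drawing) thread calls read(), each holding the mutex only for the duration of one memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical shared buffer between the DSP (audio) thread and the main thread.
class SharedWaveBuffer {
public:
    explicit SharedWaveBuffer(std::size_t capacity)
        : data_(capacity, 0.0f), valid_(0) {}

    // Called from the DSP callback: store the latest block, truncated to capacity.
    void write(const float* samples, std::size_t count) {
        assert(samples != nullptr || count == 0);
        std::lock_guard<std::mutex> lock(mutex_);
        valid_ = count < data_.size() ? count : data_.size();
        if (valid_ > 0) {
            std::memcpy(data_.data(), samples, valid_ * sizeof(float));
        }
    }

    // Called from the main thread: copy out the latest block.
    // Returns the number of samples copied (at most max_count).
    std::size_t read(float* dest, std::size_t max_count) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t n = valid_ < max_count ? valid_ : max_count;
        if (n > 0) {
            std::memcpy(dest, data_.data(), n * sizeof(float));
        }
        return n;
    }

private:
    std::mutex mutex_;         // guards data_ and valid_
    std::vector<float> data_;  // most recent block of samples
    std::size_t valid_;        // number of valid samples in data_
};
```

Reading the same block twice, or missing a block between two reads, is harmless here because the data is only drawn. A double-buffered variant would swap two buffers instead of copying under the lock.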