I find both of your claims hard to believe, as OSX has a sound architecture similar to PulseAudio.
Are the 3ms measured or is it just what OSX tells you ?
A buffer of 3ms at 48kHz holds 144 samples. That means shoveling 144 fresh samples (per channel, of course) ~333 times a second and sending them to the sound card immediately. That may be possible if your sound card supports resampling (and a bit of magic), and only without a sound server (OSX uses one). Either that or you have an impressive CPU. Feel free to correct me at any point.
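The buffer arithmetic above can be sanity-checked in a few lines (a minimal sketch; all figures come straight from the comment):

```python
# Sanity-check: how many samples fit in a 3ms buffer at 48kHz,
# and how often that buffer must be refilled.
SAMPLE_RATE = 48_000  # Hz
BUFFER_MS = 3         # claimed output latency

samples_per_buffer = SAMPLE_RATE * BUFFER_MS // 1000   # samples per channel
buffers_per_second = SAMPLE_RATE / samples_per_buffer  # refills per second

print(samples_per_buffer)         # 144
print(round(buffers_per_second))  # 333
```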
edit: PS This covers only the program -> sound card part. Programs themselves add a ton of latency, and sound cards add to it as well. In reality, even 10ms is beyond what perfect conditions allow.
The problem isn't the CPU; the problem is the OS going out for lunch, and then designs that assume the OS will go out for lunch (i.e. deep buffers everywhere)... The CPU can move data between devices with sub-microsecond latencies...
I thought I'd try to measure this at some resolution (since my phone can capture at 120 FPS, I should be able to resolve roughly 8-millisecond increments in a video).
Stepping frame by frame through the video I took (https://youtu.be/IHmC-q_iPiE) at the point where I press the key to sound a note, there's a 4 to 5 frame latency until I can visually see the speaker membrane move, so that's about 33 to 41 milliseconds.
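The frame-count-to-milliseconds conversion works out as follows (a quick sketch using the 120 FPS figure from the comment):

```python
# Convert the measured 4-5 frame delay at 120 FPS into milliseconds.
FPS = 120
frame_ms = 1000 / FPS       # duration of one frame, ~8.33 ms

latency_low = 4 * frame_ms  # ~33.3 ms
latency_high = 5 * frame_ms # ~41.7 ms
print(f"{latency_low:.1f} to {latency_high:.1f} ms")  # 33.3 to 41.7 ms
```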