When programming my audio-processing algorithms on the Teensy 3.6, I sometimes find that the math operations are much slower than I expected. Yes, some things are super-fast: arithmetic, FIR filters, FFT. But why did my code with the logarithm run so slowly? Well, it turns out that I was using the wrong function call. If you use the float-specific version of your math function, you get tremendously faster floating-point speeds. Don't rely on the compiler; be explicit and call it yourself!
As a Matlab programmer, I generally type out the generic form of the function. Being naive, I had assumed that the compiler would detect my data type and automatically substitute the same float-specific function that I would have called myself. Apparently, I was wrong. Way wrong.
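For example, here's the difference in the call itself (a minimal illustration I wrote for this post, not my actual code):

```cpp
#include <math.h>

void demo(void) {
  float x = 10.0f;

  // Generic version: with only the C prototype double log(double) in
  // scope, x is promoted to double and the math is done in double
  // precision (which is all software on the Teensy 3.6).
  float y_generic = log(x);

  // Float-specific version: single precision all the way through.
  float y_float = logf(x);

  (void)y_generic; (void)y_float;  // silence unused-variable warnings
}
```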
Interestingly, the logarithm and exponential/power functions have no dedicated hardware acceleration on the Teensy. Yet, when using the explicitly float-specific versions, they still see a 10x increase in speed. Stunning.
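Here is a minimal timing sketch of the kind that shows this (an illustrative rewrite, not my exact test code; the real code is linked in the Tech Info below):

```cpp
#include <Arduino.h>
#include <math.h>

const int N_TRIALS = 10000;
volatile float x_in = 1.2345f;  // volatile so the compiler can't hoist the call
volatile float sink;            // volatile so the loop isn't optimized away

void setup() {
  Serial.begin(115200);
  delay(1000);

  // Time the generic (double-precision) version.
  unsigned long t0 = micros();
  for (int i = 0; i < N_TRIALS; i++) sink = log(x_in);
  unsigned long t_generic = micros() - t0;

  // Time the float-specific (single-precision) version.
  t0 = micros();
  for (int i = 0; i < N_TRIALS; i++) sink = logf(x_in);
  unsigned long t_float = micros() - t0;

  Serial.print("log():  ");
  Serial.print((float)t_generic / N_TRIALS, 3);
  Serial.println(" usec per call");
  Serial.print("logf(): ");
  Serial.print((float)t_float / N_TRIALS, 3);
  Serial.println(" usec per call");
}

void loop() {}
```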
Why are the explicitly floating-point versions so much faster? I don't know. But I sure as heck am going to make sure that all my future code uses them.
Tech Info: This data was measured using a Teensy 3.6 running at 180 MHz. It was programmed from the Arduino IDE (1.6.13) via the Teensyduino (1.35) add-on. My code is on my GitHub here.
Follow-Up: Why is the float-specific version ("logf()") so much faster than the generic version ("log()")? I posted this question to the Teensy Forum. Their replies (here) were pretty definitive. Under the hood, the generic version (i.e., "log()") does its calculations as double-precision floating-point math, which is all in software. By contrast, the float-specific version ("logf()") does single-precision floating-point math, which on the Teensy is done with hardware acceleration. That explains why the float-specific version is so much faster.
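In other words (my own sketch of the forum's explanation, not code from the thread), a generic call with a float argument effectively becomes:

```cpp
#include <math.h>

// The relevant C prototypes:
//   double log(double);   // double precision -> software math on the Teensy 3.6
//   float  logf(float);   // single precision -> arithmetic runs on the Cortex-M4F FPU

float slow_log(float x) {
  // What the generic call does with a float argument: promote to double,
  // compute in double-precision software, then truncate back to float.
  return (float)log((double)x);
}

float fast_log(float x) {
  // The float-specific call stays in single precision throughout.
  return logf(x);
}
```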