The reciprocal square root should always be invoked explicitly as rsqrtf() for single precision and rsqrt() for double precision. The compiler optimizes 1.0f/sqrtf(x) into rsqrtf() only when this does not violate IEEE-754 semantics.
Copyright © 2011 NVIDIA Corporation | www.nvidia.com