Other Arithmetic Instructions

Note: Low Priority: Avoid automatic conversion of doubles to floats.

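The compiler must on occasion insert conversion instructions, introducing additional execution cycles. This is the case for functions operating on char or short, whose operands generally need to be converted to an int, and for double-precision floating-point constants (defined without any type suffix) used as input to single-precision floating-point computations.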

The latter case can be avoided by using single-precision floating-point constants, defined with an f suffix such as 3.141592653589793f, 1.0f, 0.5f. This suffix has accuracy implications in addition to its ramifications on performance. The effects on accuracy are discussed in ../chapters/chapter7.html. Note that this distinction is particularly important to performance on devices of compute capability 2.x.
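As a minimal sketch of the constant-suffix point, the kernel below multiplies each element by 0.5f rather than 0.5. The kernel name halve and the host helper halve_check are hypothetical names introduced only for this illustration, and error handling is kept to the bare minimum.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: halves every element of x.
__global__ void halve(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Writing 0.5 here would make the constant a double, so x[i] would be
        // converted to double, multiplied, and the product converted back to
        // float. The f suffix keeps the whole expression in single precision.
        x[i] = x[i] * 0.5f;
    }
}

// Hypothetical host helper: runs the kernel on n copies of 2.0f and reports
// whether every result equals 1.0f (both values are exact in float).
bool halve_check(int n)
{
    std::vector<float> h(n, 2.0f);
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(d, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    halve<<<blocks, threads>>>(d, n);

    // The blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h.data(), d, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return false;

    for (float v : h) {
        if (v != 1.0f) return false;
    }
    return true;
}

int main()
{
    printf("%s\n", halve_check(1024) ? "ok" : "failed");
    return 0;
}
```

Whether the unsuffixed form costs measurable time depends on the target architecture and on what the compiler can fold away, so the suffix is best treated as a cheap habit rather than a guaranteed speedup.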

For single-precision code, use of the float type and the single-precision math functions is highly recommended. When compiling for devices without native double-precision support, such as devices of compute capability 1.2 and earlier, each double-precision floating-point variable is converted to single-precision floating-point format (but retains its size of 64 bits) and double-precision arithmetic is demoted to single-precision arithmetic.

Note also that the CUDA math library’s complementary error function, erfcf(), is particularly fast while retaining full single-precision accuracy.
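One place erfcf() might be used is the upper-tail probability of the standard normal distribution, Q(x) = 0.5 * erfc(x / sqrt(2)). The sketch below keeps the whole computation in single precision. The kernel name normal_tail, the helper normal_tail_check, and the 1e-6 tolerance are assumptions made for this example, not part of the CUDA math library.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: Q(x) = 0.5 * erfc(x / sqrt(2)),
// evaluated with float constants and the single-precision erfcf().
__global__ void normal_tail(const float *x, float *q, int n)
{
    const float inv_sqrt2 = 0.70710678f;  // 1/sqrt(2) as a float constant
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        q[i] = 0.5f * erfcf(x[i] * inv_sqrt2);
    }
}

// Hypothetical host helper: compares the device results for a few inputs
// against a double-precision host reference, within an assumed tolerance.
bool normal_tail_check()
{
    const int n = 3;
    const float h_x[n] = {0.0f, 1.0f, 2.0f};
    float h_q[n] = {0.0f, 0.0f, 0.0f};

    float *d_x = nullptr;
    float *d_q = nullptr;
    if (cudaMalloc(&d_x, sizeof(h_x)) != cudaSuccess) return false;
    if (cudaMalloc(&d_q, sizeof(h_q)) != cudaSuccess) {
        cudaFree(d_x);
        return false;
    }
    cudaMemcpy(d_x, h_x, sizeof(h_x), cudaMemcpyHostToDevice);

    normal_tail<<<1, 32>>>(d_x, d_q, n);

    // The blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_q, d_q, sizeof(h_q), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_q);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        double ref = 0.5 * std::erfc(h_x[i] / std::sqrt(2.0));
        if (std::fabs(h_q[i] - ref) > 1e-6) return false;
    }
    return true;
}

int main()
{
    printf("%s\n", normal_tail_check() ? "ok" : "failed");
    return 0;
}
```

The reference value is computed in double precision on the host only to check the result; the device code itself never touches a double.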