Using AltiVec in OpenMCL LAP functions

Overview

It's now possible to use AltiVec instructions in PPC LAP (assembler) functions. The lisp kernel detects the presence or absence of AltiVec and preserves AltiVec state on lisp thread switch and in response to exceptions, but the implementation doesn't otherwise use vector operations.

This document doesn't cover PPC LAP programming in general. Ideally, there would be some other document that did.

This document does explain AltiVec register-usage conventions in OpenMCL and explains the use of some lap macros that help to enforce those conventions.

All of the global symbols described below are exported from the CCL package. Note that lap macro names, ppc instruction names, and (in most cases) register names are treated as strings, so this only applies to functions and global variable names.

Much of the OpenMCL support for AltiVec LAP programming is based on work contributed to MCL by Shannon Spires.

Functional Reference

*altivec-available* [Special Variable]

Description This variable is initialized each time an OpenMCL session starts, based on information provided by the lisp kernel. Its value is true if AltiVec is present and false otherwise. This variable shouldn't be set by user code.

altivec-available-p [Function]

Syntax
altivec-available-p
Description Returns non-NIL if AltiVec is available.

*altivec-lapmacros-maintain-vrsave-p* [Variable]

Description Intended to control the expansion of certain lap macros. Initialized to NIL on LinuxPPC; should be initialized to T on platforms (such as MacOS X/Darwin) that require that the VRSAVE SPR contain a bitmask of active vector registers at all times.

with-altivec-registers [Lap Macro]

Syntax
with-altivec-registers reglist &body body
Description Specifies the set of AltiVec registers used in body. If *altivec-lapmacros-maintain-vrsave-p* is true when the macro is expanded, generates code that saves the VRSAVE SPR and updates VRSAVE to include a bitmask generated from the specified register list. Generates code which saves any non-volatile vector registers which appear in the register list, executes body, and restores the saved non-volatile vector registers (and, if *altivec-lapmacros-maintain-vrsave-p* is true, restores VRSAVE as well). Uses the IMM0 register (r3) as a temporary.
Arguments
reglist
A list of vector register names (vr0 .. vr31).
body
A sequence of PPC lap instructions.
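
As an illustration of the bitmask mentioned in the description, here is a minimal sketch in portable Common Lisp. It is not the macro's actual expansion code. It assumes the usual PowerPC convention that register vrN corresponds to bit N of VRSAVE, counting from the most significant bit of the 32-bit register. The names vr-number and vrsave-mask are hypothetical and are not part of OpenMCL.

```lisp
;; Hypothetical helpers, for illustration only; not part of OpenMCL.

(defun vr-number (reg)
  "Return the register number (0..31) named by REG, e.g. VR27 or \"vr27\"."
  (let ((name (string reg)))
    ;; The name must look like "vr" followed by at least one digit.
    (unless (and (> (length name) 2)
                 (string-equal "vr" name :end2 2))
      (error "Not a vector register name: ~s" reg))
    (let ((n (parse-integer name :start 2)))
      (unless (<= 0 n 31)
        (error "Vector register number out of range: ~s" reg))
      n)))

(defun vrsave-mask (reglist)
  "Return a 32-bit mask with one bit set per register in REGLIST.
Assumes vrN maps to bit N counted from the most significant bit."
  (reduce #'logior reglist
          :key (lambda (reg) (ash 1 (- 31 (vr-number reg))))
          :initial-value 0))
```

Under that assumption, (vrsave-mask '(vr1 vr2 vr3 vr27)) evaluates to #x70000010.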

with-vector-buffer [Lap Macro]

Syntax
with-vector-buffer base n &body body
Description Generates code which allocates a 16-byte aligned buffer large enough to contain N vector registers; the GPR base points to the lowest address of this buffer. After processing body, the buffer will be deallocated. The body should preserve the value of base as long as it needs to reference the buffer. It's intended that base be used as a base register in stvx and lvx instructions within the body.
Arguments
base
Any available general-purpose register.
n
An integer between 1 and 254, inclusive. (Should typically be much, much closer to 1.) Specifies the size of the buffer, in 16-byte units.
body
A sequence of PPC lap instructions.
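
A minimal sketch of how the buffer might be addressed. It assumes that imm1 is free to serve as base in this context and that the body may use imm0 as an index register, as the load-array example later in this document does. The function name spill-and-reload is hypothetical, and the sketch has not been checked against a running OpenMCL.

```lisp
;; Sketch only: spill two vector registers to a buffer and reload them.
;; Assumes imm1 is available as the base register (an assumption).
(defppclapfunction spill-and-reload ()
  (with-altivec-registers (vr14 vr15)   ; clobbers imm0
    (with-vector-buffer imm1 2          ; room for 2 vector registers (32 bytes)
      (li imm0 0)
      (stvx vr14 imm1 imm0)             ; store vr14 at buffer offset 0
      (li imm0 16)
      (stvx vr15 imm1 imm0)             ; store vr15 at buffer offset 16
      ;; ... code that may overwrite vr14 and vr15 but preserves imm1 ...
      (li imm0 0)
      (lvx vr14 imm1 imm0)              ; reload vr14 from offset 0
      (li imm0 16)
      (lvx vr15 imm1 imm0)))            ; reload vr15 from offset 16
  (blr))
```

Because the buffer is 16-byte aligned and each slot is 16 bytes, slot k sits at offset 16*k from base, so lvx and stvx never need alignment fixups here.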

Register usage conventions

OpenMCL LAP functions that use AltiVec instructions must interoperate with each other and with C functions; that suggests that they follow C AltiVec register usage conventions. (vr0-vr1 scratch, vr2-vr13 parameters/return value, vr14-vr19 temporaries, vr20-vr31 callee-save non-volatile registers.)
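
The convention in parentheses can be restated as a small lookup. This is a sketch in portable Common Lisp for illustration only; the function names vr-role and must-save-p and the keyword names are hypothetical and not part of OpenMCL.

```lisp
;; Hypothetical helpers restating the register-usage convention above.

(defun vr-role (n)
  "Return the conventional role of vector register number N (0..31)."
  (check-type n (integer 0 31))
  (cond ((<= n 1)  :scratch)              ; vr0-vr1
        ((<= n 13) :parameter-or-return)  ; vr2-vr13
        ((<= n 19) :temporary)            ; vr14-vr19
        (t         :non-volatile)))       ; vr20-vr31, callee-save

(defun must-save-p (n)
  "True if a function must save vector register N before assigning to it."
  (eq (vr-role n) :non-volatile))
```

For example, (vr-role 27) returns :NON-VOLATILE, so a function that assigns to vr27 has to save and restore it.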

The EABI (Embedded Application Binary Interface) used in LinuxPPC doesn't ascribe particular significance to the vrsave special-purpose register; on other platforms (notably MacOS), it's used as a bitmap which indicates to system-level code which vector registers contain meaningful values.

The WITH-ALTIVEC-REGISTERS lapmacro generates code which saves, updates, and restores VRSAVE on platforms where this is required (as indicated by the value of the special variable which controls this) and ignores VRSAVE on platforms that don't require it to be maintained.

On all PPC platforms, it's necessary to save any non-volatile vector registers (vr20 .. vr31) before assigning to them and to restore such registers before returning to the caller.

On platforms that require that VRSAVE be maintained, it's not necessary to mention the "use" of vector registers that are used as incoming parameters. It's not incorrect to mention their use in a WITH-ALTIVEC-REGISTERS form, but it may be unnecessary in many interesting cases. One can likewise assume that the caller of any function that returns a vector value (in vr2) has already set the appropriate bit in VRSAVE to indicate that this register is live. One could therefore write a leaf function that added the bytes in vr3 and vr2 and returned the result in vr2 as:

(defppclapfunction vaddubs ((y vr3) (z vr2))
  (vaddubs z y z)
  (blr))
    

When vector registers that aren't incoming parameters are used in a LAP function, WITH-ALTIVEC-REGISTERS takes care of maintaining VRSAVE and of saving/restoring any non-volatile vector registers:

(defppclapfunction load-array ((n arg_z))
  (check-nargs 1)
  (with-altivec-registers (vr1 vr2 vr3 vr27) ; Clobbers imm0
    (li imm0 arch::misc-data-offset)
    (lvx vr1 arg_z imm0)                ; load MSQ
    (lvsl vr27 arg_z imm0)              ; set the permute vector
    (addi imm0 imm0 16)                 ; address of LSQ
    (lvx vr2 arg_z imm0)                ; load LSQ
    (vperm vr3 vr1 vr2 vr27)            ; aligned result appears in VR3
    (dbg t))                            ; Look at result in some debugger
  (blr))
    

AltiVec registers are not preserved by CATCH and UNWIND-PROTECT. Since AltiVec is only accessible from LAP in OpenMCL and since LAP functions rarely use high-level control structures, this should rarely be a problem in practice.

LAP functions which use non-volatile vector registers and which call other code (presumably Lisp code) that may use CATCH or UNWIND-PROTECT should save those vector registers before such a call and restore them on return. This is one of the intended uses of the WITH-VECTOR-BUFFER lap macro.


Last modified: Tue Jan 22 05:50:25 PST 2002