It's now possible to use AltiVec instructions in PPC LAP (assembler) functions. The lisp kernel detects the presence or absence of AltiVec and preserves AltiVec state on lisp thread switch and in response to exceptions, but the implementation doesn't otherwise use vector operations.
This document doesn't document PPC LAP programming in general. Ideally, there would be some document that did.
This document does explain AltiVec register-usage conventions in OpenMCL and the use of some lap macros that help to enforce those conventions.
All of the global symbols described below are exported from the CCL package. Note that lap macro names, ppc instruction names, and (in most cases) register names are treated as strings, so this only applies to functions and global variable names.
Much of the OpenMCL support for AltiVec LAP programming is based on work contributed to MCL by Shannon Spires.
Description | This variable is initialized each time an OpenMCL session starts, based on information provided by the lisp kernel. Its value is true if AltiVec is present and false otherwise. This variable shouldn't be set by user code. |
Syntax | altivec-available-p |
Description | Returns non-NIL if AltiVec is available. |
Description | Intended to control the expansion of certain lap macros. Initialized to NIL on LinuxPPC; should be initialized to T on platforms (such as MacOS X/Darwin) that require that the VRSAVE SPR contain a bitmask of active vector registers at all times. |
Syntax | with-altivec-registers reglist &body body |
Description | Specifies the set of AltiVec registers used in body. If *altivec-lapmacros-maintain-vrsave-p* is true when the macro is expanded, generates code which saves the VRSAVE SPR and updates VRSAVE to include a bitmask generated from the specified register list. Generates code which saves any non-volatile vector registers which appear in the register list, executes body, and restores the saved non-volatile vector registers (and, if *altivec-lapmacros-maintain-vrsave-p* is true, restores VRSAVE as well). Uses the IMM0 register (r3) as a temporary. |
Arguments |
|
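As an illustration of the "bitmask generated from the specified register list", the following Common Lisp sketch computes such a mask. It is not the macro's actual expansion: VR-NUMBER and VRSAVE-MASK are hypothetical helper names, and the sketch assumes the usual AltiVec convention that vr0 corresponds to the most significant bit of the 32-bit VRSAVE value and vr31 to the least significant.

```lisp
;; Sketch only: VR-NUMBER and VRSAVE-MASK are hypothetical helpers,
;; not part of OpenMCL.
(defun vr-number (name)
  "Return the register number of a symbol or string such as VR27."
  (let ((s (string-upcase (string name))))
    (assert (and (> (length s) 2) (string= "VR" s :end2 2)))
    (let ((n (parse-integer s :start 2)))
      (assert (<= 0 n 31))
      n)))

(defun vrsave-mask (reglist)
  "OR together one bit per register; vr0 is assumed to map to bit #x80000000."
  (reduce #'logior reglist
          :key (lambda (r) (ash 1 (- 31 (vr-number r))))
          :initial-value 0))

;; (vrsave-mask '(vr1 vr2 vr3 vr27))  ; => #x70000010
```

Because the macro is described as updating VRSAVE to include the mask, the new value would be the old value combined with the mask, as in (logior old-vrsave (vrsave-mask reglist)), rather than the mask alone.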
Syntax | with-vector-buffer base n &body body |
Description | Generates code which allocates a 16-byte aligned buffer large enough to contain n vector registers; the GPR base points to the lowest address of this buffer. After processing body, the buffer will be deallocated. The body should preserve the value of base as long as it needs to reference the buffer. It's intended that base be used as a base register in stvx and lvx instructions within the body. |
Arguments |
|
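The size and alignment arithmetic implied by the description can be sketched in Common Lisp. This only models the address calculation; it does not show how the macro actually allocates the buffer, and ALIGN-UP-16, VECTOR-BUFFER-BYTES and VECTOR-SLOT-OFFSET are hypothetical names introduced for illustration.

```lisp
;; Sketch only: these helpers are hypothetical and model the arithmetic,
;; not the code that WITH-VECTOR-BUFFER generates.
(defun align-up-16 (address)
  "Round ADDRESS up to the next multiple of 16."
  (logandc2 (+ address 15) 15))

(defun vector-buffer-bytes (n)
  "Bytes needed to hold N 128-bit vector registers."
  (* 16 n))

(defun vector-slot-offset (i)
  "Byte offset of the Ith vector slot from the buffer's base address."
  (* 16 i))

;; (align-up-16 #x1004)     ; => #x1010
;; (vector-buffer-bytes 3)  ; => 48
```

Alignment matters because lvx and stvx ignore the low four bits of the effective address, so a vector stored through an unaligned base would land at the enclosing 16-byte boundary instead.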
OpenMCL LAP functions that use AltiVec instructions must interoperate with each other and with C functions; that suggests that they follow C AltiVec register usage conventions. (vr0-vr1 scratch, vr2-vr13 parameters/return value, vr14-vr19 temporaries, vr20-vr31 callee-save non-volatile registers.)
The EABI (Embedded Application Binary Interface) used in LinuxPPC
doesn't ascribe particular significance to the vrsave
special-purpose register; on other platforms (notably MacOS), it's
used as a bitmap which indicates to system-level code which vector
registers contain meaningful values.
The WITH-ALTIVEC-REGISTERS lapmacro generates code which saves, updates, and restores VRSAVE on platforms where this is required (as indicated by the value of *altivec-lapmacros-maintain-vrsave-p*) and ignores VRSAVE on platforms that don't require it to be maintained.
On all PPC platforms, it's necessary to save any non-volatile vector registers (vr20 .. vr31) before assigning to them and to restore such registers before returning to the caller.
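Which registers in a given list fall under this rule can be worked out mechanically. The Common Lisp sketch below picks the non-volatile registers out of a register list, using the vr20 .. vr31 range stated above; VR-INDEX, NON-VOLATILE-VR-P and REGISTERS-TO-SAVE are hypothetical names, and the sketch assumes register names of the form vrN.

```lisp
;; Sketch only: hypothetical helpers, assuming names of the form VRn.
(defun vr-index (name)
  "Return the number N from a register name such as VR27."
  (parse-integer (string name) :start 2))

(defun non-volatile-vr-p (name)
  "True if NAME is one of the callee-save registers vr20 .. vr31."
  (<= 20 (vr-index name) 31))

(defun registers-to-save (reglist)
  "Return the registers in REGLIST that must be saved and restored."
  (remove-if-not #'non-volatile-vr-p reglist))

;; (registers-to-save '(vr1 vr2 vr3 vr27))  ; => (VR27)
```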
On platforms that require that VRSAVE be maintained, it's not necessary to mention the "use" of vector registers that are used as incoming parameters. It's not incorrect to mention their use in a WITH-ALTIVEC-REGISTERS form, but it may be unnecessary in many interesting cases. One can likewise assume that the caller of any function that returns a vector value (in vr2) has already set the appropriate bit in VRSAVE to indicate that this register is live.
One could therefore write a leaf function that added the bytes in vr3 and vr2 and returned the result in vr2 as:

    (defppclapfunction vaddubs ((y vr3) (z vr2))
      (vaddubs z y z)
      (blr))
When vector registers that aren't incoming parameters are used in a LAP function, WITH-ALTIVEC-REGISTERS takes care of maintaining VRSAVE and of saving/restoring any non-volatile vector registers:

    (defppclapfunction load-array ((n arg_z))
      (check-nargs 1)
      (with-altivec-registers (vr1 vr2 vr3 vr27) ; Clobbers imm0
        (li imm0 arch::misc-data-offset)
        (lvx vr1 arg_z imm0)      ; load MSQ
        (lvsl vr27 arg_z imm0)    ; set the permute vector
        (addi imm0 imm0 16)       ; address of LSQ
        (lvx vr2 arg_z imm0)      ; load LSQ
        (vperm vr3 vr1 vr2 vr27)  ; aligned result appears in VR3
        (dbg t))                  ; Look at result in some debugger
      (blr))
AltiVec registers are not preserved by CATCH and UNWIND-PROTECT. Since AltiVec is only accessible from LAP in OpenMCL and since LAP functions rarely use high-level control structures, this should rarely be a problem in practice.
LAP functions which use non-volatile vector registers and which call (Lisp?) code which may use CATCH or UNWIND-PROTECT should save those vector registers before such a call and restore them on return. This is one of the intended uses of the WITH-VECTOR-BUFFER lap macro.