Debugging facilities in the lisp kernel

In a perfect world, something like this couldn't happen:

  Welcome to OpenMCL Version x.y!
  ? (defun foo (x) 
      (declare (cons x)) 
      (cdr x))
  FOO
  ? (foo -1)  ;Oops.  Too late ...
  Unhandled exception 11 at 0x300e90c8, context->regs at #x7ffff6b8
  Continue/Debugger/eXit <enter>?

As you may have noticed, it's not a perfect world. In this case the cause (attempting to reference the CDR of -1, and therefore accessing unmapped memory near location 0) of the effect (an "Unhandled exception ..." message) is obvious; it rarely is.

The addresses printed in the message above aren't very useful unless you're debugging the kernel with GDB (and they're often very useful if you are).
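
For comparison, here's a minimal sketch of how the same mistake can be caught before it ever reaches the kernel. It uses only standard Common Lisp (CHECK-TYPE and HANDLER-CASE); the name CHECKED-FOO is made up for this illustration. Unlike a type declaration, which the compiler is free to trust, CHECK-TYPE tests its argument at run time, so a bad argument is reported through the ordinary lisp error system:

   (defun checked-foo (x)
     ;; CHECK-TYPE signals a correctable TYPE-ERROR if X isn't a cons,
     ;; whatever the compiler's safety settings are.
     (check-type x cons)
     (cdr x))

   ;; The bad call now signals a lisp TYPE-ERROR, which can be handled
   ;; like any other condition; this form returns the offending datum.
   (handler-case (checked-foo -1)
     (type-error (c) (type-error-datum c)))

The cost is a run-time check on every call, which is presumably what the declaration in FOO was meant to avoid.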

Aside from causing an exception that the lisp kernel doesn't know how to handle, one can also enter the kernel debugger (more) deliberately:

   ? (defun classify (n)
       (cond ((> n 0) "Greater")
             ((< n 0) "Less")
             (t
               ;; Sheesh !  What else could it be ?
               (ccl::bug "I give up.  How could this happen ?"))))
   CLASSIFY
   ? (classify 0)
   Bug in MCL-PPC system code:           ; Hmm.  This message should be changed.
   I give up.  How could this happen ?
   Continue/Debugger/eXit <enter>?

CCL::BUG isn't quite the right tool for this example (a call to BREAK or PRINT might do a better job of clearing up the mystery), but it's sometimes helpful when those other tools can't be used. (The lisp error system notices, for instance, if attempts to signal errors themselves cause errors to be signaled; this sort of thing can happen if CLOS or the I/O system is broken or missing. After some small number of recursive errors, the error system gives up and calls CCL::BUG.)
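
The "give up after a few recursive errors" idea can be sketched in portable Common Lisp. This is only an illustration of the general technique, not OpenMCL's actual error system: the names *ERROR-DEPTH*, *MAX-ERROR-DEPTH*, and REPORT-ERROR-SAFELY are invented here, the limit of 3 is arbitrary, and the LAST-RESORT argument merely stands in for CCL::BUG.

   (defvar *error-depth* 0
     "Number of error reports currently in progress (hypothetical).")

   (defparameter *max-error-depth* 3
     "Nesting limit for this sketch; the real limit isn't documented here.")

   (defun report-error-safely (reporter last-resort)
     ;; REPORTER is a function of no arguments that does the normal error
     ;; reporting.  LAST-RESORT is a function of one argument (a message)
     ;; that plays the role of CCL::BUG.
     (let ((*error-depth* (1+ *error-depth*)))
       (if (> *error-depth* *max-error-depth*)
           ;; Reporting keeps failing; stop trying and bail out.
           (funcall last-resort "Too many recursive errors.")
           (handler-case (funcall reporter)
             ;; Reporting itself signaled an error; try again one level
             ;; deeper, so that the depth check above eventually fires.
             (error () (report-error-safely reporter last-resort))))))

With a REPORTER that always fails, as in (report-error-safely (lambda () (error "broken")) #'print), the reporter is attempted three times before LAST-RESORT is called with the message.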

If one chooses Debugger (by entering a D) at the prompt above, one will see a prompt like:

   (R)egisters, (L)isp registers (F)loating-point, (B)acktrace (K)ill, e(X)it ?

If we got to this point via a call to CCL::BUG, the R, L, and F options will have no effect (and probably shouldn't even be displayed): CCL::BUG just does an FF-CALL into the lisp kernel, so there's no saved machine state to display. When we get here via an unhandled exception, the OS kernel saves the machine state ("context") in a data structure for us, and the R, L, and F options can be used to display the contents of the registers at the point of the exception. Another function, CCL::DBG, causes a special exception to be generated and enters the lisp kernel debugger with a non-null "context":

   ? (defun classify2 (n)
       (cond ((> n 0) "Greater")
             ((< n 0) "Less")
             (t (ccl::dbg n))))
   CLASSIFY2
   ? (classify2 0)
   Lisp Breakpoint

   (R)egisters, (L)isp registers (F)loating-point, (B)acktrace (K)ill, e(X)it ?

CCL::DBG takes an argument, whose value is copied into the register that OpenMCL uses to return a function's primary value (arg_z, which is r23 on the PowerPC). If we were to choose the (L) option at this point, we'd see a display like:

   rnil = 0x01836015 
   nargs = 0
   r16 (fn) = #<Function CLASSIFY2 #x30379386>
   r23 (arg_z) = 0
   r22 (arg_y) = 0
   r21 (arg_x) = 0
   r20 (temp0) = #<26-element vector subtag = 2F @#x303793ee>
   r19 (temp1/next_method_context) = 6393788
   r18 (temp2/nfn) = #<Function CLASSIFY2 #x30379386>
   r17 (temp3/fname) = CLASSIFY2
   r31 (save0) = 0
   r30 (save1) = *TERMINAL-IO*
   r29 (save2) = 0
   r28 (save3) = (#<RESTART @#x01867f2e> #<RESTART @#x01867f56>)
   r27 (save4) = ()
   r26 (save5) = ()
   r25 (save6) = ()
   r24 (save7) = ()

From this we can conclude that the problematic argument to CLASSIFY2 was 0 (see r23/arg_z), and that I need to work on a better example.

The R option shows the values of the ALU (and PPC branch unit) registers in hex; the F option shows the values of the FPU registers.

The (B) option shows a raw stack backtrace; it'll try to identify foreign functions as well as lisp functions. (Foreign function names are guesses based on the nearest preceding exported symbol.)

If you ever unexpectedly find yourself in the "lisp kernel debugger", the output of the (L) and (B) options is often the most helpful thing to include in a bug report.