Releases 0.10 and later of OpenMCL use a different memory management scheme than previous versions did. Those earlier versions would allocate a block of memory (of specified size) at startup and would allocate lisp objects within that block. When that block filled with live (non-GCed) objects, the lisp would signal a "heap full" condition. The heap size also imposed a limit on the size of the largest object that could be allocated.
The new strategy involves reserving a very large (1GB, by default) block at startup and consuming (and relinquishing) its contents as the size of the live lisp heap data grows and shrinks. After the initial heap image loads and after each full GC, the lisp kernel will try to ensure that a specified amount (the "lisp-heap-gc-threshold") of free memory is available. The initial value of this kernel variable is 4MB; it can be manipulated from Lisp (see below).
The large reserved memory block consumes very little in the way of system resources; memory that's actually committed to the lisp heap (live data and the "threshold" area where allocation takes place) consumes finite resources (physical memory and swap space). The lisp's consumption of those resources is proportional to its actual memory usage, which is generally a good thing.
This scheme is much more flexible than the old one, but it may also increase the possibility that those resources can become exhausted. Neither the new scheme nor the old handles that situation gracefully; under the old scheme, a program that consumes lots of memory may have run into an artificial limit on heap size before exhausting virtual memory.
The -R or --heap-reserve command-line option can be used to limit the size of the reserved block and therefore bound heap expansion. Running
> openmcl --heap-reserve 8M
would provide an execution environment that's very similar to that provided by earlier OpenMCL versions.
For many programs, the following observations are true to a very large degree:
Most heap-allocated objects have very short lifetimes (“are ephemeral”): they become inaccessible soon after they're created.
Most non-ephemeral objects have very long lifetimes: it's rarely productive for the GC to consider reclaiming them, since it's rarely able to do so. (An object that's survived a large number of GCs is likely to survive the next one. That's not always true of course, but it's a reasonable heuristic.)
It's relatively rare for an old object to be destructively modified (via SETF) so that it points to a new one; therefore, most references to newly-created objects can be found in the stacks and registers of active threads. It's not generally necessary to scan the entire heap to find references to new objects (or to prove that such references don't exist), though it is necessary to keep track of the (hopefully exceptional) cases where old objects are modified to point at new ones.
“Ephemeral” (or “generational”) garbage collectors try to exploit these observations: by concentrating on frequently reclaiming newly-created objects quickly, it's less often necessary to do more expensive GCs of the entire heap in order to reclaim unreferenced memory. In some environments, the pauses associated with such full GCs can be noticeable and disruptive, and minimizing the frequency (and sometimes the duration) of these pauses is probably the EGC's primary goal (though there may be other benefits, such as increased locality of reference and better paging behavior). The EGC generally leads to slightly longer execution times (and slightly higher, amortized GC time), but there are cases where it can improve overall performance as well; the nature and degree of its impact on performance is highly application-dependent.
Most EGC strategies (including the one employed by OpenMCL) logically or physically divide memory into one or more areas of relatively young objects (“generations”) and one or more areas of old objects. Objects that have survived one or more GCs as members of a young generation are promoted (or “tenured”) into an older generation, where they may or may not survive long enough to be promoted to the next generation and eventually may become “old” objects that can only be reclaimed if a full GC proves that there are no live references to them. This filtering process isn't perfect - a certain amount of premature tenuring may take place - but it usually works very well in practice.
It's important to note that although a GC of the youngest generation is typically very fast (perhaps a few milliseconds on a modern CPU, depending on various factors), OpenMCL's EGC is not concurrent and doesn't offer realtime guarantees.
OpenMCL's EGC maintains three ephemeral generations [1]; all newly created objects are created as members of the youngest generation. Each generation has an associated threshold, which indicates the number of bytes in it and all younger generations that can be allocated before a GC is triggered. These GCs will involve the target generation and all younger ones (and may therefore cause some premature tenuring); since the older generations have larger thresholds, they're GCed less frequently and most short-lived objects that make it into an older generation tend not to survive there very long.
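The threshold rule described above can be illustrated with a toy model. This is only a sketch of one reading of the rule, not OpenMCL's kernel logic; the function name and its list-based representation of the generations are hypothetical.

```lisp
;;; Toy model only: this is not OpenMCL's kernel code.
;;; ALLOCATED lists the bytes currently held by each generation, youngest first.
;;; THRESHOLDS lists the per-generation thresholds in bytes, youngest first.
(defun generation-to-collect (allocated thresholds)
  "Return the index of the oldest generation whose threshold is met by the
bytes in it and all younger generations, or NIL if no GC is due."
  (let ((cumulative 0)
        (target nil))
    (loop for bytes in allocated
          for threshold in thresholds
          for index from 0
          do (incf cumulative bytes)
             ;; A generation is due when it plus everything younger
             ;; has reached its threshold; keep the oldest such one.
             (when (>= cumulative threshold)
               (setf target index)))
    target))
```

In this model, (generation-to-collect '(250 200 0) '(200 400 800)) selects generation 1, so a collection of that generation would also cover generation 0.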
The EGC can be enabled or disabled under program control; under some circumstances, it may be enabled but inactive (because a full GC is imminent.) Since it may be hard to know or predict the consing behavior of other threads, the distinction between the “active” and “inactive” state isn't very meaningful, especially when native threads are involved.
[This is a new feature in OpenMCL, as of version 0.14-031018.]
After a full GC finishes, it'll try to ensure that at least (LISP-HEAP-GC-THRESHOLD) bytes of virtual memory are available; objects will be allocated in this block of memory until it fills up, the GC is triggered, and the process repeats itself.
Many programs reach near stasis in terms of the amount of logical memory that's in use after full GC (or run for long periods of time in a nearly static state), so the logical address range used for consing after the Nth full GC is likely to be nearly or entirely identical to the address range used after the (N+1)th full GC.
By default (and traditionally in OpenMCL), the GC's policy is to “release” the pages in this address range: to advise the virtual memory system that the pages contain garbage and any physical pages associated with them don't need to be swapped out to disk before being reused and to (re-)map the logical address range so that the pages will be zero-filled by the virtual memory system when they're next accessed. This policy is intended to reduce the load on the VM system and keep OpenMCL's working set to a minimum.
For some programs (especially those that cons at a very high rate), the default policy may be less than ideal: releasing pages that're going to be needed almost immediately - and zero-fill-faulting them back in, lazily - incurs unnecessary overhead. (There's a false economy associated with minimizing the size of the working set if it's just going to shoot back up again until the next GC.) A policy of “retaining” pages between GCs might work better in such an environment.
Functions described below give the user some control over this behavior. An adaptive, feedback-mediated approach might yield a better solution.
lisp-heap-gc-threshold
Returns the value of the kernel variable that specifies the amount of free space to leave in the heap after full GC.
set-lisp-heap-gc-threshold new-threshold
Sets the value of the kernel variable that specifies the amount of free space to leave in the heap after full GC to new-threshold, which should be a non-negative fixnum. Returns the value of that kernel variable (which may be somewhat larger than what was specified).
new-threshold: the requested new lisp-heap-gc-threshold.
use-lisp-heap-gc-threshold
Tries to grow or shrink lisp's heap space, so that the free space is (approximately) equal to the current heap threshold. Returns NIL.
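A minimal sketch of how these three functions might be combined follows. It assumes that the functions are exported from the CCL package, that the setter is named SET-LISP-HEAP-GC-THRESHOLD, and that the threshold is expressed in bytes. The helper names MEGABYTES and RAISE-HEAP-THRESHOLD are hypothetical.

```lisp
;;; Assumes OpenMCL, where these functions are taken to live in the CCL package.
;;; The threshold unit is assumed to be bytes.
(defun megabytes (n)
  "Convert N megabytes to bytes."
  (* n 1024 1024))

#+(or ccl openmcl)
(defun raise-heap-threshold (new-bytes)
  "Request a post-GC free-space threshold of NEW-BYTES and apply it at once.
Returns the threshold the kernel actually adopted."
  ;; The kernel may round the request up, so keep the value it reports.
  (let ((actual (ccl:set-lisp-heap-gc-threshold new-bytes)))
    ;; Resize the heap now instead of waiting for the next full GC.
    (ccl:use-lisp-heap-gc-threshold)
    actual))
```

A program that conses heavily at startup might call (raise-heap-threshold (megabytes 16)) once, early on; whether that helps is workload-dependent.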
egc arg
Enables the EGC if arg is non-nil, disables the EGC otherwise. Returns the previous enabled status. Although this function is thread-safe (in the sense that calls to it are serialized), it doesn't make a whole lot of sense to be turning the EGC on and off from multiple threads ...
arg: a generalized boolean
egc-enabled-p
Returns T if the EGC was enabled at the time of the call, NIL otherwise.
egc-active-p
Returns T if the EGC was active at the time of the call, NIL otherwise. Since this is generally a volatile piece of information, it's not clear whether this function serves a useful purpose when native threads are involved.
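Because EGC returns the previous enabled status, that value can be used to restore the earlier state after a temporary change. The sketch below shows one way to do that; CALL-WITH-SETTING and CALL-WITHOUT-EGC are hypothetical helpers, and the CCL package prefix is an assumption.

```lisp
(defun call-with-setting (setter temporary thunk)
  "Call SETTER with TEMPORARY, run THUNK, then hand SETTER back the value
its first call returned. SETTER must return the previous setting, as EGC
is documented to do."
  (let ((previous (funcall setter temporary)))
    (unwind-protect (funcall thunk)
      ;; Runs on normal return and on non-local exit alike.
      (funcall setter previous))))

#+(or ccl openmcl)
(defun call-without-egc (thunk)
  "Run THUNK with the EGC disabled, then restore its earlier enabled status."
  (call-with-setting #'ccl:egc nil thunk))
```

As the text notes, toggling the EGC from several threads at once makes little sense, so a helper like this is best confined to a single controlling thread.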
egc-configuration
Returns, as multiple values, the sizes in kilobytes of the thresholds associated with the youngest ephemeral generation, the middle ephemeral generation, and the oldest ephemeral generation.
configure-egc generation-0-size generation-1-size generation-2-size
If the EGC is currently disabled, puts the indicated threshold sizes in effect and returns T; otherwise, returns NIL. (The provided threshold sizes are rounded up to a multiple of 64 kilobytes in OpenMCL 0.14 and to a multiple of 32 kilobytes in earlier versions.)
generation-0-size: the requested threshold size of the youngest generation, in kilobytes
generation-1-size: the requested threshold size of the middle generation, in kilobytes
generation-2-size: the requested threshold size of the oldest generation, in kilobytes
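Since CONFIGURE-EGC only takes effect while the EGC is disabled, a caller typically has to disable it first. The following sketch wraps that sequence; RECONFIGURE-EGC and ROUND-UP-TO-64K are hypothetical names, the CCL package prefix is an assumption, and the rounding helper merely mirrors the 0.14 rounding rule stated above so that a caller can predict the result.

```lisp
(defun round-up-to-64k (kilobytes)
  "Round KILOBYTES up to the next multiple of 64, mirroring the rounding
described for OpenMCL 0.14."
  (* 64 (ceiling kilobytes 64)))

#+(or ccl openmcl)
(defun reconfigure-egc (gen0-kb gen1-kb gen2-kb)
  "Disable the EGC, install new thresholds (in kilobytes), and restore the
EGC's earlier enabled status. Returns the thresholds now in effect as a list."
  (let ((was-enabled (ccl:egc nil)))   ; CONFIGURE-EGC needs the EGC disabled
    (unwind-protect
         (ccl:configure-egc gen0-kb gen1-kb gen2-kb)
      (ccl:egc was-enabled))
    ;; Report what the kernel actually adopted after rounding.
    (multiple-value-list (ccl:egc-configuration))))
```

For example, a request of 100 kilobytes would be expected to come back as 128 under the 0.14 rule, which is what (round-up-to-64k 100) computes.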
gc-retain-pages arg
Tries to influence the GC to retain/recycle the pages allocated between GCs if arg is true, and to release them otherwise. This is generally a tradeoff between paging and other VM considerations.
arg: a generalized boolean
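One way to experiment with the retain policy is to switch it on around a phase that conses at a high rate and switch back to releasing pages afterwards. This is a sketch under assumptions: the CCL package prefix is assumed, and CALL-RETAINING-PAGES and CONS-HEAVILY are hypothetical names, the latter being a stand-in workload.

```lisp
(defun cons-heavily (iterations)
  "Allocate and drop ITERATIONS short-lived lists; return how many were built."
  (let ((count 0))
    (dotimes (i iterations count)
      (when (make-list 100 :initial-element i)
        (incf count)))))

#+(or ccl openmcl)
(defun call-retaining-pages (thunk)
  "Run THUNK while asking the GC to retain pages between GCs, then go back
to releasing them, which the text describes as the default policy."
  (ccl:gc-retain-pages t)
  (unwind-protect (funcall thunk)
    (ccl:gc-retain-pages nil)))
```

Under OpenMCL, something like (call-retaining-pages (lambda () (cons-heavily 1000000))) could be timed against a plain call to see whether retaining pages pays off for a given workload.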
LinuxPPC imposes a 2GB limit on the address space of a process. It's hard to find more than 1GB of free, contiguous memory at startup under either LinuxPPC or DarwinPPC, so the 1GB figure that OpenMCL reserves seems to be as close to "infinity" as is practical.
The changes in heap allocation strategy don't affect the limits (ARRAY-DIMENSION-LIMIT, etc.) on the maximum size of an allocatable lisp object. Those limits are imposed by OpenMCL's tagging scheme; ARRAY-DIMENSION-LIMIT could increase by another 5 bits or so (from 24 bits to 29: you don't want bignums to be valid array indices, and if you think you do, you're wrong ...), but doing so would add a few bytes of overhead to every array and array-like object. This doesn't seem like a worthwhile tradeoff.
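The bit width mentioned above can be checked from the REPL using only standard Common Lisp. The function name below is hypothetical; the value it returns depends on the implementation and platform.

```lisp
(defun dimension-limit-bits ()
  "Number of bits needed to represent the largest valid array dimension.
ARRAY-DIMENSION-LIMIT is an exclusive upper bound, hence the 1-."
  (integer-length (1- array-dimension-limit)))
```

On a system where ARRAY-DIMENSION-LIMIT is 2^24, as the text describes, this would evaluate to 24.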
SAVE-APPLICATION identifies code vectors and the pnames of interned symbols and copies these objects to a "pure" area of the image file it creates. (The "pure" area accounts for most of what the ROOM function reports as "static" space.)
When the resulting image file is loaded, the pure area of the file is now memory-mapped with read-only access. Code and pure data are paged in from the image file as needed (and don't compete for global virtual memory resources with other memory areas.)
Code-vectors and interned symbol pnames are immutable: it is an error to try to change the contents of such an object. Previously, that error would have manifested itself in some random way. In the new scheme, it'll manifest itself as an "unhandled exception" error in the Lisp kernel. The kernel could probably be made to detect a spurious, accidental write to read-only space and signal a lisp error in that case, but it doesn't yet do so.
The image file should be opened and/or mapped in some mode which disallows writing to the memory-mapped regions of the file from other processes. I'm not sure of how to do that; writing to the file when it's mapped by OpenMCL can have unpredictable and unpleasant results. SAVE-APPLICATION will delete its output file's directory entry and create a new file; one may need to exercise care when using file system utilities (like tar, for instance) that might overwrite an existing image file.
Sat Nov 8 18:01:35 MST 2003
[1] The number 3 is hardwired into the code, for no particularly good reason. It's possible to effectively use fewer ephemeral areas (by setting the threshold of one or more to values that're less than or equal to the thresholds of younger generations.)