Memory Management

The design of the memory management scheme strongly affects the flexibility and efficiency of an object-oriented language. EusLisp allocates memory for every kind of object in a unified manner based on the Fibonacci buddy method. In this method, each large memory pool, called a chunk, is split into small cells of unequal sizes, each sized and aligned at a Fibonacci number. A chunk is a homogeneous data container for objects of any type, such as symbols, conses, strings, and float-vectors, as long as their sizes fit in the chunk. A chunk has no special attributes such as static, dynamic, relocatable, or alternate. EusLisp's heap is the collection of chunks, and the heap can grow dynamically by obtaining new chunks from UNIX. Expansion occurs either automatically on the fly or on explicit user demand through the sys:alloc function. Under automatic management, the free memory size is kept at about 25% of the total heap size. This ratio can be changed by setting the sys:*gc-margin* parameter to a value between 0.1 and 0.9.
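
As a rough illustration of how a request maps onto Fibonacci-sized cells, the following sketch picks the smallest Fibonacci number that can hold a request of a given number of words. The functions fib-cell-sizes and fit-cell-size are hypothetical helpers written for this example and are not part of EusLisp; the actual size table and cell header layout of the allocator are not described in this section.

```lisp
;; Hypothetical helpers for illustration only; not part of EusLisp.
;; Collect the Fibonacci numbers 1, 2, 3, 5, 8, ... not exceeding LIMIT.
(defun fib-cell-sizes (limit)
  (let ((a 1) (b 2) (next 0) (sizes nil))
    (while (<= a limit)
      (push a sizes)
      (setq next (+ a b))
      (setq a b)
      (setq b next))
    (nreverse sizes)))

;; Smallest Fibonacci size that can hold REQUEST words,
;; or nil if no size up to LIMIT is large enough.
(defun fit-cell-size (request limit)
  (find-if #'(lambda (s) (>= s request)) (fib-cell-sizes limit)))

(fit-cell-size 4 100)   ; => 5
(fit-cell-size 9 100)   ; => 13
```

In a Fibonacci buddy scheme, a cell whose size is one Fibonacci number splits into two buddies whose sizes are the two preceding Fibonacci numbers, which is why splitting and merging stay cheap.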

When the heap memory is exhausted, a mark-and-sweep garbage collection runs. Cells reachable from the roots (packages, classes and stacks) stay where they are. Unreachable cells are reclaimed and linked to the free lists. No copying or compaction occurs during GC. When a garbage cell is reclaimed, its neighbor is checked, and if it is also free, the two are merged into a larger cell. This merging, however, is sometimes wasted effort, since cons, the most frequently called memory allocator, then requires the merged cell to be split back down to the smallest cell size. Therefore, EusLisp allows a given fraction of the heap to be left unmerged to speed up cons. This fraction is determined by the sys:*gc-merge* parameter, which is set to 0.3 by default. The larger sys:*gc-merge* is, the greater the portion of the heap that is left unmerged. This improves the performance of consing, since buddy-cell splitting rarely occurs when conses are requested. The same holds for any allocation of relatively small cells, such as three-dimensional float-vectors.

SYS:GC invokes the garbage collector explicitly and returns a list of two integers: the number of free words and the total number of words (not bytes) allocated in the heap. SYS:*GC-HOOK* is a variable that holds a function to be called upon completion of a GC. The hook function should accept two arguments representing the sizes of the free heap and the total heap.
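
A minimal sketch of both facilities follows. The variable names result and free-ratio and the body of the hook are chosen for this example only; the message format is arbitrary.

```lisp
;; Force a collection; the result is (free-words total-words).
(setq result (sys:gc))
;; Convert to float first, since dividing two integers truncates.
(setq free-ratio (/ (float (car result)) (cadr result)))
(format t "free ratio: ~a~%" free-ratio)

;; Example hook: report the heap sizes after every GC.
(setq sys:*gc-hook*
      #'(lambda (free total)
          (format t ";; gc: ~d free / ~d total words~%" free total)))
```

Once the hook is installed, every subsequent GC, whether automatic or requested with sys:gc, should print one line.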

If "fatal error: stack overflow" is reported during execution, and you are convinced that the error is not caused by an infinite loop or runaway recursion, you can expand the Lisp stack with sys:newstack. reset should be performed before sys:newstack, since sys:newstack discards everything in the current stack, such as special bindings and the clean-up forms of unwind-protect. After a new stack is allocated, execution starts over from the point of printing the opening message. The default stack size is 65Kword. The Lisp stack is different from the system stack. The former is allocated in the heap, while the latter is allocated in the stack segment by the operating system. If you get a "segmentation fault" error, it might be caused by a shortage of system stack. You can increase the system stack size with the csh limit command.
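
The procedure described above can be sketched as two top-level inputs. The size 200000 is an arbitrary value chosen for illustration.

```lisp
(reset)                  ; unwind to the top level first
(sys:newstack 200000)    ; then replace the Lisp stack with one of 200000 words
```

Because execution restarts from the opening message, type these at the top level rather than from inside a running program.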

The sys:reclaim and sys:reclaim-tree functions return the cells occupied by objects to the memory manager, so that they can be reused later without invoking garbage collection. You must make sure that no reference to those cells remains.
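
As a sketch of safe use, the hypothetical function sum-scratch below builds a temporary list that is never stored anywhere else, and returns its cells once the work is done. It assumes the list elements are small integers, which are immediate values and occupy no cells of their own.

```lisp
;; Hypothetical helper for illustration; not part of EusLisp.
(defun sum-scratch (n)
  (let ((scratch nil) (total 0))
    (dotimes (i n) (push i scratch))            ; temporary list of n conses
    (dolist (x scratch) (setq total (+ total x)))
    ;; Clear the only variable that refers to the list,
    ;; then hand its cells back to the memory manager.
    (sys:reclaim-tree (prog1 scratch (setq scratch nil)))
    total))

(sum-scratch 1000)   ; => 499500
```

Avoid reclaiming values typed at the top level, since the interpreter itself may still hold references to recent results.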

The memory-report and room functions display statistics on memory usage, sorted by cell size and by class respectively.

address returns the byte address of an object. It is useful as a hash function for a hash-table, since this address is unique within the process.

peek and poke are functions that read and write data directly from and to a memory location. The type of access should be one of :char, :byte, :short, :long, :integer, :float and :double. For instance, (SYS:PEEK (+ 2 (SYS:ADDRESS '(a b))) :short) returns the class id of a cons cell, normally 1.

There are several functions prefixed with 'list-all-'. Each of these functions returns a list of a system resource or environment, and they are useful for dynamic debugging.



sys:gc [function]

starts garbage collection, and returns a list of the numbers of free words and total words allocated.


sys:*gc-hook* [variable]

Defines a function that is called upon the completion of a GC.


sys:gctime [function]

returns a list of three integers: the number of GCs invoked so far, the time spent marking cells (in units of 1/60 sec.), and the time spent on reclamation (unmarking and merging).
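
For example, the GC cost of a piece of code can be estimated by taking the difference of two sys:gctime results. The variable names before and after and the loop that generates garbage are made up for this sketch; whether a GC actually occurs depends on the current heap state.

```lisp
(setq before (sys:gctime))
(dotimes (i 100000) (cons i i))        ; allocate conses that become garbage
(setq after (sys:gctime))
(format t "GCs: ~d, mark ticks: ~d, reclaim ticks: ~d~%"
        (- (car after) (car before))
        (- (cadr after) (cadr before))
        (- (caddr after) (caddr before)))
```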


sys:alloc size [function]

allocates at least size words of memory in the heap, and returns the number of words actually allocated.


sys:newstack size [function]

relinquishes the current stack, and allocates a new stack of size words.


sys:*gc-merge* [variable]

is a memory management parameter. *gc-merge* is the ratio of heap memory that is left unmerged at GC. This unmerged area will soon be filled with the smallest cells, whose size is the same as a cons. The default value is 0.3. A larger value, such as 0.4, which specifies that 40% of the free heap should be left unmerged, favors consing but penalizes the instantiation of bigger cells such as float-vectors, edges, and faces.
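
A minimal tuning sketch follows. Both values are arbitrary examples, and the useful range should be determined by measurement for a given workload.

```lisp
;; Cons-heavy workload: leave 40% of the free heap unmerged at GC.
(setq sys:*gc-merge* 0.4)

;; Workload dominated by larger cells: merge more of the free heap.
(setq sys:*gc-merge* 0.1)
```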


sys:*gc-margin* [variable]

is a memory management parameter. *gc-margin* determines the ratio of the free heap size to the total heap size. Memory is acquired from UNIX so that the free space does not fall below this ratio. The default value 0.25 means that 25% free space is maintained at every GC.
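
The automatic and the explicit ways of growing the heap can be combined, as in the sketch below. The margin 0.4, the request of 100000 words, and the variable name got are arbitrary choices for illustration.

```lisp
(setq sys:*gc-margin* 0.4)          ; keep about 40% of the heap free
(setq got (sys:alloc 100000))       ; request at least 100000 more words now
(format t "allocated ~d words~%" got)
```

Allocating ahead of a memory-hungry computation may reduce the number of GCs that would otherwise run while the heap grows step by step.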


sys:reclaim object [function]

relinquishes object as garbage. It must be guaranteed that object is no longer referenced from any other object.


sys:reclaim-tree object [function]

reclaims all the objects traversable from object, except symbols.


sys:btrace num [function]

prints back-trace information for the Lisp stack to a depth of num.


sys:memory-report &optional strm [function]

prints a memory usage table, sorted by cell size, to the stream strm.


sys:room output-stream [function]

outputs memory allocation information ordered by classes.


sys:address object [function]

returns the address of object in the process memory space.


sys:peek [vector] address type [function]

reads data at the memory location specified by address and returns it as an integer. type is one of :char, :byte, :short, :long, :integer, :float, and :double. If no vector is given, address is interpreted as an address in the UNIX process address space. For example, since the a.out header is located at #x2000 on SunOS4, (sys:peek #x2000 :short) returns the magic number (usually #o403). Solaris2 locates the ELF header at #x10000, and (sys:peek #x10000 :long) returns #xff454c46, whose string representation is "ELF".

If vector, which can be a foreign-string, is specified, address is interpreted as an offset from the vector's origin. (sys:peek "123456" 2 :short) returns the short-word representation of "34", namely #x3334 (13108) on a big-endian machine.

Be careful about address alignment: reading a short, integer, long, float, or double word at an odd address may cause a bus error on most CPU architectures.



sys:poke value [vector] address value-type [function]

writes value at the location specified by address. Take special care, since you can write anywhere in the process memory space. Writing outside the process space causes a segmentation fault. Writing a short, integer, long, float, or double word at an odd address causes a bus error.
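
A safer way to experiment is to poke into a vector rather than a raw address. The sketch below assumes that a string created with instantiate can serve as the vector argument, in the same way the string literal does in the sys:peek example; buf is a name chosen for this example.

```lisp
(setq buf (instantiate string 8))   ; 8-byte scratch buffer
(sys:poke 65 buf 0 :byte)           ; store character code 65 ("A") at offset 0
(sys:peek buf 0 :byte)              ; => 65
```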


sys:list-all-chunks [function]

lists all allocated heap chunks. This is of little use to anyone other than the implementor.


sys:object-size obj [function]

counts the number of cells and words accessible from obj. All the objects reachable from obj are traversed, and a list of three numbers is returned: the number of cells, the number of words logically allocated to these objects (i.e. accessible to users), and the number of words physically allocated, including headers and extra slots for memory management. Traversal stops at symbols, i.e. objects referenced from a symbol, such as its property-list or print-name string, are not counted.
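
For instance, the footprint of a small nested list can be inspected as follows. The list contents and the variable names tree and sizes are arbitrary; the symbols in the list are reached but not traversed further.

```lisp
(setq tree '((a b) (c (d e)) "text"))
(setq sizes (sys:object-size tree))
(format t "cells: ~d, logical words: ~d, physical words: ~d~%"
        (car sizes) (cadr sizes) (caddr sizes))
```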


k-okada 2013-05-21