Harmful GC

Posted on 2009-06-10 by Bryan Buecking :: comments enabled

During our May meeting, John Fremlin, the author of teepeedee2 which was recently slashdotted for being the world’s fastest web framework, discussed our beloved Garbage Collection. What was great about John’s presentation was that he discussed alternate ways to efficiently working around GC, and in the end improving the overall performance of your application.

John wrote, summarizing his talk:

Basically here’s a summary of my provocation (yes, I’m not entirely against GC but I think too many people overlook the alternatives):

Garbage collection fans like to compare their favourite waste of resources with manual malloc/free. This is very convenient for them but in real non-GC programs object pools, manual reference counting, and very importantly, the stack are used. These three examples can all be significantly more efficient that garbage collection for particular applications.

(1) Problems with garbage collection and complexity

Repeating n times a program taking time O(1) does not take O(n) time, but something worse than that. How much worse is also difficult to calculate and depends on how you tuned the garbage collector. When the garbage collector frequently mistakenly tenures (moves out of the local scratch memory) things which have to be collected, and therefore has to do full collections, then performance is shot to hell and there is no hope, in my dire experience.

For example, consider this Common Lisp fragment (my-some is defined because the SBCL version is very slow for some reason … to be explained later).

(defun my-some (test list)
  (loop for x in list thereis (funcall test x)))

(defun depth-same (x)
  (labels ((f (x n)
    (typecase x
      (sequence (my-some (lambda (x) (f x (1+ n))) x))
      (t (eql x n)))))                                
     (f x 1)))        

For depth 10000 lists, half the time is spent in the garbage collector (SBCL 1.0.27 on amd64).

The garbage is generated by the closures. If they are declared dynamic-extent and so stack allocated instead of going into GC memory, as below

(defun depth-same (x)
    (labels ((f (x n)
               (labels ((n (x)
                        (f x (1+ n))))
                (declare (dynamic-extent #'n))
                (typecase x
                 (sequence (my-some #'n x))
                 (t (eql x n))))))
     (declare (dynamic-extent #'f))
     (f x 1)))

This the same speed as the function above for small inputs, but for larger inputs much faster, as it does not allocate memory from the garbage collector on SBCL.

Add a comment »
comments are moderated