Team LiB   Previous Section   Next Section

11.3 Garbage Collection

As explained in Chapter 4, JavaScript uses garbage collection to reclaim the memory occupied by strings, objects, arrays, and functions that are no longer in use. This frees you, the programmer, from having to explicitly deallocate memory yourself and is an important part of what makes JavaScript programming easier than, say, C programming.

A key feature of garbage collection is that the garbage collector must be able to determine when it is safe to reclaim memory. Obviously, it must never reclaim values that are still in use and should collect only values that are no longer reachable; that is, values that cannot be referred to through any of the variables, object properties, or array elements in the program. If you are the curious type, you may be wondering just how a garbage collector distinguishes between garbage to be collected and values that are still being used or that could potentially be used. The following sections explain some of the gory details.

11.3.1 Mark-and-Sweep Garbage Collection

The computer science literature on garbage collection is large and technical; the actual operation of the garbage collector is really an implementation-specific detail that may vary in different implementations of the language. Still, almost all serious garbage collectors use some variation on a basic garbage-collection algorithm known as "mark and sweep."

A mark-and-sweep garbage collector periodically traverses the list of all variables in the JavaScript environment and marks any values referred to by these variables. If any referenced values are objects or arrays, it recursively marks the object properties and array elements. By recursively traversing this tree or graph of values, the garbage collector is able to find (and mark) every single value that is still reachable. It follows, then, that any unmarked values are unreachable and are therefore garbage.

Once a mark-and-sweep garbage collector has finished marking all reachable values, it begins its sweep phase. During this phase, it looks through the list of all values in the environment and deallocates any that are not marked. Classic mark-and-sweep garbage collectors do a complete mark and a complete sweep all at once, which causes a noticeable slowdown in the system during garbage collection. More sophisticated variations on the algorithm make the process relatively efficient and perform collection in the background, without disrupting system performance.

The details of garbage collection are implementation-specific, and you should not need to know anything about the garbage collector to write JavaScript programs. All modern JavaScript implementations use some kind of mark-and-sweep garbage collection. However, JavaScript 1.1, as implemented in Netscape 3, used a somewhat simpler garbage-collection scheme that has some shortcomings. If you are writing JavaScript code to be compatible with Netscape 3, the following section explains the shortcomings of the garbage collector in that browser. Netscape 2 used an even simpler garbage-collection technique with serious flaws. Since that browser is now entirely obsolete, the details are not described here.

11.3.2 Garbage Collection by Reference Counting

In JavaScript 1.1, as implemented in Netscape 3, garbage collection is performed by reference counting. This means that every object (whether a user object created by JavaScript code or a built-in HTML object created by the browser) keeps track of the number of references to it. Recall that objects are assigned by reference in JavaScript, rather than having their complete values copied.

When an object is created and a reference to it is stored in a variable, the object's reference count is one. When the reference to the object is copied and stored in another variable, the reference count is incremented to two. When one of the two variables that holds these references is overwritten with some new value, the object's reference count is decremented back to one. If the reference count reaches zero, there are no more references to the object. Since there are no references to copy, there can never again be a reference to the object in the program. Therefore, JavaScript knows that it is safe to destroy the object and garbage collect the memory associated with it.

Unfortunately, there are shortcomings to using reference counting as a garbage-collection scheme. In fact, some people don't even consider reference counting to be true garbage collection and reserve that term for better algorithms, such as mark-and-sweep garbage collection. Reference counting is a simple form of garbage collection that is easy to implement and works fine in many situations. There is an important situation, however, in which reference counting cannot correctly detect and collect all garbage, and you need to be aware of it.

The basic flaw with reference counting has to do with cyclical references. If object A contains a reference to object B and object B contains a reference to object A, a cycle of references exists. A cycle would also exist, for example, if A referred to B, B referred to C, and C referred back to A. In cycles such as these, there is always a reference from within the cycle to every element in the cycle. Thus, even if none of the elements of the cycle has any remaining outside references, their reference counts will never drop below one and they can never be garbage collected. The entire cycle may be garbage if there is no way to refer to any of these objects from a program, but because they all refer to each other, a reference-counting garbage collector cannot detect and free this unused memory.

This problem with cycles is the price that must be paid for a simple garbage-collection scheme. The only way to prevent this problem is by manual intervention. If you create a cycle of objects, you must recognize this fact and take steps to ensure that the objects are garbage collected when they are no longer needed. To allow a cycle of objects to be garbage collected, you must break the cycle. You can do this by picking one of the objects in the cycle and setting the property of it that refers to the next object to null. For example, suppose that A, B, and C are objects that each have a next property, and the value of this property is set so that these objects refer to each other and form a cycle. When these objects are no longer in use, you can break the cycle by setting A.next to null. This means that object B no longer has a reference from A, so its reference count can drop to zero and it can be garbage collected. Once it has been garbage collected, it will no longer refer to C, so C's reference count can drop to zero and it can be garbage collected. Once C is garbage collected, A can finally be garbage collected.

Note, of course, that none of this can happen if A, B, and C are stored in global variables in a window that is still open, because those variables A, B, and C still refer to the objects. If these were local variables in a function and you broke their cycle before the function returned, they could be garbage collected. But if they are stored in global variables, they remain referenced until the window that contains them closes. In this case, if you want to force them to be garbage collected, you must break the cycle and set all the variables to null:

A.next = null;     // Break the cycle

A = B = C = null;  // Remove the last remaining external references 
    Team LiB   Previous Section   Next Section