摘要:
A garbage collector treats a garbage-collected heap as divided into heap regions, for each of which it maintains a respective remembered set, whose entries list the locations where references located in the heap outside that region refer to references inside that region. The remembered sets are used during space-incremental collection operations on collection sets of those regions; if the garbage collector determines that objects in the collection set are not referred to directly or indirectly from outside the collection set, it reclaims the memory space that they occupy. It places entries into the remembered sets independently of the locations at which the references were found, so any region can be chosen for inclusion in any collection set; no predetermined collection order is required. Instead, the garbage collector performs global marking operations and uses the results to select for collection-set membership the regions that it can most likely collect efficiently.
摘要:
An incremental collector that employs remembered sets to identify the locations where a mutator has modified references to objects in respective heap regions employs a thread operating concurrently with the mutator to update the remembered sets in accordance with reference mutation. Specifically, when the mutator modifies a reference in one of a plurality of “cards” into which the collector treats the heap as divided, the concurrent thread ordinarily searches the card for references in accordance with which it updates the remembered set. But it selects certain cards, in which it has observed particularly high mutation activity, as ones in which reference mutation will not cause concurrent remembered-set updating. Remembered-set updating in response to those cards' references occurs only when all mutator threads have been suspended.
摘要:
A garbage collector divides the garbage-collected heap into “cards.” It maintains a table containing a card-object table entry for each card. A card's entry contains information from which the collector can determine where any references in the card are located and thereby identify objects that may be reachable. The encoding of a card's table entry is not restricted to values that indicate the location of the object in which the card begins. Instead, its possible values additionally include ones that indicate that the card begins with a certain number of references or that an object begins at a given location in the middle of the card. The collector thereby avoids consulting object's class information unnecessarily.
摘要:
A multiprocessor, multiprogram, stop-the-world garbage collection program is described. The system initially over partitions the root sources, and then iteratively employs static and dynamic work balancing. Garbage collection threads compete dynamically for the initial partitions. Work stealing double-ended queues, where contention is reduced, are described to provide dynamic load balancing among the threads. Contention is resolved by atomic instructions. The heap is broken into a young and an old generation where semi-space copying employing local allocation buffers is used to collect the reachable objects in the young section, and for promoting objects from the young to the old generations, and parallel mark-compacting is used for collecting the old generation. Efficiency of collection is enhanced by use of card tables and linking objects, and overflow conditions are efficiently handled by linking using class pointers. The garbage collection termination using a global status word is employed.
摘要:
A heap may be marked and compacted while performing only two passes over the objects and object references in the heap. Specifically, objects and object references are traversed once during a marking phase and again during a compaction phase of split-reference, two-pass mark-compaction. Object references are updated in two steps. First, during marking, each object reference may be updated to include the relative offset within its block of the referenced object and-during compaction that offset may be added to the block's destination address resulting in a reference that points to the actual post-compaction location for the referenced object. Objects of a particular block may be rearranged, or permuted, with respect to each other within the block. However, the order between groups of objects in different blocks may be preserved across compaction.
摘要:
A technique is provided for reducing the number of write barriers executed in mutator code without compromising garbage collector performance. Advantageously, a compiler generates two forms of a mutator code—a first version with write barriers and a second version substantially without write barriers. In operation, the first version of the code may be accessed by a vtable in a “mature” near-class and the second version may be accessed by a vtable in a “nascent” near-class. According to the invention, mapping of functionally equivalent points in the first and second versions of the mutator code may be facilitated by an associated pcmap. Further, each of the first and second versions may also be associated with a respective nr_map that facilitates mapping functionally equivalent points within different branches of guard code sequences corresponding to reference-writes to non-receiver objects.
摘要:
The Hat Trick deque requires only a single DCAS for most pushes and pops. The left and right ends do not interfere with each other until there is one or fewer items in the queue, and then a DCAS adjudicates between competing pops. By choosing a granularity greater than a single node, the user can amortize the costs of adding additional storage over multiple push (and pop) operations that employ the added storage. A suitable removal strategy can provide similar amortization advantages. The technique of leaving spare nodes linked in the structure allows an indefinite number of pushes and pops at a given deque end to proceed without the need to invoke memory allocation or reclamation so long as the difference between the number of pushes and the number of pops remains within given bounds. Both garbage collection dependent and explicit reclamation implementations are described.
摘要:
An array-based concurrent shared object implementation has been developed that provides non-blocking and linearizable access to the concurrent shared object. In an application of the underlying techniques to a deque, the array-based algorithm allows uninterrupted concurrent access to both ends of the deque, while returning appropriate exceptions in the boundary cases when the deque is empty or full. An interesting characteristic of the concurrent deque implementation is that a processor can detect these boundary cases, e.g., determine whether the array is empty or full, without checking the relative locations of the two end pointers in an atomic operation.
摘要:
A novel linked-list-based concurrent shared object implementation has been developed that provides non-blocking and linearizable access to the concurrent shared object. In an application of the underlying techniques to a deque, non-blocking completion of access operations is achieved without restricting concurrency in accessing the deque's two ends. In various realizations in accordance with the present invention, the set of values that may be pushed onto a shared object is not constrained by use of distinguishing values. In addition, an explicit reclamation embodiment facilitates use in environments or applications where automatic reclamation of storage is unavailable or impractical.
摘要:
The present invention provides a technique for reducing the number of write barriers without compromising garbage collector performance or correctness. To that end, a compiler defers emitting write barriers until it reaches a subsequent instruction in the mutator code. At this point, the compiler may elide repeated or unnecessary write-barrier code so as to emit only those write barriers that provide useful information to the garbage collector. By eliminating write-barrier code in this manner, the amount of write-barrier overhead in the mutator can be minimized, consequently enabling the mutator to execute faster and more efficiently. Further, collocating write barriers after the predetermined instruction also enables the compiler to generate object code having better cache performance and more efficient use of guard code than is possible using conventional write-barrier implementations.