Abstract:
A garbage collector employs a plurality of task queues for a parallel-execution operation in a garbage-collection cycle. Each task queue is associated with a different ordered pair of the threads that perform the parallel-execution operation in parallel. One of the threads, referred to as that task queue's “enqueuer” thread, is the only one that can “push” onto that queue an identifier of a dynamically identified task. The other thread, referred to as that task queue's “dequeuer,” is the only one that can “pop” task identifiers from that queue for execution. Since, for each task queue, only one thread can “push” task identifiers onto it and only one thread can “pop” task identifiers from it, the garbage collector can share dynamically identified tasks optimally among its threads without incurring the cost of making combinations of otherwise separate machine instructions atomic.
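A minimal sketch of the single-enqueuer/single-dequeuer idea, using assumed names and a fixed-capacity ring buffer: because only one thread ever pushes onto a given queue and only one thread ever pops from it, plain volatile head and tail indices suffice and no atomic read-modify-write instruction is needed. A collector with N threads could keep an N-by-N matrix of such queues, where queues[i][j] is pushed only by thread i and popped only by thread j.

```java
// Sketch only: a bounded single-producer/single-consumer task queue.
final class SpscTaskQueue<T> {
    private final Object[] slots;
    private final int mask;
    private volatile long head; // advanced only by the dequeuer thread
    private volatile long tail; // advanced only by the enqueuer thread

    SpscTaskQueue(int capacityPowerOfTwo) {
        slots = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    boolean push(T taskId) {                 // called only by the enqueuer
        long t = tail;
        if (t - head == slots.length) return false;   // queue is full
        slots[(int) (t & mask)] = taskId;
        tail = t + 1;                        // volatile write publishes the entry
        return true;
    }

    @SuppressWarnings("unchecked")
    T pop() {                                // called only by the dequeuer
        long h = head;
        if (h == tail) return null;          // queue is empty
        T taskId = (T) slots[(int) (h & mask)];
        head = h + 1;
        return taskId;
    }
}
```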
Abstract:
In response to source code that represents instructions for dynamically allocating memory to objects, a compiler/interpreter produces instructions that implement a garbage collector. The garbage collector operates in garbage-collection cycles, which include parallel-execution operations such as locating reachable objects. Each thread maintains a respective task queue onto which it pushes identifiers of objects thus found and from which it pops those identifiers in order to locate the further objects to which the identified objects refer. A thread's access to its task queue ordinarily occurs on a last-in, first-out basis, but the access mode switches to a first-in, first-out basis if the number of task-queue entries exceeds a predetermined threshold.
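A minimal sketch of the threshold-based switch, with hypothetical names: each thread owns its queue, so no synchronization is shown; entries are popped last-in/first-out until the queue grows past the threshold, after which they are popped first-in/first-out.

```java
import java.util.ArrayDeque;

// Sketch only: a per-thread task queue that switches from LIFO to FIFO access.
final class SwitchingTaskQueue<T> {
    private final ArrayDeque<T> entries = new ArrayDeque<>();
    private final int fifoThreshold;

    SwitchingTaskQueue(int fifoThreshold) {
        this.fifoThreshold = fifoThreshold;
    }

    void push(T objectId) {
        entries.addLast(objectId);
    }

    T pop() {
        if (entries.isEmpty()) return null;
        // LIFO by default; FIFO once the entry count exceeds the threshold.
        return entries.size() > fifoThreshold ? entries.pollFirst()
                                              : entries.pollLast();
    }
}
```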
Abstract:
An array-based concurrent shared object implementation has been developed that provides non-blocking and linearizable access to the concurrent shared object. In an application of the underlying techniques to a deque, the array-based algorithm allows uninterrupted concurrent access to both ends of the deque, while returning appropriate exceptions in the boundary cases when the deque is empty or full. An interesting characteristic of the concurrent deque implementation is that a processor can detect these boundary cases, e.g., determine whether the array is empty or full, without checking the relative locations of the two end pointers in an atomic operation.
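An illustration only of the boundary-detection idea, under assumed names and with the concurrency protocol omitted entirely: if every vacant array cell holds a distinguished marker, a thread working at one end of the deque can recognize an empty (or, symmetrically, full) boundary case from the contents of a single cell, rather than by reading both end pointers together in one atomic operation and comparing their positions.

```java
import java.util.Arrays;

// Illustration only: detecting a boundary case from one cell's contents.
final class MarkerCells {
    static final Object EMPTY = new Object();   // stored in every vacant cell

    final Object[] cells;

    MarkerCells(int capacity) {
        cells = new Object[capacity];
        Arrays.fill(cells, EMPTY);              // the deque starts empty
    }

    // A single-cell test that stands in for an atomic two-pointer comparison.
    boolean vacantAt(int endPointer) {
        return cells[endPointer] == EMPTY;
    }
}
```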
Abstract:
A “garbage collector” employed to reclaim memory dynamically allocated to data objects uses multiple execution threads to perform a parallel-execution operation in its garbage-collection cycle. A thread executes tasks that it selects from lists whose entries represent tasks dynamically identified during the performance of other tasks. When a thread fails to find a task in one of these lists, it sets a field associated with it in a global status word to an inactivity-indicating value. It then determines whether any field associated with any of the other threads indicates activity. If not, the thread concludes that the parallel-execution operation has been completed. Otherwise, it returns to searching for further tasks to perform.
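A minimal sketch of such a global status word, with assumed names: one bit per thread, set while the thread is active and cleared when it fails to find work; the parallel-execution operation is treated as complete once every bit is clear.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: per-thread activity bits packed into one global status word.
// Assumes fewer than 64 worker threads.
final class TerminationStatusWord {
    private final AtomicLong statusWord;     // bit i set => thread i is active

    TerminationStatusWord(int threadCount) {
        statusWord = new AtomicLong((1L << threadCount) - 1);  // all start active
    }

    // Called by thread i after it fails to find a task to perform.
    void markInactive(int i) {
        statusWord.updateAndGet(bits -> bits & ~(1L << i));
    }

    // Called by thread i when it finds (or is given) more work.
    void markActive(int i) {
        statusWord.updateAndGet(bits -> bits | (1L << i));
    }

    // True once no field associated with any thread indicates activity.
    boolean operationComplete() {
        return statusWord.get() == 0L;
    }
}
```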
Abstract:
A multiprocessor, multi-program, stop-the-world garbage-collection program is described. The system initially over-partitions the root sources and then iteratively employs static and dynamic work balancing. Garbage-collection threads compete dynamically for the initial partitions. Work-stealing double-ended queues, which reduce contention, are described to provide dynamic load balancing among the threads; contention is resolved by using atomic instructions. The heap is broken into a young and an old generation, where parallel semi-space copying is used to collect the young generation and parallel mark-compaction to collect the old generation. Speed and efficiency of collection are enhanced by the use of card tables and by linking objects, and overflow conditions are handled efficiently by linking via class pointers. Garbage-collection termination employs a global status word.
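A minimal sketch, with assumed names, in the spirit of the work-stealing double-ended queues mentioned above: the owning thread pushes and pops at one end without atomic instructions, stealing threads take from the other end, and the only contention (thief against thief, or thief against owner over the last entry) is resolved with a compare-and-set. Overflow handling and index recycling are omitted.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: a bounded work-stealing deque; indices grow monotonically and
// are wrapped modulo capacity, so at most 'capacity' entries may be pending.
final class WorkStealingDeque<T> {
    private final Object[] tasks;
    private final AtomicInteger top = new AtomicInteger(0);  // steal end
    private volatile int bottom = 0;                         // owner end

    WorkStealingDeque(int capacity) { tasks = new Object[capacity]; }

    void pushBottom(T task) {                 // owner thread only
        tasks[bottom % tasks.length] = task;
        bottom = bottom + 1;
    }

    @SuppressWarnings("unchecked")
    T popBottom() {                           // owner thread only
        int b = bottom - 1;
        bottom = b;
        int t = top.get();
        if (b < t) { bottom = t; return null; }      // deque was empty
        T task = (T) tasks[b % tasks.length];
        if (b > t) return task;                      // more than one entry left
        // Exactly one entry: the owner races the thieves; CAS decides the winner.
        boolean won = top.compareAndSet(t, t + 1);
        bottom = t + 1;
        return won ? task : null;
    }

    @SuppressWarnings("unchecked")
    T steal() {                               // any other (stealing) thread
        int t = top.get();
        int b = bottom;
        if (t >= b) return null;                     // appears empty
        T task = (T) tasks[t % tasks.length];
        return top.compareAndSet(t, t + 1) ? task : null;  // null: lost the race
    }
}
```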
Abstract:
A system for managing transactions includes a first reference cell associated with a starting value for a first variable; a first thread having an outer atomic transaction that includes a first instruction to write a first value to the first variable; and a second thread, executing in parallel with the first thread, having an inner atomic transaction that includes a second instruction to write a second value to the first variable, where the inner atomic transaction is nested within the outer atomic transaction. The system further includes a first value node, created by the outer atomic transaction, that stores the first value in response to execution of the first instruction, and a second value node, created by the inner atomic transaction, that stores the second value in response to execution of the second instruction and has a previous-node pointer referencing the first value node.
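A minimal sketch of the value-node chain, with hypothetical names: each transactional write records its value in a new node, and the node created by the nested (inner) transaction carries a previous-node pointer back to the node created by the enclosing (outer) transaction, so earlier values remain reachable. With these pieces, an outer write of the first value followed by an inner write of the second value leaves the cell pointing at the inner transaction's node, whose previous-node pointer references the outer transaction's node.

```java
// Sketch only: value nodes chained through previous-node pointers.
final class ValueNode {
    final Object value;
    final ValueNode previous;   // node this one supersedes, or null for the start

    ValueNode(Object value, ValueNode previous) {
        this.value = value;
        this.previous = previous;
    }
}

// Sketch only: a reference cell holding the chain's most recent node.
final class ReferenceCell {
    private volatile ValueNode current;

    ReferenceCell(Object startingValue) {
        current = new ValueNode(startingValue, null);
    }

    // A transactional write creates a node that remembers its predecessor.
    void write(Object newValue) {
        current = new ValueNode(newValue, current);
    }

    Object read() {
        return current.value;
    }
}
```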
Abstract:
One embodiment of the present invention provides a system that facilitates performing operations on a lock-free double-ended queue (deque). This deque is implemented as a doubly-linked list of nodes formed into a ring, so that node pointers in one direction form an inner ring and node pointers in the other direction form an outer ring. The deque has an inner hat, which points to the node next to the last occupied node along the inner ring, and an outer hat, which points to the node next to the last occupied node along the outer ring. The system uses a double compare-and-swap (DCAS) operation while performing pop and push operations at either end of the deque, as well as growing and shrinking operations that change the number of nodes in the ring used by the deque.
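A structure-only sketch, under assumed names, of the ring described above: nodes are doubly linked so that following inner pointers goes one way around the ring and following outer pointers goes the other way, and the inner and outer hats each reference the node adjacent to the last occupied node at their end. The pop, push, grow, and shrink operations themselves depend on a double compare-and-swap (DCAS) and are not sketched here.

```java
// Sketch only: the ring of doubly-linked nodes and the two hat pointers.
final class RingDeque<T> {
    static final class Node<T> {
        T item;
        Node<T> inner;   // next node along the inner ring
        Node<T> outer;   // next node along the outer ring
    }

    Node<T> innerHat;    // node next to the last occupied node, inner ring
    Node<T> outerHat;    // node next to the last occupied node, outer ring

    // Builds an empty ring of the requested size; both hats start on one node.
    RingDeque(int ringSize) {
        Node<T> first = new Node<>();
        Node<T> prev = first;
        for (int i = 1; i < ringSize; i++) {
            Node<T> n = new Node<>();
            prev.inner = n;      // inner pointers run one way around the ring...
            n.outer = prev;      // ...outer pointers run the other way
            prev = n;
        }
        prev.inner = first;      // close the ring in both directions
        first.outer = prev;
        innerHat = first;
        outerHat = first;
    }
}
```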
Abstract:
A computer system employing a plurality of concurrent threads to perform tasks that dynamically identify further similar tasks employs a double-ended queue (“deque”) to list the dynamically identified tasks. If a thread's deque runs out of tasks while other threads' deques have tasks remaining, the thread whose deque has become empty will remove one or more entries from another thread's deque and perform the tasks thereby identified. When a thread's deque becomes too full, it may allocate space for another deque, transfer entries from its existing deque, place an identifier of the existing deque into the new deque, and adopt the new deque as the one that it uses for storing and retrieving task identifiers. Alternatively, it may transfer some of the existing deque's entries into a newly allocated array and place an identifier of that array into the existing deque. The thread thereby deals with deque overflows without introducing additional synchronization requirements or restricting the deque's range of use.
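A minimal sketch of the first overflow strategy described above, with hypothetical names: when the thread's deque fills, it allocates a new deque, transfers some entries, places an identifier of the existing deque (here, the deque object itself) into the new deque, and adopts the new deque for all further pushes and pops.

```java
import java.util.ArrayDeque;

// Sketch only: a per-thread deque that rolls over into a new deque on overflow.
final class OverflowableTaskDeque {
    private ArrayDeque<Object> deque = new ArrayDeque<>();
    private final int capacity;

    OverflowableTaskDeque(int capacity) { this.capacity = capacity; }

    void push(Object taskId) {
        if (deque.size() >= capacity) {
            ArrayDeque<Object> newDeque = new ArrayDeque<>();
            // Transfer roughly half of the entries from the existing deque...
            for (int i = 0; i < capacity / 2; i++) {
                newDeque.addLast(deque.pollFirst());
            }
            // ...place an identifier of the existing deque into the new deque,
            // then adopt the new deque for storing and retrieving identifiers.
            newDeque.addLast(deque);
            deque = newDeque;
        }
        deque.addLast(taskId);
    }

    // Popping may return a nested deque, signaling that its remaining entries
    // should be processed next; that handling is not shown here.
    Object pop() {
        return deque.pollLast();
    }
}
```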
Abstract:
A remembered set for a memory heap region in a garbage-collected computer system is modified to classify the reference locations stored therein by the heap region from which the references originate, so that the number of references originating from a given region can be determined easily. If the number of remembered-set entries for references from a second region to a first region reaches a predetermined threshold, the second region is constrained so that it will be collected at the same time as, or before, the first region. Then all entries in the first region's remembered set for references from the second region can be deleted, and no such entries need be entered in the future, thereby reducing the size of that remembered set and the time required to scan it.
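A minimal sketch, with hypothetical names, of a remembered set whose entries are classified by originating region so that per-region counts are cheap: once the count of entries from one region reaches the threshold, that region is recorded as constrained, its entries are deleted, and no further entries from it are recorded.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: a remembered set classified by the region references come from.
final class RememberedSet {
    private final Map<Integer, Set<Long>> entriesByOriginRegion = new HashMap<>();
    private final Set<Integer> constrainedRegions = new HashSet<>();
    private final int threshold;

    RememberedSet(int threshold) { this.threshold = threshold; }

    void recordReference(int originRegion, long referenceLocation) {
        if (constrainedRegions.contains(originRegion)) {
            return;   // such entries no longer need to be remembered
        }
        Set<Long> locations =
            entriesByOriginRegion.computeIfAbsent(originRegion, r -> new HashSet<>());
        locations.add(referenceLocation);
        if (locations.size() >= threshold) {
            // Constrain the originating region to be collected with or before
            // this region, then delete all of its entries from this set.
            constrainedRegions.add(originRegion);
            entriesByOriginRegion.remove(originRegion);
        }
    }

    int entryCountFrom(int originRegion) {
        Set<Long> locations = entriesByOriginRegion.get(originRegion);
        return locations == null ? 0 : locations.size();
    }
}
```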