摘要:
Cache logic associated with a respective one of multiple processing threads executing in parallel updates corresponding data fields of a cache to uniquely mark its contents. The marked contents represent a respective read set for a transaction. For example, at an outset of executing a transaction, a respective processing thread chooses a data value to mark contents of the cache used for producing a transaction outcome for the processing thread. Upon each read of shared data from main memory, the cache stores a copy of the data and marks it as being used during execution of the processing thread. If uniquely marked contents of a respective cache line happen to be displaced (e.g., overwritten) during execution of a processing thread, then the transaction is aborted (rather than being committed to main memory) because there is a possibility that another transaction overwrote a shared data value used during the respective transaction.
摘要:
A method, apparatus and computer program product for providing page-protection based memory access barrier traps is presented. A value for a user-mode bit (u-bit) is computed for each extant virtual page in an address space, the u-bit indicative that an object on the virtual page is being moved by a Garbage Collector process. An instruction is executed which causes an access protection fault. The state of the u-bit for the virtual page associated with the access protection fault is consulted when the access protection fault is encountered. Additionally, the access protection fault is translated into a user-trap (utrap) and the utrap is serviced when the u-bit is set.
摘要:
A method comprises associating a plurality of locks with a data object accessed concurrently by a plurality of threads, where each lock corresponds to a respective partition of the object. The method includes using a first non-blocking transaction (such as a Hardware Transactional-Memory (HTM) transaction) to attempt to complete a programmer-specified transaction. The first non-blocking transaction may access one or more of the locks but may not actually acquire any of the locks. In response to an indication that the first non-blocking transaction failed to complete, the method may include acquiring a set of locks in another non-blocking transaction, where the set of locks corresponds to a set of partitions expected to be accessed in the programmer-specified transaction. If the set of locks is acquired, the method may include performing the memory access operations of the programmer-specified transaction, and releasing the set of locks.
摘要:
The present disclosure describes a unique way for each of multiple processes to operate in parallel and use the same shared data without causing corruption to the shared data. For example, during a commit phase, a corresponding transaction can attempt to increment a globally accessible version information variable and store a current value of the globally accessible version information variable for updating version information associated with modified data regardless of whether an associated attempt by the corresponding transaction to modify the globally accessible version information variable was successful. As an alternative mode, a corresponding transaction can merely read and store a current value of the globally accessible version information variable without attempting to update the globally accessible version information variable before such use. In yet another application, a parallel processing environment implements a combination of both aforementioned modes depending on a self-abort rate of the transaction.
摘要:
For each of multiple processes executing in parallel, as long as corresponding version information associated with a respective set of one or more shared variables used for computational purposes has not changed during execution of a respective transaction, results of the respective transaction can be globally committed to memory without causing data corruption. If version information associated with one or more respective shared variables (used to produce the transaction results) happens to change during a process of generating respective results, then a respective process can identify that another process modified the one or more respective shared variables during execution and that its transaction results should not be committed to memory. In this latter case, the transaction repeats itself until it is able to commit respective results without causing data corruption.
摘要:
A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including different types of metadata enabling the processing threads to carry out simultaneous execution depending on a currently selected type of lock mode. A mode controller monitoring the processing threads initiates switching from one type of lock mode to another depending on current operating conditions such as an amount of contention amongst the multiple processing threads to modify the shared data. The mode controller can switch from one lock mode another regardless of whether any of the multiple processes are in the midst of executing a respective transaction. A most efficient lock mode can be selected to carry out the parallel transactions. In certain cases, switching of lock modes causes one or more of the processing threads to abort and retry a respective transaction according to the new mode.
摘要:
Cache logic associated with a respective one of multiple processing threads executing in parallel updates corresponding data fields of a cache to uniquely mark its contents. The marked contents represent a respective read set for a transaction. For example, at an outset of executing a transaction, a respective processing thread chooses a data value to mark contents of the cache used for producing a transaction outcome for the processing thread. Upon each read of shared data from main memory, the cache stores a copy of the data and marks it as being used during execution of the processing thread. If uniquely marked contents of a respective cache line happen to be displaced (e.g., overwritten) during execution of a processing thread, then the transaction is aborted (rather than being committed to main memory) because there is a possibility that another transaction overwrote a shared data value used during the respective transaction.
摘要:
A technique for accessing a shared resource of a computerized system involves running a first portion of a first thread within the computerized system, the first portion (i) requesting a lock on the shared resource and (ii) directing the computerized system to make operations of a second thread visible in a correct order. The technique further involves making operations of the second thread visible in the correct order in response to the first portion of the first thread running within the computerized system, and running a second portion of the first thread within the computerized system to determine whether the first thread has obtained the lock on the shared resource. Such a technique alleviates the need for using a MEMBAR instruction in the second thread.
摘要:
An operating system call control subsystem is disclosed for use in a computer that includes a processor for processing a program, the program instructions of an operating system call instruction type identifying one of a plurality of types of operating system calls, each type of operating system call being associable with an operating system call type identifier value within a predetermined range of values. The operating system call control subsystem comprises a crossover table, an operating system call instruction type address resolution module, and an operating system call instruction type processing module. The crossover table has a number of entries corresponding to a predetermined fraction of the predetermined range, each entry in the crossover table having an instruction for enabling the processor to save a value corresponding to an offset of the entry into the crossover table. The operating system call instruction type address resolution module provides the instructions of the operating system call instruction type with respective target addresses that include an operating system call set identifier in a set of operating system call set identifiers, the number of operating system call set identifiers multiplied by the number of crossover table entries corresponding to the predetermined range and an offset value corresponding to an offset to an entry into the crossover table. The operating system call instruction type processing module, in response to the processor processing an instruction of the operating system call instruction type, (a) saves the operating system call set identifier from the target address, (b) selects one of the entries in the crossover table using the offset value of the target address, (c) processes the instruction from the selected entry of the crossover table to save the value corresponding to the offset of the selected entry in the crossover table, and (d) generates the operating system call type identifier value in connection with the saved operating system call set identifier and the saved value corresponding to the offset of the selected entry in the crossover table.
摘要:
A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.