摘要:
A digital computer system comprises a precise exception handling processor and a control subsystem. The precise exception handling processor performs processing operations under control of instructions. The precise exception handling processor is constructed in accordance with a precise exception handling model, in which, if an exception condition is detected in connection with an instruction, the exception condition is processed in connection with the instruction. The precise exception handling processor further includes a pending exception indicator having a pending exception indication state and a no pending exception indication state. The control subsystem provides a series of instructions to the precise exception handling processor to facilitate emulation of at least one emulated program instruction. The emulated program instruction is constructed to be processed by a delayed exception handling processor which is constructed in accordance with a delayed exception handling model, in which if an exception is detected during processing of an instruction, the exception condition is processed in connection with a subsequent instruction. The series of instructions provided by the control subsystem in emulation of the emulated program instruction controls the precise exception handling processor to (i) determine whether the pending exception indicator is in the pending exception indication state and, if so, to invoke a routine to process the pending exception and condition the pending exception indicator to the no pending exception indication state (ii) perform processing operations in accordance with the emulated processing instruction; and (iii) if an exception condition is detected during the processing operations, to invoke an exception handler in accordance with the processor's precise exception handling model to condition the pending exception indicator to the pending exception indication state, so that the exception condition will be processed during processing operations for a subsequent emulated program instruction.
摘要:
A processor processes a segmented to linear virtual address conversion instruction to convert segmented virtual addresses in a segmented virtual address space to a linear virtual address in a linear virtual address space. The segmented virtual address space comprises a plurality of segments each identified by a segment identifier, each segment comprising at least one page identified by a page identifier. The linear virtual address space includes a plurality of pages each identified by a page identifier. In processing the segmented to linear virtual address conversion instruction, the processor uses a plurality of segmented to linear virtual address conversion descriptors, each associated with a page in the segmented virtual address space, each segmented to linear virtual address conversion descriptor identifying the page identifier of one of the pages in the linear virtual address space. The segmented to linear virtual address conversion instruction includes a segmented virtual address identifier in the segmented virtual address space. In processing the segmented to linear virtual address conversion instruction, the processor uses the segmented virtual address identifier in the segmented to linear virtual address conversion instruction to select one of the segmented to linear virtual address conversion descriptors. After selecting a segmented to linear virtual address conversion descriptor, the processor uses the page identifier of the linear virtual address space from the selected segmented to linear virtual address conversion descriptor and the segmented virtual address identifier in the segmented to linear virtual address conversion instruction in generating a virtual address in the linear virtual address space.
摘要:
The system and methods described herein may be used to detect and tolerate atomicity violations between concurrent code blocks and/or to generate code that is executable to detect and tolerate such violations. A compiler may transform program code in which the potential for atomicity violations exists into alternate code that tolerates these potential violations. For example, the compiler may inflate critical sections, transform non-critical sections into critical sections, or coalesce multiple critical sections into a single critical section. The techniques described herein may utilize an auxiliary lock state for locks on critical sections to enable detection of atomicity violations in program code by enabling the system to distinguish between program points at which lock acquisition and release operations appeared in the original program, and the points at which these operations actually occur when executing the transformed program code. Filtering and analysis techniques may reduce false positives induced by the transformations.
摘要:
A system and method is disclosed for fast lock acquisition and release in a lock-based software transactional memory system. The method includes determining that a group of shared memory areas are likely to be accessed together in one or more atomic memory transactions executed by one or more threads of a computer program in a transactional memory system. In response to determining this, the system associates the group of memory areas with a single software lock that is usable by the transactional memory system to coordinate concurrent transactional access to the group of memory areas by the threads of the computer program. Subsequently, a thread of the program may gain access to a plurality of the memory areas of the group by acquiring the single software lock.
摘要:
A method and system for acquiring multiple software locks in bulk is disclosed. When multiple locks need to be acquired, such as for atomic transactions in transactional memory systems, the disclosed techniques may be applied to consolidate computationally expensive memory barrier operations across the lock acquisitions. A system may acquire multiple locks in bulk, at least in part, by modifying values in one or more fields of multiple locks and by then performing a memory barrier operation to ensure that the modified values in the multiple locks are visible to other application threads. The technique may be repeated for locks that the system fails to acquire during earlier iterations until all required locks are acquired. The described technique may be applied to various scenarios including static and/or dynamic transactional locking protocols.
摘要:
One embodiment provides a system that facilitates the execution of a transaction for a program in a hardware-supported transactional memory system. During operation, the system records a failure state of the transaction during execution of the transaction using hardware transactional memory mechanisms. Next, the system detects a transaction failure associated with the transaction. Finally, the system provides an advice state associated with the recorded failure state to the program to facilitate a response to the transaction failure by the program.
摘要:
A partitioned ticket lock may control access to a shared resource, and may include a single ticket value field and multiple grant value fields. Each grant value may be the sole occupant of a respective cache line, an event count or sequencer instance, or a sub-lock. The number of grant values may be configurable and/or adaptable during runtime. To acquire the lock, a thread may obtain a value from the ticket value field using a fetch-and-increment type operation, and generate an identifier of a particular grant value field by applying a mathematical or logical function to the obtained ticket value. The thread may be granted the lock when the value of that grant value field matches the obtained ticket value. Releasing the lock may include computing a new ticket value, generating an identifier of another grant value field, and storing the new ticket value in the other grant value field.
摘要:
One embodiment provides a system that facilitates the execution of a transaction for a program in a hardware-supported transactional memory system. During operation, the system records a misspeculation indicator of the transaction during execution of the transaction using hardware transactional memory mechanisms. Next, the system detects a transaction failure associated with the transaction. Finally, the system provides the recorded misspeculation indicator to the program to facilitate a response to the transaction failure by the program.
摘要:
A method for providing applications with a current time value includes receiving a trap for an application to access a time memory page, creating, in a memory map corresponding to the application, a mapping between an address space of the application and the time memory page in response to the trap, accessing, based on the trap, a hardware clock to obtain a time value, and updating the time memory page with the time value. The application reads the time value from the time memory page using the memory map.
摘要:
Adaptive modifications of spinning and blocking behavior in spin-then-block mutual exclusion include limiting spinning time to no more than the duration of a context switch. Also, the frequency of spinning versus blocking is limited to a desired amount based on the success rate of recent spin attempts. As an alternative, spinning is bypassed if spinning is unlikely to be successful because the owner is not progressing toward releasing the shared resource, as might occur if the owner is blocked or spinning itself. In another aspect, the duration of spinning is generally limited, but longer spinning is permitted if no other threads are ready to utilize the processor. In another aspect, if the owner of a shared resource is ready to be executed, a thread attempting to acquire ownership performs a “directed yield” of the remainder of its processing quantum to the other thread, and execution of the acquiring thread is suspended.