Abstract:
Different processor architectures are described to evaluate and track dependencies required by instructions. The processors may hold or queue instructions that require output of other instructions until required data and resources are available which may remove the requirement of NOPs in the instruction memory to resolve dependencies and pipeline hazards. The processor may divide instruction data into bundles for parallel execution and provide speculative execution. The processor may include various components to implement an evaluation unit, execution unit and termination unit.
Abstract:
Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.
Abstract:
A processor of an aspect includes a decode unit to decode a persistent store fence instruction. The processor also includes a memory subsystem module coupled with the decode unit. The memory subsystem module, in response to the persistent store fence instruction, is to ensure that a given data corresponding to the persistent store fence instruction is stored persistently in a persistent storage before data of all subsequent store instructions is stored persistently in the persistent storage. The subsequent store instructions occur after the persistent store fence instruction in original program order. Other processors, methods, systems, and articles of manufacture are also disclosed.
Abstract:
A method and system uses exceptions for code specialization in a system that supports transactions. The method and system includes inserting one or more branchless instructions into a sequence of computer instructions. The branchless instructions include one or more instructions that are executable if a commonly occurring condition is satisfied and include one or more instructions that are configured to raise an exception if the commonly occurring condition is not satisfied.
Abstract:
Apparatus and methods provide early access of instructions. A fetch queue is coupled to an instruction cache and configured to store a mix of processor instructions for a first processor and coprocessor instructions for a second processor. A coprocessor instruction selector is coupled to the fetch queue and configured to copy coprocessor instructions from the fetch queue. A queue is coupled to the coprocessor instruction selector and from which coprocessor instructions are accessed for execution before the coprocessor instruction is issued to the first processor. Execution of the copied coprocessor instruction is started in the coprocessor before the coprocessor instruction is issued to a processor. The execution of the copied coprocessor instruction is completed based on information received from the processor after the coprocessor instruction has been issued to the processor.
Abstract:
A system and method are disclosed wherein a processor of a plurality of processors coupled to shared memory, is configured to initiate execution of a section of code according to a first transactional mode of the processor. The processor is configured to execute a plurality of protected memory access operations to the shared memory within the section of code as a single atomic transaction with respect to the plurality of processors. The processor is further configured to initiate, within the section of code, execution of a subsection of the section of code according to a second transactional mode of the processor, wherein the first and second transactional modes are each associated with respective recovery actions that the processor is configured to perform in response to detecting an abort condition.