Abstract:
In one embodiment, a processor includes logic, responsive to a first instruction, to perform an operation on a first source operand and a second source operand associated with the first instruction and write a result of the operation to a destination location comprising a third source operand. The write may be a partial write of the destination location to maintain an unmodified portion of the third source operand. Other embodiments are described and claimed.
Abstract:
A processing device includes a store instruction identification unit to identify a pair of store instructions among a plurality of instructions in an instruction queue. The pair of store instructions include a first store instruction and a second store instruction. The first data of the first store instruction corresponds to a first memory region adjacent to a second memory region, and a second data of the second store instruction corresponds to the second memory region. The processing device to include a store instruction fusion unit to fuse the first store instruction with the second store instruction resulting in a fused store instruction.
Abstract:
A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
Abstract:
A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
Abstract:
A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.
Abstract:
A processor includes a front end to receive an instruction. The processor also includes a core to execute the instruction. The core includes logic to execute a base function of the instruction to yield a result, generate a predicate value of a comparison of the result based upon a predication setting in the instruction, and set the predicate value in a register. The processor also includes a retirement unit to retire the instruction.
Abstract:
Methods and apparatuses relating to assigning a logical thread to a physical thread. In one embodiment, an apparatus includes a data storage device that stores code that when executed by a hardware processor causes the hardware processor to perform the following: translating an instruction into a translated instruction, assigning a logical thread for the translated instruction, and providing a thread map hint for the translated instruction; and a hardware scheduler to assign a physical thread of the hardware processor to execute the logical thread based on the thread map hint.
Abstract:
A processor includes a processor core to execute a first translated instruction translated from a first instruction stored in first page of a memory. The processor also includes a translation indicator agent (XTBA) to store a first translation indicator that is read from a physical map (PhysMap) in the memory. In an embodiment, the first translation indicator is to indicate whether the first page has been modified after the first instruction is translated. Other embodiments are described as claimed.
Abstract:
In one embodiment, a processor can operate in multiple modes, including a direct execution mode and an emulation execution mode. More specifically, the processor may operate in a partial emulation model in which source instruction set architecture (ISA) instructions are directly handled in the direct execution mode and translated code generated by an emulation engine is handled in the emulation execution mode. Embodiments may also provide for efficient transitions between the modes using information that can be stored in one or more storages of the processor and elsewhere in a system. Other embodiments are described and claimed.
Abstract:
A processor includes a front end to receive an instruction. The processor also includes a core to execute the instruction. The core includes logic to execute a base function of the instruction to yield a result, generate a predicate value of a comparison of the result based upon a predication setting in the instruction, and set the predicate value in a register. The processor also includes a retirement unit to retire the instruction.