Abstract:
Systems, methods, and devices for original code emulation for performance monitoring is provided. A system may memory to store instructions. A processor may implement an instruction converter in hardware or software to convert the instructions to translated code. Specifically, the instruction converter receives the instructions and translates the stored instructions into the translated code that includes one or more indexed instructions. The one or more indexed instructions include a field indicating a number of branches in the stored instructions that are taken in the translated code.
Abstract:
A processor of an aspect includes a decode unit to decode a prior instruction that is to have at least a first context, and a subsequent instruction. The subsequent instruction is to be after the prior instruction in original program order. The decode unit is to use the first context of the prior instruction to determine a second context for the subsequent instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the subsequent instruction based at least in part on the second context. Other processors, methods, systems, and machine-readable medium are also disclosed.
Abstract:
Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.
Abstract:
A method and apparatus are described for efficient register reclamation. For example, one embodiment of an apparatus comprises: single usage detection and tagging logic to examine a sequence of instructions to detect logical registers used by the sequence of instructions that have a single use and to tag an instruction as a single usage instruction if the instruction is a consumer of a logical register that has a single use; an allocator to allocate processor resources to execute the sequence of instructions, the processor resources including physical registers mapped to logical registers to execute the sequence of instructions; and register reclamation logic to free up a logical to physical mapping of a single use register in response to detecting the tag provided by the instruction tagging logic.
Abstract:
A processor of an aspect includes a decode unit to decode a prior instruction that is to have at least a first context, and a subsequent instruction. The subsequent instruction is to be after the prior instruction in original program order. The decode unit is to use the first context of the prior instruction to determine a second context for the subsequent instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the subsequent instruction based at least in part on the second context. Other processors, methods, systems, and machine-readable medium are also disclosed.
Abstract:
A processing system implementing techniques for binary translation support using processor instruction prefixes is provided. In one embodiment, the processing system includes a register bank having a plurality of registers to store data for use in executing instructions and a processor core coupled to the register bank. An instruction to be executed by the processor core is received. The instruction is associated with a binary translator operation to translate input instruction sequences to output instruction sequences. An opcode prefix referencing an extended register of the plurality of registers to be used during the binary translator operation. The extended register preserves a source register value of the plurality of registers.
Abstract:
Methods and apparatuses relating to assigning a logical thread to a physical thread. In one embodiment, an apparatus includes a data storage device that stores code that when executed by a hardware processor causes the hardware processor to perform the following: translating an instruction into a translated instruction, assigning a logical thread for the translated instruction, and providing a thread map hint for the translated instruction; and a hardware scheduler to assign a physical thread of the hardware processor to execute the logical thread based on the thread map hint.