摘要:
Disclosed is an apparatus and method generally related to controlling a multimedia extension control and status register (MXCSR). A processor core may include a floating point unit (FPU) to perform arithmetic functions; and a multimedia extension control register (MXCR) to provide control bits to the FPU. Further an optimizer may be used to select a speculative multimedia extension status register (SPEC_MXSR) from a plurality of SPEC_MXSRs to update a multimedia extension status register (MXSR) based upon an instruction.
摘要:
Techniques and mechanisms for a processor to determine whether a commit action is to be performed. In an embodiment, a processor performs operations to determine whether a commit instruction is for contingent performance of a commit action. In another embodiment, one or more conditions of processor state are evaluated in response to determining that the commit instruction is for contingent performance of the commit action, where the evaluation is performed to determine whether the commit action indicated by the commit instruction is to be performed.
摘要:
Disclosed is an apparatus and method to manage instruction cache prefetching from an instruction cache. A processor may comprise: a prefetch engine; a branch prediction engine to predict the outcome of a branch; and dynamic optimizer. The dynamic optimizer may be used to control: identifying common instruction cache misses and inserting a prefetch instruction from the prefetch engine to the instruction cache.
摘要:
Disclosed is an apparatus and method to manage instruction cache prefetching from an instruction cache. A processor may comprise: a prefetch engine; a branch prediction engine to predict the outcome of a branch; and dynamic optimizer. The dynamic optimizer may be used to control: indentifying common instruction cache misses and inserting a prefetch instruction from the prefetch engine to the instruction cache.
摘要:
A computer-readable storage medium, method and system for optimization-level aware branch prediction is described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.
摘要:
A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region.
摘要:
A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region.
摘要:
A processor includes a resource scheduler, a dispatcher, and a memory execution unit. The memory execution unit includes logic to identify an executed, unretired store operation in a memory ordered buffer, determine that the store operation is speculative, determine whether an associated cache line in a data cache is non-speculative, and determine whether to block a write of the store operation results to the data cache based upon the determination that the store operation is speculative and a determination that the associated cache line is non-speculative.