Abstract:
A method and apparatus for preserving memory ordering in a cache coherent link based interconnect in light of partial and non-coherent memory accesses is herein described. In one embodiment, partial memory accesses, such as a partial read, is implemented utilizing a Read Invalidate and/or Snoop Invalidate message. When a peer node receives a Snoop Invalidate message referencing data from a requesting node, the peer node is to invalidate a cache line associated with the data and is not to directly forward the data to the requesting node. In one embodiment, when the peer node holds the referenced cache line in a Modified coherency state, in response to receiving the Snoop Invalidate message, the peer node is to writeback the data to a home node associated with the data.
Abstract:
A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
Abstract:
A processor of an aspect includes a decode unit to decode a prior instruction that is to have at least a first context, and a subsequent instruction. The subsequent instruction is to be after the prior instruction in original program order. The decode unit is to use the first context of the prior instruction to determine a second context for the subsequent instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the subsequent instruction based at least in part on the second context. Other processors, methods, systems, and machine-readable medium are also disclosed.
Abstract:
A method and apparatus for preserving memory ordering in a cache coherent link based interconnect in light of partial and non-coherent memory accesses is herein described. In one embodiment, partial memory accesses, such as a partial read, is implemented utilizing a Read Invalidate and/or Snoop Invalidate message. When a peer node receives a Snoop Invalidate message referencing data from a requesting node, the peer node is to invalidate a cache line associated with the data and is not to directly forward the data to the requesting node. In one embodiment, when the peer node holds the referenced cache line in a Modified coherency state, in response to receiving the Snoop Invalidate message, the peer node is to writeback the data to a home node associated with the data.
Abstract:
A method and apparatus for preserving memory ordering in a cache coherent link based interconnect in light of partial and non-coherent memory accesses is herein described. In one embodiment, partial memory accesses, such as a partial read, is implemented utilizing a Read Invalidate and/or Snoop Invalidate message. When a peer node receives a Snoop Invalidate message referencing data from a requesting node, the peer node is to invalidate a cache line associated with the data and is not to directly forward the data to the requesting node. In one embodiment, when the peer node holds the referenced cache line in a Modified coherency state, in response to receiving the Snoop Invalidate message, the peer node is to writeback the data to a home node associated with the data.
Abstract:
A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
Abstract:
A processor of an aspect includes a decode unit to decode a prior instruction that is to have at least a first context, and a subsequent instruction. The subsequent instruction is to be after the prior instruction in original program order. The decode unit is to use the first context of the prior instruction to determine a second context for the subsequent instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the subsequent instruction based at least in part on the second context. Other processors, methods, systems, and machine-readable medium are also disclosed.