摘要:
A processor for processing loop instructions can include an instruction reorder structure and a loop processing controller. The instruction reorder structure is configured to store decoded instructions according to program order and issue the decoded instructions for execution out of program order. The loop processing controller is configured to detect a loop in the decoded instructions stored in the instruction reorder structure and cause the instruction reorder structure to reissue the decoded instructions that form the loop for re-execution.
摘要:
Aspects of the disclosure provide methods for cache efficiency. A method for cache efficiency can include storing data in a buffer entry in association with a cache array in response to a first store instruction that hits the cache array before the first store instruction is committed. Further, when a dependent load instruction is subsequent to the first store instruction, the method can include providing the data from the buffer entry in response to the first dependent load instruction. When a second store instruction overlaps an address of the first store instruction, the method can include coalescing data of the second store instruction in the buffer entry before the second store instruction is committed. When the second store instruction is followed by a second dependent load instruction, the method can include providing the coalesced data from the buffer entry in response to the second dependent load instruction.
摘要:
A programmable cache and cache access protocol that can be dynamically optimized with respect to either power consumption or performance based on a monitored performance of the cache. A monitoring unit monitors cache misses, load use penalty, and/or other performance parameter, and compares the monitored values against a set of one or more predetermined thresholds. Based on the comparison results, a cache controller configures the programmable cache to operate in a parallel mode, to increase cache performance at the cost of greater power consumption, or in a serial mode, to conserve power at the cost of unnecessary performance. A banked cache memory that supports aligned and unaligned instruction fetches using a banked access strategy, and a cache access controller that includes a prefetch capability are also described.
摘要:
A processor includes a first level of cache memory and a first set of instructions configured to implement a first cache coherency protocol. The processor also includes a second set of instructions configured to implement a second cache coherency protocol and a cache coherency protocol selector having at least two choice-states. The processor further includes a cache coherency implementer configured to implement the first cache coherency protocol or the second cache coherency with respect to the first level of cache memory based on a selected choice-state of the cache coherency protocol selector.
摘要:
Devices, systems, methods, and other embodiments associated with a cache memory are described. In one embodiment, a cache tag array includes tag banks. The cache memory further includes a bank selector configured to receive an address and to apply a hash function that maps the address to one of the tag banks.
摘要:
Set-associative caches having corresponding methods and computer programs comprise: a data cache to provide a plurality of cache lines based on a set index of a virtual address, wherein each of the cache lines corresponds to one of a plurality of ways of the set-associative cache; a translation lookaside buffer to provide one of a plurality of way selections based on the set index of the virtual address and a virtual tag of the virtual address, wherein each of the way selections corresponds to one of the ways of the set-associative cache; and a way multiplexer to select one of the cache lines provided by the data cache based on the one of the plurality of way selections.