Abstract:
Exemplary embodiments include a method for updating a Cache LRU tree including: receiving a new cache line; traversing the Cache LRU tree, the Cache LRU tree including a plurality of nodes; biasing a selection of the victim line toward those lines of a plurality of lines having relatively low priorities; and replacing a cache line having a relatively low priority with the new cache line.
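A minimal C++ sketch of one way this could look, assuming an 8-way set, a 7-bit binary pseudo-LRU tree, and a small per-line priority field (the set geometry, priority encoding, and tie-breaking rule are all assumptions of the example, not the patent's design): traversal descends toward the subtree whose lines have the lower minimum priority and falls back to the ordinary LRU bit on a tie, so the chosen victim is biased toward relatively low-priority lines.

```cpp
// Sketch: 8-way set with a 7-bit pseudo-LRU tree and per-line priorities.
#include <algorithm>
#include <array>
#include <cstdint>

constexpr int kWays = 8;

struct Set {
    std::array<bool, kWays - 1> lru{};      // tree bits; true = LRU side is right
    std::array<uint8_t, kWays> priority{};  // lower value = lower priority
    std::array<uint64_t, kWays> tag{};
};

// Minimum priority over the leaves of the subtree rooted at `node`.
// Nodes 0..6 are internal; nodes 7..14 map to ways 0..7.
uint8_t subtree_min(const Set& s, int node) {
    if (node >= kWays - 1)
        return s.priority[node - (kWays - 1)];
    return std::min(subtree_min(s, 2 * node + 1), subtree_min(s, 2 * node + 2));
}

// Traverse the tree from the root; at each node prefer the subtree whose
// lines have the lower minimum priority, falling back to the LRU bit on a tie.
int pick_victim(const Set& s) {
    int node = 0;
    while (node < kWays - 1) {
        int l = 2 * node + 1, r = 2 * node + 2;
        uint8_t lp = subtree_min(s, l), rp = subtree_min(s, r);
        if (lp != rp) node = (lp < rp) ? l : r;    // bias toward low priority
        else          node = s.lru[node] ? r : l;  // plain pseudo-LRU choice
    }
    return node - (kWays - 1);
}

// Point every tree bit on the root-to-leaf path away from `way`,
// marking it most recently used.
void touch(Set& s, int way) {
    int node = way + (kWays - 1);
    while (node > 0) {
        int parent = (node - 1) / 2;
        s.lru[parent] = (node == 2 * parent + 1);  // LRU side is the other child
        node = parent;
    }
}

// Replace a relatively low-priority line with the newly received line.
void fill(Set& s, uint64_t new_tag, uint8_t new_prio) {
    int way = pick_victim(s);
    s.tag[way] = new_tag;
    s.priority[way] = new_prio;
    touch(s, way);
}
```

Recomputing subtree minima on every eviction keeps the sketch short; a hardware version would more plausibly maintain per-node summary state updated on fills and hits.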
Abstract:
An aspect of the present invention improves the accuracy of measuring processor utilization of multi-threaded cores by providing a calibration facility that derives utilization in the context of the overall dynamic operating state of the core, assigning weights to idle threads and to running threads depending on the status of the core. Previous chip designs have established that, in a Simultaneous Multithreading (SMT) core, not all idle cycles in a hardware thread can be equally converted into useful work: competition for core resources reduces the conversion efficiency of one thread's idle cycles whenever any other thread is running on the same core.
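As a rough numeric illustration, the following C++ sketch discounts each idle cycle by a conversion-efficiency weight keyed to how many sibling threads are running in the same cycle, then reports utilization as weighted busy cycles over weighted capacity. The four-thread geometry and the weight table are invented for the example; a real calibration facility would derive such weights from the chip's measured behavior.

```cpp
// Sketch: calibrated utilization for a hypothetical 4-thread SMT core.
#include <array>
#include <cstdio>

constexpr int kThreads = 4;

// Invented conversion-efficiency weights: the fraction of an idle cycle that
// could become useful work, indexed by how many sibling threads are running.
constexpr double kIdleWeight[kThreads] = {1.00, 0.70, 0.55, 0.45};

struct Sample {
    std::array<bool, kThreads> running;  // per-thread run/idle state this cycle
};

// Weighted busy cycles over weighted capacity across a window of samples.
double calibrated_utilization(const Sample* samples, int n) {
    double busy = 0.0, capacity = 0.0;
    for (int i = 0; i < n; ++i) {
        int active = 0;
        for (bool r : samples[i].running) active += r;
        for (bool r : samples[i].running) {
            if (r) {
                busy += 1.0;      // a run cycle counts fully toward both terms
                capacity += 1.0;
            } else {
                // An idle cycle represents less recoverable capacity the more
                // siblings compete for core resources. `active` <= 3 here,
                // since this thread itself is idle.
                capacity += kIdleWeight[active];
            }
        }
    }
    return capacity > 0.0 ? busy / capacity : 0.0;
}

int main() {
    Sample window[2] = {{{true, false, true, false}},
                        {{true, true, true, false}}};
    std::printf("calibrated utilization: %.2f\n",
                calibrated_utilization(window, 2));
}
```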
Abstract:
At a first cache memory affiliated with a first processor core, an exclusive memory access operation is received via an interconnect fabric coupling the first cache memory to second and third cache memories respectively affiliated with second and third processor cores. The exclusive memory access operation specifies a target address. In response to receipt of the exclusive memory access operation, the first cache memory detects presence or absence of a source indication indicating that the exclusive memory access operation originated from the second cache memory to which the first cache memory is coupled by a private communication network to which the third cache memory is not coupled. In response to detecting presence of the source indication, a coherency state field of the first cache memory that is associated with the target address is updated to a first data-invalid state. In response to detecting absence of the source indication, the coherency state field of the first cache memory is updated to a different second data-invalid state.
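A condensed sketch of the snoop-side behavior described above, using MESI-like states plus two illustrative data-invalid states (the names I_NEIGHBOR and I_REMOTE, the from_private_link flag, and the directory layout are placeholders, not the patent's encodings):

```cpp
// Sketch: snoop handling of an exclusive access at the first cache memory.
#include <cstdint>
#include <unordered_map>

enum class CohState : uint8_t {
    M, E, S, I,   // ordinary MESI states
    I_NEIGHBOR,   // data-invalid; line was taken by the paired (second) core
    I_REMOTE      // data-invalid; line was taken by some other (third) core
};

struct SnoopedOp {
    uint64_t target_addr;
    bool exclusive;          // an exclusive memory access operation
    bool from_private_link;  // source indication: the operation arrived over
                             // the private network shared only with the
                             // second cache, not the third
};

struct CacheDirectory {
    std::unordered_map<uint64_t, CohState> coh;  // per-line coherency field

    void snoop(const SnoopedOp& op) {
        auto it = coh.find(op.target_addr);
        if (it == coh.end() || !op.exclusive) return;
        // The line is invalidated either way; the source indication selects
        // which data-invalid state records where the data went.
        it->second = op.from_private_link ? CohState::I_NEIGHBOR
                                          : CohState::I_REMOTE;
    }
};
```

Distinguishing the two data-invalid states is what makes the source indication useful later: a subsequent request for the line can be steered over the private link when the state records that the paired core took it.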
Abstract:
A mechanism is provided for dynamic cache allocation using a cache hit rate. A first cache hit rate is monitored in a first subset of the N sets of a lower level cache, the first subset utilizing a first allocation policy. A second cache hit rate is monitored in a second subset of the N sets, the second subset utilizing a second allocation policy different from the first allocation policy. The first cache hit rate is periodically compared to the second cache hit rate to identify a third allocation policy for a third subset of the N sets. The third allocation policy for the third subset is then periodically adjusted to the first allocation policy or the second allocation policy based on the comparison of the first cache hit rate to the second cache hit rate.
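This reads much like set dueling; under that interpretation, a compact C++ sketch (the 16-set sample size per policy and the greater-or-equal tie break are assumptions of the example) might look like:

```cpp
// Sketch: hit-rate-driven allocation-policy selection across the N sets.
#include <cstdint>

enum class Policy : uint8_t { A, B };  // the two candidate allocation policies

struct PolicyDuel {
    uint64_t hits_a = 0, accesses_a = 0;  // sampled from the policy-A subset
    uint64_t hits_b = 0, accesses_b = 0;  // sampled from the policy-B subset
    Policy follower = Policy::A;          // current policy for the third subset

    // Record an access to one of the two dedicated sample subsets.
    void record(Policy subset, bool hit) {
        if (subset == Policy::A) { ++accesses_a; hits_a += hit; }
        else                     { ++accesses_b; hits_b += hit; }
    }

    // Called once per interval: compare hit rates, retarget the followers,
    // and start a fresh measurement window.
    void adjust() {
        double rate_a = accesses_a ? double(hits_a) / accesses_a : 0.0;
        double rate_b = accesses_b ? double(hits_b) / accesses_b : 0.0;
        follower = (rate_a >= rate_b) ? Policy::A : Policy::B;
        hits_a = accesses_a = hits_b = accesses_b = 0;
    }
};

// Hypothetical mapping: sets 0..15 always test policy A, sets 16..31 always
// test policy B, and every remaining set follows the current winner.
Policy policy_for_set(const PolicyDuel& d, uint32_t set_index) {
    if (set_index < 16) return Policy::A;
    if (set_index < 32) return Policy::B;
    return d.follower;
}
```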
Abstract:
A mechanism is provided for weighted history allocation prediction. For each member in a plurality of members in a lower level cache, an associated reference counter is initialized to an initial value based on an operation type that caused data to be allocated to a member location of the member. For each access to the member in the lower level cache, the associated reference counter is incremented. Responsive to a new allocation of data to the lower level cache and responsive to the new allocation of data requiring the victimization of another member in the lower level cache, a member of the lower level cache is identified that has a lowest reference count value in its associated reference counter. The member with the lowest reference count value in its associated reference counter is then evicted.
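A C++ sketch of one congruence class under this scheme, with invented seed weights per operation type: each member carries a saturating reference counter that is seeded at allocation, bumped on every hit, and consulted to pick the lowest-count member as the victim.

```cpp
// Sketch: weighted-history victim selection within one congruence class.
#include <array>
#include <cstdint>

enum class AllocOp : uint8_t { Prefetch, DemandLoad, Store };

// Invented seed weights: demand-driven fills start "hotter" than speculative
// prefetches, so an unused prefetch is the first candidate for eviction.
constexpr uint8_t seed_for(AllocOp op) {
    switch (op) {
        case AllocOp::Prefetch:   return 0;
        case AllocOp::DemandLoad: return 2;
        case AllocOp::Store:      return 3;
    }
    return 0;
}

template <int kWays>
struct CongruenceClass {
    std::array<uint64_t, kWays> tag{};
    std::array<uint8_t, kWays>  refcount{};

    // Every access to a member bumps its counter (saturating at 255).
    void on_hit(int way) {
        if (refcount[way] < UINT8_MAX) ++refcount[way];
    }

    // A new allocation that must victimize an existing member evicts the
    // member with the lowest reference count, then seeds the newcomer.
    int allocate(uint64_t new_tag, AllocOp op) {
        int victim = 0;
        for (int w = 1; w < kWays; ++w)
            if (refcount[w] < refcount[victim]) victim = w;
        tag[victim] = new_tag;
        refcount[victim] = seed_for(op);
        return victim;
    }
};
```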
Abstract:
We present a “directory extension” (hereinafter “DX”) to aid in prefetching between proximate levels in a cache hierarchy. The DX may maintain (1) a list of pages which contain recently ejected lines from a given level in the cache hierarchy, and (2) for each page in this list, the identity of a set of ejected lines, provided these lines are prefetchable from, for example, the next level of the cache hierarchy. Given a cache fault to a line within a page in this list, other lines from this page may then be prefetched without the substantial overhead of directory lookup that would otherwise be required.
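A toy DX in C++ to make the structure concrete; the 4 KB page and 128-byte line geometry, the 64-entry capacity, the LRU replacement of DX entries, and the prefetch callback are all assumptions of the sketch.

```cpp
// Sketch: a small directory extension (DX) tracking recently ejected lines.
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

constexpr uint64_t kLineBytes = 128;
constexpr uint64_t kPageBytes = 4096;
constexpr int kLinesPerPage = kPageBytes / kLineBytes;  // 32 lines per page
constexpr std::size_t kDxEntries = 64;                  // pages tracked

struct Dx {
    // Page frame -> bitmask of recently ejected, prefetchable lines.
    std::unordered_map<uint64_t, uint32_t> pages;
    std::list<uint64_t> order;  // front = most recently touched page

    // Record an ejection from the given cache level, evicting the coldest
    // DX entry if the page list is full.
    void note_ejection(uint64_t line_addr) {
        uint64_t page = line_addr / kPageBytes;
        int line = static_cast<int>((line_addr % kPageBytes) / kLineBytes);
        pages.try_emplace(page, 0u).first->second |= 1u << line;
        order.remove(page);        // O(entries); fine for a sketch
        order.push_front(page);
        if (pages.size() > kDxEntries) {
            pages.erase(order.back());
            order.pop_back();
        }
    }

    // On a cache fault to `line_addr`, prefetch the page's other ejected
    // lines (e.g., from the next cache level) without per-line directory
    // lookups, then retire the DX entry.
    template <typename PrefetchFn>
    void on_fault(uint64_t line_addr, PrefetchFn prefetch) {
        uint64_t page = line_addr / kPageBytes;
        auto it = pages.find(page);
        if (it == pages.end()) return;
        int faulting = static_cast<int>((line_addr % kPageBytes) / kLineBytes);
        for (int l = 0; l < kLinesPerPage; ++l)
            if (l != faulting && ((it->second >> l) & 1u))
                prefetch(page * kPageBytes + uint64_t(l) * kLineBytes);
        pages.erase(it);
        order.remove(page);
    }
};
```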