Abstract:
A cache memory system is provided that uses multi-bit Error Correcting Code (ECC) with a low storage and complexity overhead. The cache memory system can be operated at very low idle power, without dramatically increasing transition latency to and from an idle power state due to loss of state.
Abstract:
An apparatus is described that includes a memory controller to interface to a multi-level system memory. The memory controller includes least recently used (LRU) circuitry to keep track of least recently used cache lines kept in a higher level of the multi-level system memory. The memory controller also includes idle time predictor circuitry to predict idle times of a lower level of the multi-level system memory. The memory controller is to write one or more lesser used cache lines from the higher level of the multi-level system memory to the lower level of the multi-level system memory in response to the idle time predictor circuitry indicating that an observed idle time of the lower level of the multi-level system memory is expected to be long enough to accommodate the write of the one or more lesser used cache lines from the higher level of the multi-level system memory to the lower level of the multi-level system memory.
Abstract:
Examples may include techniques for adaptive compression for data stored in a memory device. The techniques include monitoring a data access pattern to a file or a block for data stored in the memory device and determining a data compression action based on, at least in part, the monitored data access patterns for the file or the block and on an assessed relationship of the file or the block with other files or other blocks. The data compression action including compressing data accessed via the file or the block, decompressing compressed data accessed via the file or the block or no compression action for data accessed via the file or the block.
Abstract:
A method is described that includes creating a first data pattern access record for a region of system memory in response to a cache miss at a host side cache for a first memory access request. The first memory access request specifies an address within the region of system memory. The method includes fetching a previously existing data access pattern record for the region from the system memory in response to the cache miss. The previously existing data access pattern record identifies blocks of data within the region that have been previously accessed. The method includes pre-fetching the blocks from the system memory and storing the blocks in the cache.
Abstract:
Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including and L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosure for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
Abstract:
In an embodiment, a processor includes multiple tiles, each including a core and a tile cache hierarchy. This tile cache hierarchy includes a first level cache, a mid-level cache (MLC) and a last level cache (LLC), and each of these caches is private to the tile. A controller coupled to the tiles includes a cache power control logic to receive utilization information regarding the core and the tile cache hierarchy of a tile and to cause the LLC of the tile to be independently power gated, based at least in part on this information. Other embodiments are described and claimed.