Abstract:
A method, apparatus, and system for implementing off-chip cache memory in dual-use static random access memory (SRAM) for network processors. An off-chip SRAM store is partitioned into a resizable cache region and a general-purpose region (i.e., conventional SRAM use). The cache region stores cached data corresponding to portions of the data held in a second off-chip memory store, such as a dynamic RAM (DRAM) store or an alternative type such as a Rambus DRAM (RDRAM) store. An on-chip cache management controller is integrated on the network processor. Various cache management schemes are disclosed, including hardware-based cache tag arrays, memory-based cache tag arrays, content-addressable memory (CAM)-based cache management, and memory address-to-cache line lookup schemes. Under one scheme, multiple network processors can access shared SRAM and shared DRAM, with a portion of the shared SRAM used as a cache for the shared DRAM.
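The partitioning described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `DualUseSram`, the direct-mapped placement, the 64-byte line size, and the choice to invalidate all tags on resize are hypothetical and are not taken from the disclosure, which lists several alternative management schemes.

```python
class DualUseSram:
    """Toy model of one SRAM array split into a cache region and a general-purpose region.

    Assumption: the cache region is direct-mapped and caches a DRAM address space.
    """

    def __init__(self, total_lines, cache_lines, line_size=64):
        self.total_lines = total_lines
        self.line_size = line_size
        self.resize_cache(cache_lines)

    def resize_cache(self, cache_lines):
        """Move the partition boundary; cached contents are invalidated (assumption)."""
        if not 0 <= cache_lines <= self.total_lines:
            raise ValueError("cache region must fit inside the SRAM")
        self.cache_lines = cache_lines
        self.tags = [None] * cache_lines  # one tag per cache line, None = invalid

    @property
    def general_lines(self):
        """Lines left over for conventional SRAM use."""
        return self.total_lines - self.cache_lines

    def lookup(self, dram_addr):
        """Return (hit, sram_line_index) for a DRAM byte address, filling on a miss."""
        if self.cache_lines == 0:
            return False, None
        line_no = dram_addr // self.line_size
        index = line_no % self.cache_lines
        tag = line_no // self.cache_lines
        hit = self.tags[index] == tag
        if not hit:
            self.tags[index] = tag  # model the line fill from DRAM
        return hit, index
```

Calling `resize_cache` with a smaller value grows the general-purpose region at the expense of the cache region, which is the trade-off a resizable partition is meant to expose.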
Abstract:
In general, in one aspect, the disclosure describes a method that includes generating multiple cache line accesses to multiple respective cache lines of a cache, as required to satisfy an access to data specified by a single instruction of a processing element.
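As a rough illustration of how one instruction's data access can fan out into several cache line accesses, the sketch below splits a byte range at line boundaries. The function name `cache_line_accesses`, the tuple layout, and the 64-byte default line size are assumptions made for this example, not details from the disclosure.

```python
def cache_line_accesses(addr, length, line_size=64):
    """Split one byte-range access into per-line (line_base, offset, nbytes) accesses."""
    if length <= 0 or line_size <= 0:
        raise ValueError("length and line_size must be positive")
    accesses = []
    end = addr + length
    while addr < end:
        base = addr - addr % line_size             # start of the line holding addr
        nbytes = min(end, base + line_size) - addr  # bytes needed from this line
        accesses.append((base, addr - base, nbytes))
        addr += nbytes
    return accesses
```

For instance, a 72-byte access starting at address 60 touches three 64-byte lines: the last 4 bytes of the first, all of the second, and the first 4 bytes of the third.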
Abstract:
In general, in one aspect, the disclosure describes a method that includes providing a memory access instruction, in a processing element's instruction set, that takes multiple parameters. The parameters include at least one address and a token specifying whether the instruction should cause data retrieved from memory in response to the memory access instruction to be unavailable to a subsequent memory access instruction via a cache.
Abstract:
Method and apparatus to enable I/O agents to perform atomic operations in shared, coherent memory spaces. The apparatus includes an arbitration unit, a host interface unit, and a memory interface unit. The arbitration unit provides an interface to one or more I/O agents that issue atomic transactions to access and/or modify data stored in a shared memory space accessed via the memory interface unit. The host interface unit interfaces to a front-side bus (FSB) to which one or more processors may be coupled. In response to an atomic transaction issued by an I/O agent, the transaction is forked into two interdependent processes. Under one process, an inbound write transaction is injected into the host interface unit, which then drives the FSB to cause the processor(s) to perform a cache snoop. At the same time, an inbound read transaction is injected into the memory interface unit, which retrieves a copy of the data from the shared memory space. If the cache snoop identifies a modified cache line, a copy of that cache line is returned to the I/O agent; otherwise, the copy of the data retrieved from the shared memory space is returned.
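The forked handling of an atomic transaction can be sketched in software terms as two concurrent tasks whose results are merged. This is a loose analogy under stated assumptions: the helper names (`snoop_caches`, `read_memory`, `atomic_read`), the dictionary-based cache model, and the use of threads are all hypothetical, and the sketch models only which copy is returned to the I/O agent, not the bus-level write transaction that triggers the snoop.

```python
from concurrent.futures import ThreadPoolExecutor


def snoop_caches(caches, addr):
    """Return the modified copy of the line if any processor cache holds one, else None."""
    for cache in caches:
        entry = cache.get(addr)
        if entry is not None and entry["modified"]:
            return entry["data"]
    return None


def read_memory(memory, addr):
    """Return the copy of the data held in the shared memory space."""
    return memory[addr]


def atomic_read(caches, memory, addr):
    """Run the snoop and the memory read side by side, then pick the copy to return."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        snoop = pool.submit(snoop_caches, caches, addr)
        mem = pool.submit(read_memory, memory, addr)
        modified_copy = snoop.result()
        memory_copy = mem.result()
    # A modified cache line wins; otherwise the memory copy is current.
    return modified_copy if modified_copy is not None else memory_copy
```

Issuing both requests up front means the memory latency overlaps the snoop latency, so the common case (no modified line) pays for only the slower of the two rather than their sum.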
Abstract:
A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe), are described herein. Temporal and locality caching hints and prefetching hints are provided to improve system-wide caching and prefetching. Message codes for atomic operations that arbitrate ownership between system devices/resources are included to allow efficient access to, and ownership of, shared data. Loose transaction ordering is provided for while maintaining corresponding transaction priority to memory locations, ensuring data integrity and efficient memory access. Active power sub-states, and the setting thereof, are included to allow more efficient power management. In addition, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space, is provided for to improve bandwidth and latency for memory accesses.
Abstract:
Instruction-assisted cache management for efficient use of cache and memory. Hints (e.g., modifiers) are added to read and write memory access instructions to identify that the memory access is for temporal data. In view of such hints, alternative cache and allocation policies are implemented that minimize cache and memory accesses. Under one policy, a write cache miss may result in a write of data to a partial cache line without a memory read/write cycle to fill the remainder of the line. Under another policy, a read cache miss may result in a read from memory without allocating a cache line or writing the read data to one. A cache line soft-lock mechanism is also disclosed, wherein cache lines may be selectably soft locked to indicate a preference for keeping those cache lines over non-locked lines.
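The three policies named above can be made concrete with a toy byte-addressed cache. This is a minimal sketch under stated assumptions: the class `HintedCache`, the `temporal` flag, per-byte valid bits, write-back of all valid bytes on eviction, and the simple victim choice are hypothetical simplifications rather than the disclosed design.

```python
class HintedCache:
    """Toy byte-addressed cache with temporal hints and soft locks (illustrative only)."""

    def __init__(self, memory, num_lines=2, line_size=4):
        self.memory = memory        # bytearray acting as backing memory
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = {}             # line base address -> line record
        self.memory_reads = 0       # counts line or byte reads from memory

    def _split(self, addr):
        off = addr % self.line_size
        return addr - off, off

    def _fetch(self, base):
        self.memory_reads += 1
        return bytearray(self.memory[base:base + self.line_size])

    def _evict_one(self):
        # A soft lock is only a preference: evict an unlocked line if one exists.
        unlocked = [b for b, ln in self.lines.items() if not ln["soft_locked"]]
        victim = unlocked[0] if unlocked else next(iter(self.lines))
        line = self.lines.pop(victim)
        for i, valid in enumerate(line["valid"]):
            if valid:               # write back only bytes the cache actually holds
                self.memory[victim + i] = line["data"][i]

    def _allocate(self, base, data, valid):
        if len(self.lines) >= self.num_lines:
            self._evict_one()
        line = {"data": data, "valid": valid, "soft_locked": False}
        self.lines[base] = line
        return line

    def soft_lock(self, addr, locked=True):
        base, _ = self._split(addr)
        self.lines[base]["soft_locked"] = locked

    def read(self, addr, temporal=False):
        base, off = self._split(addr)
        line = self.lines.get(base)
        if line is None:
            if temporal:            # hinted read miss: read memory, allocate nothing
                self.memory_reads += 1
                return self.memory[addr]
            line = self._allocate(base, self._fetch(base), [True] * self.line_size)
        elif not line["valid"][off]:
            fresh = self._fetch(base)   # complete a partial line on demand
            for i in range(self.line_size):
                if not line["valid"][i]:
                    line["data"][i] = fresh[i]
            line["valid"] = [True] * self.line_size
        return line["data"][off]

    def write(self, addr, value, temporal=False):
        base, off = self._split(addr)
        line = self.lines.get(base)
        if line is None:
            if temporal:            # hinted write miss: partial line, no fill from memory
                line = self._allocate(base, bytearray(self.line_size),
                                      [False] * self.line_size)
            else:
                line = self._allocate(base, self._fetch(base), [True] * self.line_size)
        line["data"][off] = value
        line["valid"][off] = True
```

In this model a hinted write miss costs no memory read at all, and a hinted read miss leaves the existing cache contents undisturbed, which is how the two policies reduce cache and memory traffic for short-lived data.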