Abstract:
A stream of data is accessed from a memory system using a stream of addresses generated in a first mode of operating a streaming engine in response to executing a first stream instruction. A block cache management operation is performed on a cache in the memory using a block of addresses generated in a second mode of operating the streaming engine in response to executing a second stream instruction.
Abstract:
This invention optimizes DMA writes to directly addressable level two memory that is cached in level one and the line is valid and dirty. When the level two controller detects that a line is valid and dirty in level one, the level two memory need not update its copy of the data. Level one memory will replace the level two copy with a victim writeback at a future time. Thus the level two memory need not store write a copy. This limits the number of DMA writes to level two directly addressable memory and thus improves performance and minimizes dynamic power. This also frees the level two memory for other master/requestors.
Abstract:
A method to eliminate the delay of a block invalidate operation in a multi CPU environment by overlapping the block invalidate operation with normal CPU accesses, thus making the delay transparent. A range check is performed on each CPU access while a block invalidate operation is in progress, and an access that maps to within the address range of the block invalidate operation is treated as a cache miss to ensure that the requesting CPU will receive valid data.
Abstract:
Traffic output from a cache write-miss buffer is controlled by determining whether a predetermined condition is satisfied, and outputting an oldest entry from the buffer only in response to a determination that the predetermined condition is satisfied. Posting of a new entry to the buffer is insufficient to satisfy the predetermined condition.
Abstract:
In described examples, a processor system includes a processor core generating memory transactions, a lower level cache memory with a lower memory controller, and a higher level cache memory with a higher memory controller having a memory pipeline. The higher memory controller is connected to the lower memory controller by a bypass path that skips the memory pipeline. The higher memory controller: determines whether a memory transaction is a bypass write, which is a memory write request indicated not to result in a corresponding write being directed to the higher level cache memory; if the memory transaction is determined a bypass write, determines whether a memory transaction that prevents passing is in the memory pipeline; and if no transaction that prevents passing is determined to be in the memory pipeline, sends the memory transaction to the lower memory controller using the bypass path.
Abstract:
An apparatus includes a CPU core and a L1 cache subsystem including a L1 main cache, a L1 victim cache, and a L1 controller. The apparatus includes a L2 cache subsystem including a L2 main cache, a shadow L1 main cache, a shadow L1 victim cache, and a L2 controller configured to receive a read request from the L1 controller as a single transaction. Read request includes a read address, a first indication of an address and a coherence state of a cache line A to be moved from the L1 main cache to the L1 victim cache to allocate space for data returned in response to the read request, and a second indication of an address and a coherence state of a cache line B to be removed from the L1 victim cache in response to the cache line A being moved to the L1 victim cache.
Abstract:
An apparatus includes a CPU core and a L1 cache subsystem including a L1 main cache, a L1 victim cache, and a L1 controller. The apparatus includes a L2 cache subsystem coupled to the L1 cache subsystem by a transaction bus and a tag update bus. The L2 cache subsystem includes a L2 main cache, a shadow L1 main cache, a shadow L1 victim cache, and a L2 controller. The L2 controller receives a message from the L1 controller over the tag update bus, including a valid signal, an address, and a coherence state. In response to the valid signal being asserted, the L2 controller identifies an entry in the shadow L1 main cache or the shadow L1 victim cache having an address corresponding to the address of the message and updates a coherence state of the identified entry to be the coherence state of the message.
Abstract:
In described examples, a processor system includes a processor core that generates memory write requests, and a cache memory with a memory controller having a memory pipeline. The cache memory has cache lines of length L. The cache memory has a minimum write length that is less than a cache line length of the cache memory. The memory pipeline determines whether the data payload includes a first chunk and ECC syndrome that correspond to a partial write and are writable by a first cache write operation, and a second chunk and ECC syndrome that correspond to a full write operation that can be performed separately from the first cache write operation. The memory pipeline performs an RMW operation to store the first chunk and ECC syndrome in the cache memory, and performs the full write operation to store the second chunk and ECC syndrome in the cache memory.
Abstract:
An apparatus includes a CPU core and a L1 cache subsystem including a L1 main cache, a L1 victim cache, and a L1 controller. The apparatus includes a L2 cache subsystem coupled to the L1 cache subsystem by a transaction bus and a tag update bus. The L2 cache subsystem includes a L2 main cache, a shadow L1 main cache, a shadow L1 victim cache, and a L2 controller. The L2 controller receives a message from the L1 controller over the tag update bus, including a valid signal, an address, and a coherence state. In response to the valid signal being asserted, the L2 controller identifies an entry in the shadow L1 main cache or the shadow L1 victim cache having an address corresponding to the address of the message and updates a coherence state of the identified entry to be the coherence state of the message.
Abstract:
A system comprises a processor including a CPU core, first and second memory caches, and a memory controller subsystem. The memory controller subsystem speculatively determines a hit or miss condition of a virtual address in the first memory cache and speculatively translates the virtual address to a physical address. Associated with the hit or miss condition and the physical address, the memory controller subsystem configures a status to a valid state. Responsive to receipt of a first indication from the CPU core that no program instructions associated with the virtual address are needed, the memory controller subsystem reconfigures the status to an invalid state and, responsive to receipt of a second indication from the CPU core that a program instruction associated with the virtual address is needed, the memory controller subsystem reconfigures the status back to a valid state.