Abstract:
Scalable dynamic random access memory (DRAM) cache management using tag directory caches is provided. In one aspect, a DRAM cache management circuit is provided to manage access to a DRAM cache in a high-bandwidth memory. The DRAM cache management circuit comprises a tag directory cache and a tag directory cache directory. The tag directory cache stores tags of frequently accessed cache lines in the DRAM cache, while the tag directory cache directory stores tags for the tag directory cache. The DRAM cache management circuit uses the tag directory cache and the tag directory cache directory to determine whether data associated with a memory address is cached in the DRAM cache of the high-bandwidth memory. Based on the tag directory cache and the tag directory cache directory, the DRAM cache management circuit may determine whether a memory operation can be performed using the DRAM cache and/or a system memory DRAM.
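The two-level lookup described above can be illustrated with a minimal sketch. All names here (`DramCacheManager`, `install`, `route_read`), the address split, and the policy of falling back to system memory when the on-chip structures have no information are assumptions made for illustration; the abstract does not specify them.

```python
from enum import Enum


class Route(Enum):
    DRAM_CACHE = "dram_cache"
    SYSTEM_MEMORY = "system_memory"


class DramCacheManager:
    """Hypothetical software model of the DRAM cache management circuit."""

    def __init__(self, num_sets, line_bytes=64):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        # Tag directory cache directory: which DRAM cache sets currently
        # have their tags held in the tag directory cache.
        self.tdc_directory = set()
        # Tag directory cache: set index -> tags known to be in the DRAM cache.
        self.tag_directory_cache = {}

    def split(self, addr):
        """Split an address into (set index, tag). Assumed mapping."""
        line = addr // self.line_bytes
        return line % self.num_sets, line // self.num_sets

    def install(self, set_index, tags):
        """Record the DRAM cache tags for one set in the tag directory cache."""
        self.tdc_directory.add(set_index)
        self.tag_directory_cache[set_index] = set(tags)

    def route_read(self, addr):
        """Decide where a read may be serviced from."""
        set_index, tag = self.split(addr)
        if set_index not in self.tdc_directory:
            # Assumption: with no tag information on chip, read system memory.
            return Route.SYSTEM_MEMORY
        if tag in self.tag_directory_cache[set_index]:
            return Route.DRAM_CACHE
        return Route.SYSTEM_MEMORY


mgr = DramCacheManager(num_sets=4)
mgr.install(1, [16])
print(mgr.route_read(0x1040))  # set 1, tag 16: known to be in the DRAM cache
```

The fallback to system memory is only safe if system memory always holds valid data for the address, which this sketch simply assumes.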
Abstract:
Self-aware, peer-to-peer cache transfers between local, shared cache memories in a multi-processor system are disclosed. A shared cache memory system is provided comprising local shared cache memories, each accessible by an associated central processing unit (CPU) and by other CPUs in a peer-to-peer manner. When a CPU desires to request a cache transfer (e.g., in response to a cache eviction), the CPU, acting as a master CPU, issues a cache transfer request. In response, target CPUs issue snoop responses indicating their willingness to accept the cache transfer. The target CPUs also use the snoop responses to be self-aware of the willingness of other target CPUs to accept the cache transfer. Each target CPU willing to accept the cache transfer uses a predefined target CPU selection scheme to determine its acceptance of the cache transfer. This can avoid a CPU making multiple requests to find a target CPU for a cache transfer.
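Because every target CPU observes the same snoop responses, each can run the same selection rule locally and reach the same answer without further messages. The sketch below assumes one possible predefined scheme, "the willing CPU closest after the master in ring order"; the abstract does not name a scheme, and the function names are hypothetical.

```python
def select_target(master_id, willing, num_cpus):
    """Return the id of the CPU that accepts, or None if none is willing.

    Assumed scheme: pick the willing CPU with the smallest ring distance
    after the master CPU.
    """
    candidates = [cpu for cpu in willing if cpu != master_id]
    if not candidates:
        return None
    return min(candidates, key=lambda cpu: (cpu - master_id) % num_cpus)


def accepts(self_id, master_id, snoop_responses, num_cpus):
    """Run by each target CPU on the snoop responses it observed."""
    willing = {cpu for cpu, ok in snoop_responses.items() if ok}
    return select_target(master_id, willing, num_cpus) == self_id


# CPU 1 is the master; CPUs 0 and 3 are willing, CPU 2 is not.
responses = {0: True, 2: False, 3: True}
acceptors = [cpu for cpu in responses if accepts(cpu, 1, responses, 4)]
print(acceptors)  # exactly one CPU accepts
```

Any deterministic rule over the shared responses would serve equally well; the point is that the master issues a single request and the targets resolve the winner among themselves.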
Abstract:
Maintaining cache coherency using conditional intervention among multiple master devices is disclosed. In one aspect, a conditional intervention circuit is configured to receive intervention responses from multiple snooping master devices. To select a snooping master device to provide intervention data, the conditional intervention circuit determines how many snooping master devices have a cache line granule size the same as or larger than that of a requesting master device. If exactly one snooping master device has a same or larger cache line granule size, that snooping master device is selected. If more than one snooping master device has a same or larger cache line granule size, a snooping master device is selected based on an alternate criterion. The intervention responses provided by the unselected snooping master devices are canceled by the conditional intervention circuit, and intervention data from the selected snooping master device is provided to the requesting master device.
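The selection rule above can be sketched as a small function. The `InterventionResponse` type, the use of the lowest device id as the alternate criterion, and the handling of the case where no snooper qualifies are all assumptions for illustration; the abstract leaves them open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterventionResponse:
    device_id: int       # hypothetical identifier of the snooping master
    granule_bytes: int   # cache line granule size of that master


def select_intervener(requester_granule, responses):
    """Return (selected response or None, list of canceled responses)."""
    eligible = [r for r in responses if r.granule_bytes >= requester_granule]
    if not eligible:
        # Not covered by the abstract; this sketch selects nobody
        # and cancels nothing.
        return None, []
    if len(eligible) == 1:
        chosen = eligible[0]
    else:
        # Assumed alternate criterion: lowest device id wins.
        chosen = min(eligible, key=lambda r: r.device_id)
    canceled = [r for r in responses if r is not chosen]
    return chosen, canceled


responses = [
    InterventionResponse(0, 64),
    InterventionResponse(1, 128),
    InterventionResponse(2, 32),
]
chosen, canceled = select_intervener(128, responses)
print(chosen.device_id, [r.device_id for r in canceled])
```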
Abstract:
Aspects disclosed herein include avoiding deadlocks in processor-based systems employing retry and in-order-response non-retry bus coherency protocols. In this regard, an interface bridge circuit is communicatively coupled to a first core device that implements a retry bus coherency protocol, and a second core device that implements an in-order-response non-retry bus coherency protocol. The interface bridge circuit receives a snoop command from the first core device, and forwards the snoop command to the second core device. While the snoop command is pending, the interface bridge circuit detects a potential deadlock condition between the first core device and the second core device. In response to detecting the potential deadlock condition, the interface bridge circuit is configured to send a retry response to the first core device. This enables the first core device to continue processing, thereby eliminating the potential deadlock condition.
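A toy model may help show the shape of this behavior. The abstract does not state what constitutes a potential deadlock, so the condition used below (a snoop is pending at the in-order core while that core still has an older request waiting on the retry-protocol core) is purely an assumption, as are the class and method names.

```python
class InterfaceBridge:
    """Hypothetical model of a bridge between a retry-protocol core
    and an in-order-response non-retry core."""

    def __init__(self):
        # Address of the snoop forwarded to the in-order core, if any.
        self.pending_snoop = None
        # Older requests from the in-order core that still wait on the
        # retry-protocol core (assumed source of the circular wait).
        self.older_requests = []

    def forward_snoop(self, addr):
        """Snoop from the retry-protocol core, forwarded to the in-order core."""
        self.pending_snoop = addr

    def in_order_request(self, addr):
        """Request from the in-order core that must complete first."""
        self.older_requests.append(addr)

    def potential_deadlock(self):
        # Assumed condition: each side is waiting on the other.
        return self.pending_snoop is not None and bool(self.older_requests)

    def step(self):
        """Return the response sent toward the retry-protocol core."""
        if self.potential_deadlock():
            # Answer the snoop with a retry so the first core can proceed.
            self.pending_snoop = None
            return "RETRY"
        return "WAIT" if self.pending_snoop is not None else "IDLE"


bridge = InterfaceBridge()
bridge.forward_snoop(0x80)
bridge.in_order_request(0x40)
print(bridge.step())
```

Once the retry response is sent, the first core is free to service the older request and can reissue the snoop later.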
Abstract:
Bridging strongly ordered write transactions to devices in weakly ordered domains, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, a host bridge device is configured to receive strongly ordered write transactions from one or more strongly ordered producer devices. The host bridge device issues the strongly ordered write transactions to one or more consumer devices within a weakly ordered domain. The host bridge device detects a first write transaction that is not accepted by a first consumer device of the one or more consumer devices. For each of one or more write transactions issued subsequent to the first write transaction and accepted by a respective consumer device, the host bridge device sends a cancellation message to the respective consumer device. The host bridge device replays the first write transaction and the one or more write transactions that were issued subsequent to the first write transaction.
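The cancel-and-replay sequence can be sketched as follows. The data shapes (transaction ids paired with consumer names, and a set of ids accepted on first issue) and the function name are hypothetical; the sketch only computes which cancellation messages are sent and which transactions are replayed.

```python
def bridge_writes(writes, accepted):
    """Plan cancellations and replays for strongly ordered writes.

    writes:   list of (txn_id, consumer) in the producers' strong order.
    accepted: set of txn_ids accepted by their consumer on first issue.
    Returns (cancellations, replay), where cancellations is a list of
    (consumer, txn_id) and replay is a list of txn_ids in order.
    """
    first_rejected = next(
        (i for i, (txn, _) in enumerate(writes) if txn not in accepted),
        None,
    )
    if first_rejected is None:
        return [], []  # every write was accepted; nothing to undo

    later = writes[first_rejected + 1:]
    # Cancel each later write that a consumer already accepted.
    cancellations = [(consumer, txn) for txn, consumer in later if txn in accepted]
    # Replay the rejected write and everything issued after it, in order.
    replay = [txn for txn, _ in writes[first_rejected:]]
    return cancellations, replay


writes = [("w0", "A"), ("w1", "B"), ("w2", "A"), ("w3", "C")]
cancellations, replay = bridge_writes(writes, accepted={"w0", "w2", "w3"})
print(cancellations)  # [('A', 'w2'), ('C', 'w3')]
print(replay)         # ['w1', 'w2', 'w3']
```

Replaying in the original order is what preserves the strong ordering the producers expect, even though the consumers sit in a weakly ordered domain.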