Abstract:
Techniques are described for a multi-processor having two or more processors that increase the opportunity for a load-exclusive command to take a cache line in the Exclusive state, which results in increased performance when a store-exclusive is executed. A new bus operation, read-prefer-exclusive, is used as a hint to other caches that a requesting master is likely to store to the cache line and that, if possible, the other cache should give the line up. In most cases, this results in the other master giving the line up and the requesting master taking the line Exclusive. In most cases, two or more processors are not performing a semaphore management sequence to the same address at the same time. Thus, a requesting master's load-exclusive is able to take a cache line in the Exclusive state an increased number of times.
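A minimal C++ sketch of how a snooping cache might respond to such a hint, assuming illustrative names (SnoopOp::ReadPreferExclusive, LineState, SnoopingCache) that are not part of any actual bus specification: on the hinted snoop, the cache gives the line up instead of keeping a shared copy.

#include <cstdint>
#include <unordered_map>

// Hypothetical MESI-style line states.
enum class LineState { Invalid, Shared, Exclusive, Modified };

// Hypothetical snoop operations; ReadPreferExclusive models the new hint.
enum class SnoopOp { ReadShared, ReadPreferExclusive };

struct SnoopResponse {
    bool provide_data;   // this cache supplies the line to the requester
    bool retain_copy;    // this cache keeps a (Shared) copy afterwards
};

class SnoopingCache {
public:
    // Handle a snoop from a requesting master for the given line address.
    SnoopResponse Snoop(uint64_t addr, SnoopOp op) {
        auto it = lines_.find(addr);
        if (it == lines_.end() || it->second == LineState::Invalid) {
            return {false, false};   // no copy here; memory or another cache responds
        }
        if (op == SnoopOp::ReadPreferExclusive) {
            // The requester hints it will likely store: give the line up so the
            // requester can take it Exclusive and its store-exclusive succeeds.
            SnoopResponse resp{it->second == LineState::Modified, false};
            it->second = LineState::Invalid;
            return resp;
        }
        // Ordinary shared read: keep a Shared copy.
        it->second = LineState::Shared;
        return {true, true};
    }

private:
    std::unordered_map<uint64_t, LineState> lines_;
};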
Abstract:
Methods and systems are disclosed for full-hardware management of power and clock domains related to a distributed virtual memory (DVM) network. An aspect includes transmitting, from a DVM initiator to a DVM network, a DVM operation, broadcasting, by the DVM network to a plurality of DVM targets, the DVM operation, and, based on the DVM operation being broadcast to the plurality of DVM targets by the DVM network, performing one or more hardware optimizations comprising: turning on a clock domain coupled to the DVM network or a DVM target of the plurality of DVM targets that is a target of the DVM operation, increasing a frequency of the clock domain, turning on a power domain coupled to the DVM target based on the power domain being turned off, or terminating the DVM operation to the DVM target based on the DVM target being turned off.
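A sketch of the per-target decision described above, assuming simplified ClockDomain, PowerDomain, and DvmTarget models (all hypothetical); it shows the turn-on-clock, raise-frequency, and terminate-if-powered-off branches, with powering the target up as the alternative noted in the comment.

#include <vector>

// Hypothetical domain models with only the controls the abstract mentions.
struct ClockDomain { bool on = false; unsigned freq_mhz = 0; };
struct PowerDomain { bool on = false; };

struct DvmTarget {
    ClockDomain* clock;
    PowerDomain* power;
    bool IsTurnedOff() const { return !power->on; }
};

// Per-target decision when a DVM operation is broadcast. Returns true if
// the operation is actually delivered to the target.
bool BroadcastToTarget(DvmTarget& t, unsigned boosted_freq_mhz) {
    if (t.IsTurnedOff()) {
        // Target is powered off: the hardware may either turn the power
        // domain on or, as modeled here, terminate the operation to this
        // target (respond on its behalf without delivering it).
        return false;
    }
    if (!t.clock->on) {
        t.clock->on = true;                    // turn on the clock domain
    }
    if (t.clock->freq_mhz < boosted_freq_mhz) {
        t.clock->freq_mhz = boosted_freq_mhz;  // raise frequency for the DVM op
    }
    return true;                               // deliver the DVM operation
}

void BroadcastDvmOperation(std::vector<DvmTarget>& targets) {
    for (auto& t : targets) {
        (void)BroadcastToTarget(t, /*boosted_freq_mhz=*/800);
    }
}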
Abstract:
Bridging strongly ordered write transactions to devices in weakly ordered domains, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, a host bridge device is configured to receive strongly ordered write transactions from one or more strongly ordered producer devices. The host bridge device issues the strongly ordered write transactions to one or more consumer devices within a weakly ordered domain. The host bridge device detects a first write transaction that is not accepted by a first consumer device of the one or more consumer devices. For each of one or more write transactions issued subsequent to the first write transaction and accepted by a respective consumer device, the host bridge device sends a cancellation message to the respective consumer device. The host bridge device replays the first write transaction and the one or more write transactions that were issued subsequent to the first write transaction.
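A sketch of the cancel-and-replay flow, assuming a hypothetical Consumer interface whose Accept and Cancel calls stand in for the real bus handshake:

#include <cstddef>
#include <utility>
#include <vector>

struct WriteTxn { unsigned id; /* payload omitted */ };

// Hypothetical consumer interface in the weakly ordered domain.
struct Consumer {
    virtual bool Accept(const WriteTxn&) = 0;   // false = not accepted
    virtual void Cancel(unsigned txn_id) = 0;   // cancellation message
    virtual ~Consumer() = default;
};

// Issue a strongly ordered stream of writes; if one is not accepted, cancel
// every later write that was already accepted, then replay from the rejected
// write onward so the consumers observe the original order.
void IssueStronglyOrdered(std::vector<std::pair<WriteTxn, Consumer*>>& stream) {
    std::vector<bool> accepted(stream.size(), false);

    // Issue the whole stream to the weakly ordered consumers.
    for (std::size_t i = 0; i < stream.size(); ++i) {
        accepted[i] = stream[i].second->Accept(stream[i].first);
    }

    for (std::size_t i = 0; i < stream.size(); ++i) {
        if (accepted[i]) continue;              // accepted; nothing to undo

        // First rejected write found: cancel later writes that were accepted.
        for (std::size_t j = i + 1; j < stream.size(); ++j) {
            if (accepted[j]) stream[j].second->Cancel(stream[j].first.id);
        }
        // Replay the rejected write and everything issued after it, in order.
        for (std::size_t j = i; j < stream.size(); ++j) {
            while (!stream[j].second->Accept(stream[j].first)) {
                // retry until accepted; real hardware would pace these retries
            }
        }
        return;
    }
}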
Abstract:
Processor-based system hybrid ring bus interconnects, and related devices, systems, and methods are disclosed. In one embodiment, a processor-based system hybrid ring bus interconnect is provided. The processor-based system hybrid ring bus interconnect includes multiple ring buses, each having a bus width and configured to receive bus transaction messages from a requester device(s). The processor-based system hybrid ring bus interconnect also includes an inter-ring router(s) coupled to the ring buses. The inter-ring router(s) is configured to dynamically direct bus transaction messages among the ring buses based on bandwidth requirements of the requester device(s). Thus, less power is consumed than with a crossbar interconnect due to simpler switching configurations. Further, the inter-ring router(s) allows multiple ring buses to be dynamically activated and deactivated based on bandwidth requirements, conserving power when full bandwidth on the processor-based system hybrid ring bus interconnect is not required.
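A sketch of an inter-ring router's steering and ring-activation logic, assuming simplified RingBus and BusTransaction models and an illustrative least-loaded policy; the real design's routing policy and thresholds are not specified by the abstract.

#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical ring model: a bus width (bytes per cycle), an active flag the
// router can toggle to save power, and the bandwidth currently routed onto it.
struct RingBus {
    unsigned width_bytes;
    bool active = false;
    unsigned load_bytes = 0;
};

struct BusTransaction { unsigned requester_id; unsigned bytes; };

class InterRingRouter {
public:
    explicit InterRingRouter(std::vector<RingBus> rings) : rings_(std::move(rings)) {
        if (!rings_.empty()) rings_[0].active = true;   // keep one ring always on
    }

    // Direct a transaction to the least-loaded active ring; if the active
    // rings are near capacity, activate another ring first.
    RingBus* Route(const BusTransaction& txn) {
        RingBus* best = nullptr;
        for (auto& r : rings_) {
            if (r.active && (!best || r.load_bytes < best->load_bytes)) best = &r;
        }
        if (best && best->load_bytes + txn.bytes > best->width_bytes) {
            for (auto& r : rings_) {
                if (!r.active) { r.active = true; best = &r; break; }
            }
        }
        if (best) best->load_bytes += txn.bytes;
        return best;
    }

    // Deactivate extra rings when full bandwidth is no longer required.
    void TrimIdleRings() {
        for (std::size_t i = 1; i < rings_.size(); ++i) {
            if (rings_[i].active && rings_[i].load_bytes == 0) rings_[i].active = false;
        }
    }

private:
    std::vector<RingBus> rings_;
};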
Abstract:
Maintaining cache coherency using conditional intervention among multiple master devices is disclosed. In one aspect, a conditional intervention circuit is configured to receive intervention responses from multiple snooping master devices. To select a snooping master device to provide intervention data, the conditional intervention circuit determines how many snooping master devices have a cache line granule size the same as or larger than that of a requesting master device. If one snooping master device has a same or larger cache line granule size, that snooping master device is selected. If more than one snooping master device has a same or larger cache line granule size, a snooping master device is selected based on an alternate criterion. The intervention responses provided by the unselected snooping master devices are canceled by the conditional intervention circuit, and intervention data from the selected snooping master device is provided to the requesting master device.
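A sketch of the selection logic, assuming a hypothetical InterventionResponse record per snooping master and using lowest master ID as a stand-in for the unspecified alternate criterion:

#include <cstddef>
#include <vector>

// Hypothetical per-snooper response visible to the conditional intervention circuit.
struct InterventionResponse {
    unsigned master_id;
    unsigned granule_bytes;   // cache line granule size of this snooping master
    bool     offers_data;     // this snooper can supply the requested line
};

// Select one snooping master to provide intervention data for a requester
// with the given granule size. Returns the chosen index, or -1 if no snooper
// qualifies; the circuit would cancel the responses of everyone not selected.
int SelectIntervener(const std::vector<InterventionResponse>& resp,
                     unsigned requester_granule_bytes) {
    std::vector<std::size_t> eligible;
    for (std::size_t i = 0; i < resp.size(); ++i) {
        if (resp[i].offers_data && resp[i].granule_bytes >= requester_granule_bytes) {
            eligible.push_back(i);
        }
    }
    if (eligible.empty()) return -1;                      // fall back to memory
    if (eligible.size() == 1) return static_cast<int>(eligible[0]);

    // More than one candidate with a same-or-larger granule: apply an
    // alternate criterion; lowest master_id is an illustrative tie-break.
    std::size_t best = eligible[0];
    for (std::size_t i : eligible) {
        if (resp[i].master_id < resp[best].master_id) best = i;
    }
    return static_cast<int>(best);
}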
Abstract:
Methods and apparatus for transaction ordering to avoid bus deadlocks are provided. In an exemplary method, custom routing rules are defined for data transport between a plurality of masters and a plurality of slaves via a plurality of interconnects, based on a network topology and traffic profile. In an example, a customized rule allows a request address to arbitrate in a first phase of arbitration at a first interconnect in the plurality of interconnects prior to receiving write data associated with the request address at a second interconnect in the plurality of interconnects, and does not allow the request address to arbitrate during a subsequent second phase of arbitration unless the request address beats other competing address requests.
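A sketch of the customized rule, assuming a hypothetical priority-based arbiter model: a request address may enter the first phase before its write data has reached the second interconnect, but it may enter the second phase only if it beat the competing address requests.

#include <optional>
#include <vector>

struct AddressRequest {
    unsigned id;
    unsigned priority;          // larger wins arbitration in this sketch
    bool     write_data_ready;  // data has reached the second interconnect
};

// First phase: a request address may arbitrate at the first interconnect even
// if its write data has not yet arrived at the second interconnect.
std::optional<unsigned> FirstPhaseArbitrate(const std::vector<AddressRequest>& reqs) {
    const AddressRequest* winner = nullptr;
    for (const auto& r : reqs) {
        if (!winner || r.priority > winner->priority) winner = &r;
    }
    return winner ? std::optional<unsigned>(winner->id) : std::nullopt;
}

// Second phase: the request address may arbitrate again only if it beat the
// competing address requests in the first phase; losers back off rather than
// hold interconnect resources while waiting for write data.
bool MayEnterSecondPhase(const AddressRequest& req, unsigned first_phase_winner_id) {
    return req.id == first_phase_winner_id;
}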
Abstract:
Aspects disclosed herein include avoiding deadlocks in processor-based systems employing retry and in-order-response non-retry bus coherency protocols. In this regard, an interface bridge circuit is communicatively coupled to a first core device that implements a retry bus coherency protocol, and a second core device that implements an in-order-response non-retry bus coherency protocol. The interface bridge circuit receives a snoop command from the first core device, and forwards the snoop command to the second core device. While the snoop command is pending, the interface bridge circuit detects a potential deadlock condition between the first core device and the second core device. In response to detecting the potential deadlock condition, the interface bridge circuit is configured to send a retry response to the first core device. This enables the first core device to continue processing, thereby eliminating the potential deadlock condition.
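A sketch of the interface bridge behavior, assuming hypothetical RetryCoreBus and InOrderCoreBus interfaces and treating a conflicting outstanding request as the detected potential deadlock condition:

#include <cstdint>

// Hypothetical views of the two core devices on either side of the bridge.
struct RetryCoreBus {                    // first core: retry bus coherency protocol
    virtual void SendRetryResponse(uint64_t snoop_id) = 0;
    virtual ~RetryCoreBus() = default;
};
struct InOrderCoreBus {                  // second core: in-order-response, non-retry protocol
    virtual void ForwardSnoop(uint64_t snoop_id, uint64_t addr) = 0;
    virtual bool HasConflictingOutstandingRequest(uint64_t addr) const = 0;
    virtual ~InOrderCoreBus() = default;
};

class InterfaceBridge {
public:
    InterfaceBridge(RetryCoreBus& retry_side, InOrderCoreBus& inorder_side)
        : retry_side_(retry_side), inorder_side_(inorder_side) {}

    // A snoop command arrives from the retry-protocol core and is forwarded.
    void OnSnoop(uint64_t snoop_id, uint64_t addr) {
        inorder_side_.ForwardSnoop(snoop_id, addr);
        pending_snoop_id_ = snoop_id;
        pending_addr_ = addr;
        pending_ = true;
    }

    // Checked while the snoop is pending; a conflicting outstanding request
    // from the in-order core stands in for the potential deadlock condition.
    void Poll() {
        if (pending_ && inorder_side_.HasConflictingOutstandingRequest(pending_addr_)) {
            // Break the cycle: send a retry response so the first core can
            // continue processing instead of waiting on the stalled snoop.
            retry_side_.SendRetryResponse(pending_snoop_id_);
            pending_ = false;
        }
    }

private:
    RetryCoreBus& retry_side_;
    InOrderCoreBus& inorder_side_;
    uint64_t pending_snoop_id_ = 0;
    uint64_t pending_addr_ = 0;
    bool pending_ = false;
};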