摘要:
A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A look-up of a local node directory is performed if a request received at a multi-node bridge of the local node is a system request. If a directory entry indicates that data specified in the request has a local owner or local destination, the request is forwarded to the local node. If the local node determines that the request is a local request, a look-up of the local node directory is performed. If the directory entry indicates that data specified in the request has a local owner and local destination, the coherency of the data on the local node is resolved and a transfer of the request data is performed if required. Otherwise, the request is forwarded to all remote nodes in the multi-node system.
摘要:
A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A local node makes a determination whether a request is a local or system request. If the request is a local request, a look-up of a directory in the local node is performed. If an entry in the directory of the local node indicates that data in the request does not have a remote owner and that the request does not have a remote destination, the coherency of the data is resolved on the local node, and a transfer of the data specified in the request is performed if required and if the request is a local request. If the entry indicates that the data has a remote owner or that the request has a remote destination, the request is forwarded to all remote nodes in the multi-node system.
摘要:
A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A look-up of a local node directory is performed if a request received at a multi-node bridge of the local node is a system request. If a directory entry indicates that data specified in the request has a local owner or local destination, the request is forwarded to the local node. If the local node determines that the request is a local request, a look-up of the local node directory is performed. If the directory entry indicates that data specified in the request has a local owner and local destination, the coherency of the data on the local node is resolved and a transfer of the request data is performed if required. Otherwise, the request is forwarded to all remote nodes in the multi-node system.
摘要:
A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A local node makes a determination whether a request is a local or system request. If the request is a local request, a look-up of a directory in the local node is performed. If an entry in the directory of the local node indicates that data in the request does not have a remote owner and that the request does not have a remote destination, the coherency of the data is resolved on the local node, and a transfer of the data specified in the request is performed if required and if the request is a local request. If the entry indicates that the data has a remote owner or that the request has a remote destination, the request is forwarded to all remote nodes in the multi-node system.
摘要:
An integrated circuit includes a testable delay path. A transition of a delay path input signal causes a subsequent transition of a delay path output signal. A pulse generator receives the delay path input and output signals and produces a pulse signal having a pulse width indicative of the delay between the delay path input and output signal transitions. A delay line receives the pulse signal from the pulse generator. The delay line generates information indicative of the pulse signal pulse width. The delay line may include multiple stages in series where each stage reduces the pulse width of the pulse signal. The delay line may include a high skew inverter having PMOS and NMOS transistors having significantly different gains. The pulse generator is configured to produce a positive going pulse signal regardless of whether the delay path is inverting or non-inverting.
摘要:
According to an apparatus form of the invention, oscillator circuitry for operating a number of inverters in a loop (also known as a “ring”) includes a number of inverters. The inverters include a series of M inverters and a series of N inverters. The M inverters have signal propagation delay of m and the N inverters have signal propagation delay of n. The circuitry also includes means for selecting whether to exclude the N inverters from operating in the loop operable for receiving a select signal on a data input. The selecting means times assertion of the select signal on an output to select the number of inverters. In order to glitchlessly change the number of inverters operating in the loop, the selecting means has a certain delay greater than delay n.
摘要:
A method for avoiding data loss due to cancelled transactions within a non-uniform memory access (NUMA) data processing system is disclosed. A NUMA data processing system includes a node interconnect to which at least a first node and a second node are coupled. The first and the second nodes each includes a local interconnect, a system memory coupled to the local interconnect, and a node controller interposed between the local interconnect and a node interconnect. The node controller detects certain situations which, due to the nature of a NUMA data processing system, can lead to data loss. These situations share the common feature that a node controller ends up with the only copy of a modified cache line and the original transaction that requested the modified cache line may not be issued again with the same tag or may not be issued again at all. The node controller corrects these situations by issuing its own write transaction to the system memory for that modified cache line using its own tag, and then providing the data the modified cache line is holding. This ensures that the modified data will be written to the system memory.
摘要:
A method and system for an infrastructure for performance-based chip-to-chip stacking are provided in the illustrative embodiments. A critical path monitor circuit (infrastructure) is configured to launch a signal from a launch point in a first layer, the first layer being a first circuit. The infrastructure is further configured to create an electrical path to a capture point. The signal is launched from the launch point in the first layer. A performance characteristic of the electrical path is measured, resulting in a measurement, wherein the measurement is indicative of a performance of the first layer when stacked with a second layer in a 3D stack without actually stacking the first and the second layers in the 3D stack, the second layer being a second circuit.
摘要:
A method for avoiding livelocks due to colliding invalidating transactions within a non-uniform memory access system is disclosed. A NUMA computer system includes at least two nodes coupled to an interconnect. Each of the two nodes includes a local system memory. In response to a request by a processor of a first node to invalidate a remote copy of a cache line also stored within its cache memory at substantially the same time when a processor of a second node is also requesting to invalidate said cache line, one of the two requests is allowed to complete. The allowed request is the first request to complete without retry at the point of coherency, typically the home node. Subsequently, the other one of the two requests is permitted to complete.
摘要:
A data recirculation apparatus for a data processing system includes at least one output buffer from which data are output onto an interconnect, a plurality of input storage areas from which data are selected for storage within the output buffer, and selection logic that selects data from the plurality of input storage areas for storage within the output buffer. In addition, the data recirculation apparatus includes buffer control logic that, in response to a determination that a particular datum has stalled in the output buffer, causes the particular datum to be removed from the output buffer and stored in one of the plurality of input storage areas. In one embodiment, the recirculated data has a dedicated input storage area.