摘要:
In one embodiment, a processor comprises a coherence trap unit and a trap logic coupled to the coherence trap unit. The coherence trap unit is also coupled to receive data accessed in response to the processor executing a memory operation. The coherence trap unit is configured to detect that the data matches a designated value indicating that a coherence trap is to be initiated to coherently perform the memory operation. The trap logic is configured to trap to a designated software routine responsive to the coherence trap unit detecting the designated value. In some embodiments, a cache tag in a cache may track whether or not the corresponding cache line has the designated value, and the cache tag may be used to trigger a trap in response to an access to the corresponding cache line.
摘要:
In one embodiment, a processor comprises a coherence trap unit and a trap logic coupled to the coherence trap unit. The coherence trap unit is also coupled to receive data accessed in response to the processor executing a memory operation. The coherence trap unit is configured to detect that the data matches a designated value indicating that a coherence trap is to be initiated to coherently perform the memory operation. The trap logic is configured to trap to a designated software routine responsive to the coherence trap unit detecting the designated value. In some embodiments, a cache tag in a cache may track whether or not the corresponding cache line has the designated value, and the cache tag may be used to trigger a trap in response to an access to the corresponding cache line.
摘要:
In one embodiment, a method comprises communicating with one or more other nodes in a system from a first node in the system in response to a trap experienced by a processor in the first node during a memory operation, wherein the trap is signalled in the processor in response to one or more permission bits stored with a cache line in a cache accessible during performance of the memory operation; determining that the cache line is part of a memory transaction in a second node that is one of the other nodes, wherein a memory transaction comprises two or more memory operations that appear to execute atomically in isolation; and resolving a conflict between the memory operation and the memory transaction.
摘要:
In one embodiment, a memory controller for a node in a multi-node computer system comprises logic and a control unit. The logic is configured to determine if an address corresponding to a request received by the memory controller on an intranode interconnect is a remote address or a local address. A first portion of the memory in the node is allocated to store copies of remote data and a remaining portion stores local data. The control unit is configured to write writeback data to a location in the first portion. The writeback data corresponds to a writeback request from the intranode interconnect that has an associated remote address detected by the logic. The control unit is configured to determine the location responsive to the associated remote address and one or more indicators that identify the first portion in the memory.
摘要:
A system may include a plurality of nodes coupled by an inter-node network. Each of the nodes includes several active devices, an interface to the inter-node network, and an address network coupling the active devices to the interface. An active device included in one of the nodes initiates a transaction by sending either a first type of address packet or a second type of address packet on the address network dependent on whether the active device is included in a multi-node system. The first type of address packet is sent if the active device is included in a multi-node system and is not snooped by other active devices in the same node as the active device. The second type of address packet, sent if the active device is included in a single node system, is snooped by other active devices in the same node as the active device.
摘要:
A system may include a node and an additional node coupled by an inter-node network. The node may include an active device, an interface to the inter-node network, a memory, and an address network coupling the active device, the interface, and the memory. The active device may send an address packet to initiate a transaction to gain an access right to a coherency unit. In response to receiving the address packet, the memory is configured to send data corresponding to the coherency unit to the active device dependent on memory response information associated with the coherency unit. If the transaction cannot be satisfied within the node, the memory is configured to forward a report corresponding to the address packet to the interface. In response to the report, the interface is configured to send the additional node a coherency message requesting the access right via the inter-node network.
摘要:
Embodiments of the present invention implement virtual transactional memory using cache line marking. The system starts by executing a starvation-avoiding transaction for a thread. While executing the starvation-avoiding transaction, the system places starvation-avoiding load-marks on cache lines which are loaded from and places starvation-avoiding store-marks on cache lines which are stored to. Next, while swapping a page out of a memory and to a disk during the starvation-avoiding transaction, the system determines if one or more cache lines in the page have a starvation-avoiding load-mark or a starvation-avoiding store-mark. If so, upon swapping the page into the memory from the disk, the system places a starvation-avoiding load-mark on each cache line that had a starvation-avoiding load-mark and places a starvation-avoiding store-mark on each cache line that had a starvation-avoiding store-mark.
摘要:
Embodiments of the present invention implement virtual transactional memory using cache line marking. The system starts by executing a starvation-avoiding transaction for a thread. While executing the starvation-avoiding transaction, the system places starvation-avoiding load-marks on cache lines which are loaded from and places starvation-avoiding store-marks on cache lines which are stored to. Next, while swapping a page out of a memory and to a disk during the starvation-avoiding transaction, the system determines if one or more cache lines in the page have a starvation-avoiding load-mark or a starvation-avoiding store-mark. If so, upon swapping the page into the memory from the disk, the system places a starvation-avoiding load-mark on each cache line that had a starvation-avoiding load-mark and places a starvation-avoiding store-mark on each cache line that had a starvation-avoiding store-mark.
摘要:
Embodiments of the present invention provide a system that dynamically reconfigures memory. During operation, the system determines that a virtual memory page is to be reconfigured from an original virtual-address-to-physical-address mapping to a new virtual-address-to-physical-address mapping. The system then determines a new real address mapping for a set of virtual addresses in the virtual memory page by selecting a range of real addresses for the virtual addresses that are arranged according to the new virtual-address-to-physical-address mapping. Next, the system temporarily disables accesses to the virtual memory page. Then, the system copies data from real address locations indicated by the original virtual-address-to-physical-address mapping to real address locations indicated by the new virtual-address-to-physical-address mapping. Next, the system updates the real-address-to-physical-address mapping for the page, and re-enables accesses to the virtual memory page.
摘要:
A system may include a plurality of nodes. Each node may include an active device and a memory subsystem coupled to the active device. An active device in one of the nodes is configured to generate a global address that identifies a coherency unit and associated translation information identifying a translation function to be performed on the global address. A memory subsystem included in the node is configured to perform the translation function identified by the translation information on the global address to generate a physical address of the coherency unit within the memory subsystem. An additional memory subsystem included in an additional one of the nodes is configured to store the translation information identifying the translation function used in the node. In response to a request for access to the coherency unit, the additional memory subsystem is configured to send the translation information to the node.