摘要:
Examples may include techniques to enable cache coherency of objects in a distributed shared memory (DSM) system, even where multiple nodes in the system manage the objects. Node memory space includes a tracking address space (TAS) where lines in the TAS correspond to objects in the (DSM). Access to the objects and the TAS is managed by a host fabric interface (HFI) caching agent in HFI of a node.
摘要:
Loading data from a computer memory system is disclosed. A memory system is provided, wherein some or all data stored in the memory system is organized as one or more pointer-linked data structures. One or more iterator registers are provided. A first pointer chain is loaded, having two or more pointers leading to a first element of a selected pointer-linked data structure to a selected iterator register. A second pointer chain is loaded, having two or more pointers leading to a second element of the selected pointer-linked data structure to the selected iterator register. The loading of the second pointer chain reuses portions of the first pointer chain that are common with the second pointer chain.Modifying data stored in a computer memory system is disclosed. A memory system is provided. One or more iterator registers are provided, wherein the iterator registers each include two or more pointer fields for storing two or more pointers that form a pointer chain leading to a data element. A local state associated with a selected iterator register is generated by performing one or more register operations relating to the selected iterator register and involving pointers in the pointer fields of the selected iterator register. A pointer-linked data structure is updated in the memory system according to the local state.
摘要:
A method for building a multi-processor system with nodes having multiple cache coherency domains. In this system, a directory built in a node controller needs to include processor domain attribute information, and the information can be acquired by configuring cache coherency domain attributes of ports of the node controller connected to processors. In the disclosure herein, the node controller can support the multiple physical cache coherency domains in a node.
摘要:
The present application is directed to a control circuit that provides a directory configured to maintain a plurality of entries, wherein each entry can indicate sharing of resources, such as cache lines, by a plurality of agents/hosts. Control circuit of the present invention can further provide consolidation of one or more entries having a first format to a single entry having a second format when resources corresponding to the one or more entries are shared by the agents. First format can include an address and a pointer representing one of the agents, and the second format can include a sharing vector indicative of more than one of the agents. In another aspect, the second format can utilize, incorporate, and/or represent multiple entries that may be indicative of one or more resources based on a position in the directory.
摘要:
A device for and method of storing page table entries in a first cache. A first page table entry is received having a fragment field that contains address information for a requested first page and at least a second page logically adjacent to the first page. A second page table entry is generated from the first page table entry to be stored with the first page table entry. The second page table entry provides address information for the second page. The second page table entry has a configuration that is compatible with the first cache.
摘要:
The invention has application in implementation of large Symmetric Multiprocessor Systems with a large number of nodes which include processing elements and associated cache memories. The illustrated embodiment of the invention provides for interconnection of a large number of multiprocessor nodes while reducing over the prior art the size of directories for tracking of memory coherency throughout the system. The embodiment incorporates within the memory controller of each node, directory information relating to the current locations of memory blocks which allows for elimination at a higher level in the node controllers of a larger volume of directory information relating to the location of memory blocks. This arrangement thus allows for more efficient implementation of very large multiprocessor computer systems.
摘要:
A method for maintaining data coherency in a shared-memory computer system having a plurality of nodes divides the local memory of a given node into one or more blocks and stores a data record for each block indicating a plurality of node groups and a selection of the node groups. Each selected node group represents a number of nodes, and selected node groups represent at least one node that has requested access to the block. In response to receiving an access request from a requesting node that may or may not be in a selected node group, the method and system update the data record to indicate the correct selection. If the requesting node is not in any node group, the data record is adjusted to have new node groups, one of which represents the requesting node.
摘要:
A method and directory system that recognizes and represents the subset of sharing patterns present in an application is provided. As used herein, the term sharing pattern refers to a group of processors accessing a single memory location in an application. The sharing pattern is decoupled from each cache line and held in a separate directory table. The sharing pattern of a cache block is the bit vector representing the processors that share the block. Multiple cache lines that have the same sharing pattern point to a common entry in the directory table. In addition, when the table capacity is exceeded, patterns that are similar to each other are dynamically collated into a single entry.
摘要:
In a cache coherent data processing system including at least first and second coherency domains, a memory block is stored in a system memory in association with a domain indicator indicating whether or not the memory block is cached, if at all, only within the first coherency domain. A master in the first coherency domain determines whether or not a scope of broadcast transmission of an operation should extend beyond the first coherency domain by reference to the domain indicator stored in the cache and then performs a broadcast of the operation within the cache coherent data processing system in accordance with the determination.
摘要:
In one embodiment, a first node comprises at least one memory request source and a node controller coupled to the memory request source. The node controller comprises a remote hit predictor configured to predict a second node to have a coherent copy of a block addressed by a memory request generated by the memory request source, and the node controller is configured to issued a speculative probe to the second node responsive to the prediction and to the memory request.