Abstract:
A network application executing on a host system provides, through a queue in host memory, a list of application buffers to a network services processor coupled to the host system. The application buffers are used for storing data transferred on a socket established between the network application and a remote network application executing in a remote host system. Data received by the network services processor over the network is transferred between the network services processor and the application buffers. After the transfer, a completion notification is written to one of two control queues in the host system. The completion notification includes the size of the data transferred and an identifier associated with the socket. The identifier identifies a thread associated with the transferred data and the location of the data in the host system.
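A minimal C sketch of what the posted buffer descriptors and the completion notification might look like; every struct name, field name, and field width here is a hypothetical illustration, not the patent's actual layout.

```c
#include <stdint.h>

/* Hypothetical descriptor for one application buffer posted, via the
 * host-memory queue, to the network services processor. */
struct app_buffer_desc {
    uint64_t host_addr;   /* address of the buffer in host memory */
    uint32_t length;      /* buffer size in bytes */
    uint32_t reserved;
};

/* Hypothetical completion notification written to one of the host's
 * control queues after a transfer into or out of a buffer finishes. */
struct completion_notification {
    uint32_t bytes_transferred; /* size of the data transferred */
    uint32_t socket_id;         /* identifier associated with the socket;
                                 * identifies the owning thread and the
                                 * location of the data in the host */
};
```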
Abstract:
In one embodiment, a system comprises a memory and a memory controller that provides a cache access path to the memory and a bypass-cache access path to the memory, receives requests to read graph data from the memory on the bypass-cache access path, and receives requests to read non-graph data from the memory on the cache access path. A method comprises receiving a request at a memory controller to read graph data from a memory on a bypass-cache access path, receiving a request at the memory controller to read non-graph data from the memory on a cache access path, and arbitrating, in the memory controller, among the requests.
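The abstract does not specify the arbitration policy, so the sketch below assumes simple round-robin between the two paths; the function name and interface are illustrative only.

```c
#include <stdbool.h>

/* Returns which path to service next: 0 for the cache access path,
 * 1 for the bypass-cache access path, -1 if neither has a request.
 * *last remembers the most recently granted path. */
static int arbitrate(bool cache_ready, bool bypass_ready, int *last)
{
    if (cache_ready && bypass_ready) {
        *last = 1 - *last;      /* alternate when both paths have work */
        return *last;
    }
    if (bypass_ready) { *last = 1; return 1; }
    if (cache_ready)  { *last = 0; return 0; }
    return -1;
}
```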
Abstract:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock that is generated by the memory controller and routed out to the memory modules and back. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inter-channel skew are eliminated.
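A software model of one channel's read buffer may make the pointer scheme concrete. This is a sketch under stated assumptions: the buffer depth, the names, and the trailing-pointer initialization are illustrative, and real hardware would advance each pointer in its own clock domain rather than via function calls.

```c
#include <stdint.h>

#define RB_DEPTH 8   /* illustrative read-buffer depth */

struct chan_buf {
    uint64_t data[RB_DEPTH];
    unsigned load;     /* advanced by the clock echoed back from the modules */
    unsigned unload;   /* advanced by the system's own clock */
};

/* On each edge of the returned clock: capture incoming read data. */
static void load_beat(struct chan_buf *c, uint64_t d)
{
    c->data[c->load++ % RB_DEPTH] = d;
}

/* On each edge of the local clock: drain one entry.  If, at
 * initialization, the unload pointer was set to trail the load pointer
 * by more than the worst-case inter-channel skew, every entry it reads
 * is already valid on every channel, so the skew is hidden. */
static uint64_t unload_beat(struct chan_buf *c)
{
    return c->data[c->unload++ % RB_DEPTH];
}
```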
Abstract:
A method and architecture for improved system resource management and allocation for the processing of request and response messages in a computer system. The resource management scheme provides for dynamically sharing system resources, such as data buffers, between request and response messages or transactions. In particular, instead of simply dedicating a portion of the system resources to requests and the remaining portion to responses, a minimum amount of resources is reserved for responses and a minimum amount for requests, while the remaining resources are dynamically shared between both types of messages. The method and architecture of the present invention allow for more efficient use of system resources while avoiding deadlock conditions and ensuring a minimum service rate for requests.
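A minimal C sketch of such an allocation policy, assuming illustrative pool sizes; the counts and names are not from the patent. Each message type can always draw on its own reserve, and anything beyond the two reserves is shared first-come, first-served.

```c
#include <stdbool.h>

#define TOTAL_BUFS 64
#define MIN_REQ    8    /* reserve guaranteeing request service */
#define MIN_RSP    8    /* reserve guaranteeing response service */
#define SHARED     (TOTAL_BUFS - MIN_REQ - MIN_RSP)

struct buf_pool {
    int req_used;
    int rsp_used;
};

/* Buffers held beyond each type's reserve count against the shared pool. */
static int shared_used(const struct buf_pool *p)
{
    int r = p->req_used > MIN_REQ ? p->req_used - MIN_REQ : 0;
    int s = p->rsp_used > MIN_RSP ? p->rsp_used - MIN_RSP : 0;
    return r + s;
}

/* Grant a buffer if the caller is within its reserve or the shared
 * pool has room; responses can therefore never be starved by requests
 * (avoiding deadlock), and requests keep a minimum service rate. */
static bool alloc_buf(struct buf_pool *p, bool is_response)
{
    int used    = is_response ? p->rsp_used : p->req_used;
    int reserve = is_response ? MIN_RSP     : MIN_REQ;

    if (used < reserve || shared_used(p) < SHARED) {
        if (is_response) p->rsp_used++; else p->req_used++;
        return true;
    }
    return false;
}
```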
Abstract:
In one embodiment, a system includes memory ports distributed into subsets identified by a subset index, where each memory port has an individual wait time based on its respective workload. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address that is associated with a replication factor and refers to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware-based memory address referring to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests for the hardware-based address to the memory port, within the subset indicated by the corresponding subset index, that has the lowest individual wait time.
Abstract:
In one embodiment, a system comprises multiple memory ports distributed into multiple subsets, each subset identified by a subset index and each memory port having an individual wait time. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address that is associated with a replication factor and refers to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware-based memory address that refers to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests for the hardware-based address to the memory port, within the subset indicated by the corresponding subset index, that has the lowest individual wait time.
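The following C sketch models the replication controller's port choice. The subset count, replication factor, hash function, and use of an outstanding-request counter as the wait-time estimate are all assumptions for illustration; the abstracts do not specify them.

```c
#include <stdint.h>

#define NUM_SUBSETS      4   /* assumed number of subsets */
#define PORTS_PER_SUBSET 2   /* assumed replicas per subset */

/* Outstanding reads per port, used as the wait-time estimate. */
static unsigned wait_time[NUM_SUBSETS][PORTS_PER_SUBSET];

/* Stand-in for the address hashing unit: derive a subset index from
 * the virtual address (the real hash is not given in the abstract). */
static unsigned subset_of(uint64_t vaddr)
{
    return (unsigned)(vaddr >> 6) % NUM_SUBSETS;
}

/* Direct the read to the least-loaded replica within the subset. */
static unsigned pick_port(uint64_t vaddr)
{
    unsigned s = subset_of(vaddr), best = 0;
    for (unsigned p = 1; p < PORTS_PER_SUBSET; p++)
        if (wait_time[s][p] < wait_time[s][best])
            best = p;
    wait_time[s][best]++;       /* account for the newly queued read */
    return s * PORTS_PER_SUBSET + best;
}
```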
Abstract:
An efficient system and method for managing reads and writes on a bi-directional bus to optimize bus performance while avoiding bus contention and read/write starvation. In particular, by intelligently managing reads and writes on a bi-directional bus, bus latency can be reduced while still ensuring no bus contention or read/write starvation. This is accomplished by utilizing bus streaming control logic, separate queues for reads and writes, and a simple 2-to-1 mux.
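A minimal C sketch of one plausible streaming policy behind that mux: keep granting the current direction to avoid bus-turnaround dead cycles, but cap the streak so the other queue never starves. The streak cap and all names are assumptions, not the patent's actual control logic.

```c
#include <stdbool.h>

#define MAX_STREAK 8   /* assumed cap on consecutive same-direction grants */

struct bus_ctl {
    int  reads_pending;
    int  writes_pending;
    int  streak;        /* consecutive grants in the current direction */
    bool doing_reads;   /* current 2-to-1 mux selection */
};

/* Decide the mux setting for the next transaction.  Streaming
 * same-direction transfers minimizes turnaround overhead; switching
 * when the current side is idle, or when its streak hits the cap
 * while the other side waits, prevents read/write starvation. */
static bool select_reads(struct bus_ctl *c)
{
    int cur   = c->doing_reads ? c->reads_pending  : c->writes_pending;
    int other = c->doing_reads ? c->writes_pending : c->reads_pending;

    if ((cur == 0 && other > 0) ||
        (c->streak >= MAX_STREAK && other > 0)) {
        c->doing_reads = !c->doing_reads;   /* turn the bus around */
        c->streak = 0;
    }
    c->streak++;
    return c->doing_reads;
}
```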
Abstract:
A system and method are disclosed to track a large number of open pages in a computer memory system. The computer system contains one or more processors, each including a memory controller containing a page table organized into a plurality of rows, with each row able to store the address of an open memory page. A RIMM module containing RDRAM devices is coupled to each processor, each RDRAM device containing a plurality of memory banks. The page table increases system memory performance by tracking a large number of open memory pages. Associated with the page table is a bank active table that indicates which memory banks in each RDRAM device have open memory pages. An access that misses because its page-table row is already occupied by the address of a different open memory page is enqueued in a precharge queue. An access that misses because its page-table row contains no open memory page address is enqueued in a Row-address-select (“RAS”) queue. An access that hits an open memory page is enqueued in a Column-address-select (“CAS”) queue. An entry in the precharge queue is subsequently enqueued into the RAS queue, and a RAS queue entry, after completion, is enqueued into the CAS Read or CAS Write queue.
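The routing decision among the three queues can be summarized in a short C sketch. The table size, the row-mapping function, and all names are illustrative assumptions; only the hit/empty-row/conflict classification follows the abstract.

```c
#include <stdint.h>
#include <stdbool.h>

#define PT_ROWS 64   /* assumed page-table size */

/* One page-table row: the address of an open memory page, if any. */
struct pt_row {
    bool     open;
    uint64_t page_addr;
};

static struct pt_row page_table[PT_ROWS];

enum queue { Q_CAS, Q_RAS, Q_PRECHARGE };

/* Classify an access by the page-table row it maps to:
 *  - page hit (row holds this page)       -> CAS queue
 *  - row empty (no open page)             -> RAS queue
 *  - conflict (row holds another page)    -> precharge queue; such an
 *    entry later moves to the RAS queue and then to a CAS queue. */
static enum queue route_access(uint64_t page_addr)
{
    struct pt_row *row = &page_table[page_addr % PT_ROWS];

    if (row->open && row->page_addr == page_addr)
        return Q_CAS;
    if (!row->open)
        return Q_RAS;
    return Q_PRECHARGE;
}
```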