摘要:
A method and architecture for improved system resource management and allocation for the processing of request and response messages in a computer system. The resource management scheme provides for dynamically sharing system resources, such as data buffers, between request and response messages or transactions. In particular, instead of simply dedicating a portion of the system resources to requests and the remaining portion to responses, a minimum amount of resources are reserved for responses and a minimum amount for requests, while the remaining resources are dynamically shared between both types of messages. The method and architecture of the present invention allows for more efficient use of system resources, while avoiding deadlock conditions and ensuring a minimum service rate for requests.
摘要:
A system and method is disclosed to track a large number of open pages in a computer memory system. The computer system contains one or more processors each including a memory controller containing a page table, the page table organized into a plurality of rows with each row able to store an address of an open memory page. A RIMM module containing RDRAM devices is coupled to each processor, each RDRAM containing a plurality of memory banks. The page table increases system memory performance by tracking a large number of open memory pages. Associated with the page table is a bank active table that indicates the memory banks in each RDRAM device having open memory pages. The page table enqueues accesses to the RIMM module in a precharge queue resulting from a page miss caused by the address of an open memory page occupying the same row of the page table as the address of the system memory access resulting in the page miss. The page table also enqueues accesses to system memory in a Row-address-select (“RAS”) queue resulting from a page miss caused by a row of the page table not containing any open memory page address. The page table enqueues accesses to system memory resulting in page hits to open memory pages in a Column-address-select (“CAS”) queue. An entry in the precharge queue is then enqueued into the RAS queue. An entry in the RAS queue after completion is enqueued into the CAS Read or CAS Write queue.
摘要:
A computer system includes a memory controller interfacing the processor to a memory system. The memory controller supports a memory system with a plurality of memory devices, with multiple memory banks in each memory device. The memory controller supports simultaneous memory accesses to different memory banks. Memory bank conflicts are avoided by examining each transaction before it is loaded in the memory transaction queue. On a first clock cycle, the new pending memory request is transferred from a pending request queue to a memory mapper. On the subsequent clock cycle, the memory mapper formats the pending memory request into separate signals identifying the DEVICE, BANK, ROW and COLUMN to be accessed by the pending transaction. In the next clock cycle, the DEVICE and BANK signals are compared with every entry in the memory transaction queue to determine if a bank conflict exists. If so, the new memory request is rejected and recycled to the pending request queue.
摘要:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inner-channel skew is eliminated.
摘要:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inner-channel skew is eliminated.
摘要:
An efficient system and method for managing reads and writes on a bi-directional bus to optimize bus performance while avoiding bus contention and avoiding read/write starvation. In particular, by intelligently managing reads and writes on a bi-directional bus, bus latency can be reduced while still ensuring no bus contention or read/write starvation. This is accomplished by utilizing bus streaming control logic, separate queues for reads and writes, and a simple 2 to 1 mux.
摘要:
An efficient system and method for managing reads and writes on a bi-directional bus to optimize bus performance while avoiding bus contention and avoiding read/write starvation. In particular by intelligently managing reads and writes on a bi-directional bus, bus latency can be reduced while still ensuring no bus contention or read/write starvation. This is accomplished by utilizing bus streaming control logic, separate queues for reads and writes, and a simple 2 to 1 mux.
摘要:
A computer system contains a processor that includes a software programmable memory mapper. The memory mapper maps an address generated by the processor into a device address for accessing physical main memory. The processor also includes a cache controller that maps the processor address into a cache address. The cache address places a block of data from main memory into a memory cache using an index subfield. The physical main memory contains RDRAM devices, each of the RDRAM devices containing a number of memory banks that store rows and columns of data. The memory mapper maps processor addresses to device addresses to increases memory system performance. The mapping minimizes memory access conflicts between the memory banks. Conflicts between memory banks are reduced by placing a number of bits corresponding to the bank subfield above the most significant boundary bit of the index subfield. This diminishes page misses caused by replacement of data blocks from the cache memory because the read of the new data block and write of the victim data block are not to the same memory bank. Adjacent memory bank conflicts are reduced for sequential accesses to memory banks by reversing the bit order of a bank number subfield within the bank subfield of the device address.
摘要:
A network transport layer accelerator accelerates processing of packets so that packets can be forwarded at wire-speed. To accelerate processing of packets, the accelerator performs pre-processing on a network transport layer header encapsulated in a packet for a connection and performs in-line network transport layer checksum insertion prior to transmitting a packet. A timer unit in the accelerator schedules processing of the received packets. The accelerator also includes a free pool allocator which manages buffers for storing the received packets and a packet order unit which synchronizes processing of received packets for a same connection.
摘要:
In one embodiment, a system comprises a memory and a memory controller that provides a cache access path to the memory and a bypass-cache access path to the memory, receives requests to read graph data from the memory on the bypass-cache access path and receives requests to read non-graph data from the memory on the cache access path. A method comprises receiving a request at a memory controller to read graph data from a memory on a bypass-cache access path, receiving a request at the memory controller to read non-graph data from the memory through a cache access path, and arbitrating, in the memory controller, among the requests using arbitration.