摘要:
Embodiments of the present invention relate to accessing a first pair of adjacent data blocks using a first channel of a dual channel memory device; and simultaneously accessing a second pair of adjacent data blocks using a second channel of the memory device, the second pair being spaced apart from the first pair by a predetermined interval.
摘要:
A programmable multi-cycle signaling scheme provides synchronous communications over relatively large distances. An input digital data stream is de-multiplexed onto multiple conductors. The digital data stream is recreated at the far end of the conductors.
摘要:
According to one embodiment, computer system is disclosed. The computer system includes a central processing unit (CPU), a bus coupled to the CPU and a chipset coupled to the bus. The chipset includes compensation circuitry to compensate for process, voltage and temperature (PVT) effects attributed to a voltage change on the bus.
摘要:
A method and apparatus for open loop buffer allocation. In one embodiment, the method includes loading requested data within a buffer according to a load rate. Concurrent with the loading of data within the buffer, the data is forwarded from the buffer according to drain rate. In situations where the load rate exceeds the drain rate, read requests may be throttled according to an approximate buffer capacity level to prohibit buffer overflow. In one embodiment, a rate for issuing data requests, for example, to memory, is regulated according to a predetermined buffer accumulation rate. Accordingly, in one embodiment, the open loop allocation scheme reduces latency while enabling sustained read streaming with a minimal size read buffer. Other embodiments are described and claimed.
摘要:
An apparatus and a method for adjusting signal timing in a memory interface have been disclosed. One embodiment of the apparatus includes a number of slave delay lock loops (DLLs) in a memory interface to adjust timing between a number of signals to compensate for timing skew, and a number of input/output (I/O) buffers to output the adjusted signals to one or more memory devices coupled to the memory interface. Other embodiments are described and claimed.
摘要:
A method and apparatus for dedicating cache entries to certain streams for performance optimization are disclosed. The method according to the present techniques comprises partitioning a cache array into one or more special-purpose entries and one or more general-purpose entries, wherein special-purpose entries are only allocated for one or more streams having a particular stream ID.
摘要:
A memory controller is coupled to a memory device via a memory channel. The memory controller includes a command-per-clock detection unit that compares a portion of a current address with a portion of a previous address. If there is a match, then the memory controller can continue to assert a chip select line coupled to the memory device. The command-per-clock detection unit checks to see whether only certain low-order bits of the address lines are toggling between the current and previous addresses. Additional copies of address lines for particular low-order bits are provided to the memory device to reduce loading on the low order bit address lines, allowing the low order bit address lines to toggle quickly in order to avoid the necessity of inserting a one clock period wait state. If the command-per-clock detection unit does not find a match (meaning that more than the low order address bits are toggling) then the wait state is inserted by deasserting the chip select line for a clock period.
摘要:
A combination of techniques to prevent deadlocks and livelocks in a computer system having a dispatcher and multiple downstream command queues. In one embodiment, a broadcast transaction that requires simultaneously available space in all the affected downstream command queues becomes a delayed transaction, so that the command queues are reserved and other transactions are retried until the broadcast transaction is completed. In another embodiment, a bail-out timer is used to defer a transaction if the transaction does not complete within a predetermined time. In yet another embodiment, a locked transaction that potentially addresses memory space controlled by a programmable attribute map is handled as a delayed transaction if there is less than a predetermined amount of downstream buffer space available for the transaction.
摘要:
A write cache that reduces the number of memory accesses required to write data to main memory. When a memory write request is executed, the request not only updates the relevant location in cache memory, but the request is also directed to updating the corresponding location in main memory. A separate write cache is dedicated to temporarily holding multiple write requests so that they can be organized for more efficient transmission to memory in burst transfers. In one embodiment, all writes within a predefined range of addresses can be written to memory as a group. In another embodiment, entries are held in the write cache until a minimum number of entries are available for writing to memory, and a least-recently-used mechanism can be used to decide which entries to transmit first. In yet another embodiment, partial writes are merged into a single cache line, to be written to memory in a single burst transmission.
摘要:
A technique to reduce accumulated latencies in bus transmission time when a bus inversion scheme is employed. The bus inversion scheme inverts all the data bits whenever more than one-half of the data bits are active, so that the bus never has more that one-half of the bits active during a data transfer. This minimizes the number of driver circuits that are actively driving the bus at any given time. Since it takes a certain amount to time to determine if more than one-half of the bits are active, this process can add to overall latency, or data transfer time on the bus. By placing the bus inversion function in parallel with another function that also contributes to bus latency, such as error correction code (ECC) calculation, only the more time-consuming of the two functions will increase bus latency.