Abstract:
A method and apparatus for reducing snoop stalls on an external bus. One method of the present invention comprises retrieving an address and a transaction attribute for a bus transaction during the first of a plurality of request phase packets of the bus transaction. It is then determined whether the bus transaction is a snoopable memory transaction. If it is, a snoop probe is dispatched during the first request phase packet of the transaction. Snooping devices are thereby allowed additional bus clocks to respond to the snoop probe, reducing the number of snoop stalls that must be inserted during the bus transaction.
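In software terms, the early-dispatch decision reduces to a check performed on the first request-phase packet. The C sketch below illustrates that rule under stated assumptions: the struct bus_request_packet layout, the TXN_SNOOPABLE_MEM attribute flag, and the dispatch_snoop_probe hook are hypothetical names chosen for illustration, not terms from the patent.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative transaction-attribute flag (hypothetical encoding). */
#define TXN_SNOOPABLE_MEM 0x1u

struct bus_request_packet {
    uint64_t address;    /* address captured in the first request-phase packet */
    uint32_t attributes; /* transaction attributes carried alongside it */
};

/* Hypothetical hook that would start the snoop probe on the bus. */
static void dispatch_snoop_probe(uint64_t address) {
    printf("snoop probe dispatched for 0x%llx\n", (unsigned long long)address);
}

/* Evaluate the first request-phase packet: if the transaction is a
 * snoopable memory transaction, dispatch the probe immediately, giving
 * snooping agents extra bus clocks to respond before results are needed. */
static bool on_first_request_packet(const struct bus_request_packet *pkt) {
    if (pkt->attributes & TXN_SNOOPABLE_MEM) {
        dispatch_snoop_probe(pkt->address);
        return true; /* probe in flight early: fewer snoop stalls later */
    }
    return false; /* non-snoopable transaction: no probe needed */
}

int main(void) {
    struct bus_request_packet pkt = { 0x1000, TXN_SNOOPABLE_MEM };
    on_first_request_packet(&pkt);
    return 0;
}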
Abstract:
A method for charge sharing among data conductors of a bus. The bus has a first data conductor and a corresponding data conductor. The method includes detecting the logic levels on the first data conductor and the corresponding data conductor, and generating a charge sharing signal for sharing charge between the first data conductor and the corresponding data conductor.
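Although the mechanism itself is an analog circuit, the decision logic behind the charge-sharing signal can be modeled in a few lines. The C sketch below assumes a simple policy, not stated in the abstract, that sharing is asserted only when the two conductors sit at opposite logic levels, since equalizing them then moves both partway toward their next value instead of driving each full-rail from the supply.

#include <stdbool.h>
#include <stdio.h>

/* Toy model of the charge-sharing decision: each conductor is reduced
 * to a single logic level.  Asserting the returned signal would close
 * a pass gate shorting the two conductors so they equalize before
 * being actively driven. */
static bool charge_share_signal(bool first_level, bool corresponding_level) {
    /* Assumed policy: sharing only helps when the conductors sit at
     * opposite levels. */
    return first_level != corresponding_level;
}

int main(void) {
    printf("levels 1/0 -> share=%d\n", charge_share_signal(true, false));
    printf("levels 1/1 -> share=%d\n", charge_share_signal(true, true));
    return 0;
}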
Abstract:
The present invention discloses a method and apparatus for efficient utilization of write-combining buffers for a sequence of non-temporal stores to scattered locations. The method comprises: converting the sequence of non-temporal stores into stores to intermediate buffers; and grouping the stores to intermediate buffers into consecutive non-temporal stores. The consecutive non-temporal stores correspond to adjacent memory locations in the write-combining buffers.
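In software terms, the staging idea resembles gathering scattered writes into an ordinary intermediate buffer and then streaming that buffer out with consecutive non-temporal stores, so each write-combining buffer line fills completely before it is evicted. The C sketch below illustrates that pattern with SSE2 streaming stores; the 64-byte line size and the staging strategy are illustrative assumptions, and the sketch is a programmer-visible analogy rather than the patent's hardware mechanism.

#include <emmintrin.h> /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#include <stddef.h>

#define LINE_BYTES 64 /* assumed write-combining buffer line size */

/* Flush a cache-resident staging buffer to `dst` with consecutive
 * non-temporal stores, so the stores hit adjacent addresses and fill
 * whole WC lines.  `dst` must be 16-byte aligned and `nbytes` a
 * multiple of LINE_BYTES. */
static void stream_line(void *dst, const void *staging, size_t nbytes) {
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)staging;
    for (size_t i = 0; i < nbytes / sizeof(__m128i); i++) {
        __m128i v = _mm_loadu_si128(&s[i]);
        _mm_stream_si128(&d[i], v); /* consecutive non-temporal stores */
    }
    _mm_sfence(); /* make the streamed data globally visible */
}

int main(void) {
    _Alignas(16) char staging[LINE_BYTES];
    _Alignas(16) char dst[LINE_BYTES];
    /* Scattered writes land in the staging buffer first... */
    for (int i = 0; i < LINE_BYTES; i++)
        staging[i] = (char)i;
    /* ...then go out as one run of adjacent non-temporal stores. */
    stream_line(dst, staging, LINE_BYTES);
    return 0;
}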
Abstract:
A system and method for fencing memory accesses. Memory loads can be fenced, or all memory accesses can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. Access instructions newer than the fencing instruction are stalled, while the older access instructions are gradually retired. When all older memory accesses have retired, the fencing instruction is dispatched from the buffer.
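The abstract describes the microarchitectural handling inside the memory ordering unit; what a programmer sees is the corresponding fence instruction, such as x86 LFENCE for loads or MFENCE for all memory accesses. The C sketch below shows only that programmer-visible usage in an illustrative producer/consumer hand-off; the shared variables are hypothetical.

#include <emmintrin.h> /* SSE2: _mm_lfence, _mm_mfence */
#include <stdio.h>

/* Illustrative shared data for a producer/consumer hand-off. */
static volatile int payload;
static volatile int ready;

static void producer(void) {
    payload = 42;   /* older store */
    _mm_mfence();   /* all older accesses retire before newer ones dispatch */
    ready = 1;      /* newer store, made visible only after payload */
}

static int consumer(void) {
    while (!ready)
        ;           /* older load: spin until the flag is observed */
    _mm_lfence();   /* load fence: older loads complete before newer loads */
    return payload; /* newer load */
}

int main(void) {
    producer();
    printf("%d\n", consumer());
    return 0;
}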
Abstract:
The amount of data that may be transferred between a processing unit and a memory may be increased by transferring information during both the high and low phases of a clock. As one example, in a graphics processor using a general purpose register file as a memory and a mathematical box as a processing unit, the amount of data that can be transferred can be increased by transferring data during both the high and low phases of a clock.
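The bandwidth claim can be made concrete with a toy model that moves one word on each phase of the clock. In the C sketch below, the word width and the per-phase transfer loop are assumptions made purely for illustration.

#include <stdint.h>
#include <stdio.h>

#define WORDS 8

/* Toy model: one word moves on the high phase and one on the low
 * phase of each clock, so a block transfer needs half the clocks
 * a single-phase scheme would. */
static void transfer_ddr(const uint32_t *src, uint32_t *dst, int nwords) {
    int i = 0;
    for (int cycle = 0; i < nwords; cycle++) {
        dst[i] = src[i];     /* high phase of clock `cycle` */
        i++;
        if (i < nwords) {
            dst[i] = src[i]; /* low phase of the same clock */
            i++;
        }
    }
    printf("%d words in %d clocks\n", nwords, (nwords + 1) / 2);
}

int main(void) {
    uint32_t src[WORDS] = {1, 2, 3, 4, 5, 6, 7, 8}, dst[WORDS];
    transfer_ddr(src, dst, WORDS);
    return 0;
}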
Abstract:
In one embodiment, the present invention includes a cache memory, which may be a sequential cache, having multiple banks. Each of the banks includes a data array, a decoder coupled to the data array to select a set of the data array, and a sense amplifier. Only a bank to be accessed may be powered, and in some embodiments early way information may be used to maintain remaining banks in a power reduced state. In some embodiments, clock gating may be used to maintain various components of the cache memory in a power reduced state. Other embodiments are described and claimed.
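The power saving follows from decoding the bank index first and enabling only the addressed bank. The C sketch below models this with a powered flag standing in for clock gating; the bank count, set count, and address-to-bank interleaving are illustrative assumptions, and the early-way-information refinement is omitted.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NBANKS 4
#define SETS_PER_BANK 128
#define LINE_BYTES 64

/* One bank: its decoder and sense amplifier are treated as clock
 * gated whenever the powered flag is false. */
struct bank {
    bool powered;
    uint8_t data[SETS_PER_BANK][LINE_BYTES];
};

static struct bank banks[NBANKS];

/* Power up only the bank the address maps to; every other bank stays
 * in its reduced-power state. */
static uint8_t *access_line(uint64_t addr) {
    unsigned set = (unsigned)(addr / LINE_BYTES) % (NBANKS * SETS_PER_BANK);
    unsigned bank = set % NBANKS;       /* assumed bank interleaving */
    for (unsigned b = 0; b < NBANKS; b++)
        banks[b].powered = (b == bank); /* gate all non-accessed banks */
    return banks[bank].data[set / NBANKS];
}

int main(void) {
    (void)access_line(0x2040);
    for (unsigned b = 0; b < NBANKS; b++)
        printf("bank %u powered=%d\n", b, banks[b].powered);
    return 0;
}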
Abstract:
A method and apparatus for a trace end predictor for a trace cache is disclosed. In one embodiment, the trace end predictor may have one or more buffers to contain a head address for a subsequent trace. The head address may include the way number and set number of the next head, along with partial stew data to support additional execution predictors. The buffers may also include tag data of the current trace's tail address, and may additionally include control bits for determining whether to replace the buffer's contents with information from another trace's tail. Reading the next head address from the trace end predictor, as opposed to reading it from the trace cache array, may reduce certain execution time delays.
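One plausible way to picture a trace end predictor buffer entry is as a small record keyed by the current trace's tail tag. The C sketch below is such a layout under stated assumptions: the field widths, the single replace_ok control bit, and the predict_next_head helper are hypothetical, not values from the patent.

#include <stdbool.h>
#include <stdint.h>

/* One trace-end-predictor buffer entry: given the tail of the current
 * trace, it supplies the head address of the predicted next trace
 * without reading the trace cache array. */
struct trace_end_entry {
    uint32_t tail_tag;     /* tag of the current trace's tail address */
    uint8_t  next_way;     /* way number of the predicted next head */
    uint16_t next_set;     /* set number of the predicted next head */
    uint16_t partial_stew; /* partial stew feeding other execution predictors */
    bool     replace_ok;   /* control bit: may this entry be overwritten
                              with another trace's tail information? */
};

/* Look up the predicted next head; returns false on a tag mismatch,
 * in which case the slower trace-cache array read must be used. */
static bool predict_next_head(const struct trace_end_entry *e,
                              uint32_t tail_tag,
                              uint8_t *way, uint16_t *set) {
    if (e->tail_tag != tail_tag)
        return false;
    *way = e->next_way;
    *set = e->next_set;
    return true;
}

int main(void) {
    struct trace_end_entry e = { 0xABC, 2, 37, 0x5A, true };
    uint8_t way;
    uint16_t set;
    return predict_next_head(&e, 0xABC, &way, &set) ? 0 : 1;
}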