Abstract:
A method and apparatus for computing packed absolute differences. According to one such method and apparatus, a third packed data having a third plurality of elements and a plurality of sign bits is produced, each of the third plurality of elements and the plurality of sign bits being computed by subtracting one of a first plurality of elements of a first packed data from a corresponding one of a second plurality of elements of a second packed data. The third plurality of elements and the plurality of sign bits are stored. A fourth packed data having a fourth plurality of elements is produced, each of the fourth plurality of elements being computed by subtracting one of the third plurality of elements from the corresponding one of an at least one element, if the corresponding one of the plurality of sign bits is in a first state; and by adding one of the third plurality of elements to the corresponding one of the at least one element, if the corresponding one of the plurality of sign bits is in a second state.
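The two-step computation described above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: elements are treated as unsigned 16-bit values, the sign bit is modeled as a borrow flag, the "at least one element" is assumed to be zero, and the function name `packed_abs_diff` is hypothetical.

```python
def packed_abs_diff(first, second, bits=16):
    """Sketch of packed absolute differences in two steps.

    Assumptions (not stated in the abstract): unsigned elements of
    width `bits`, a zero constant as the "at least one element", and
    sign bit == 1 meaning the subtraction went negative.
    """
    mask = (1 << bits) - 1

    # Step 1: subtract each first element from the corresponding second
    # element, keeping the wrapped difference and a sign bit.
    diffs, signs = [], []
    for a, b in zip(first, second):
        d = b - a
        signs.append(1 if d < 0 else 0)
        diffs.append(d & mask)

    # Step 2: per element, subtract from zero if the sign bit is set,
    # otherwise add to zero. Either way the result is |b - a|.
    result = []
    for d, s in zip(diffs, signs):
        if s:
            result.append((0 - d) & mask)
        else:
            result.append((0 + d) & mask)
    return result


abs_diffs = packed_abs_diff([10, 3, 200], [4, 9, 200])
```

With the inputs shown, the per-element differences are -6, 6, and 0, so `abs_diffs` holds the magnitudes 6, 6, and 0.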
Abstract:
A method and apparatus are provided for executing scalar packed data instructions. According to one aspect of the invention, a processor includes a plurality of registers, a register renaming unit coupled to the plurality of registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands, each of which includes a plurality of data elements. The decoder is configured to decode a first and a second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set of instructions specifies operations to be performed on all of the data elements stored in the one or more specified registers. In contrast, each of the instructions in the second set of instructions specifies operations to be performed on only a subset of the data elements stored in the one or more specified registers. The partial-width execution unit is configured to execute operations specified by either the first or the second set of instructions.
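The contrast between the two instruction sets can be illustrated with a small sketch. It assumes, purely for illustration, that the "subset" is the lowest element only and that the remaining elements pass through from the first operand; the abstract does not fix either choice, and the names `packed_add` and `scalar_add` are hypothetical.

```python
def packed_add(a, b):
    """First instruction set: operate on every data element."""
    return [x + y for x, y in zip(a, b)]


def scalar_add(a, b):
    """Second instruction set: operate on a subset of the elements.

    Assumption: the subset is element 0 only, and the other elements
    are copied unchanged from the first operand.
    """
    return [a[0] + b[0]] + list(a[1:])


full = packed_add([1, 2, 3, 4], [10, 20, 30, 40])
partial = scalar_add([1, 2, 3, 4], [10, 20, 30, 40])
```

Here `full` updates all four elements, while `partial` changes only the lowest one.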
Abstract:
A unified architecture for dynamic generation, execution, synchronization, and parallelization of complex instruction formats includes a virtual register file, a register cache, and a register file hierarchy. A self-generating and synchronizing dynamic and static threading architecture provides efficient context switching.
Abstract:
A technique includes, in response to a training mode, communicating between a device and a processor of a computer system over a data bit line of a bus. The technique also includes, based on the communication, regulating a timing between a strobe signal and a signal that propagates over the data bit line.
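One way such training could work is sketched below. The abstract does not say how the timing is chosen, so the sweep-and-center approach here is an assumption, as are the names `train_strobe_delay` and `link_ok` and the idea that the passing delay settings form one contiguous window.

```python
def train_strobe_delay(link_ok, delays):
    """Pick a strobe-to-data delay from a training sweep.

    `link_ok(delay)` is a hypothetical callback that runs a test
    transfer over the data bit line at the given delay setting and
    returns True if the data was received correctly.
    Assumption: passing settings form a single contiguous window,
    and the middle of that window is the safest choice.
    """
    passing = [d for d in delays if link_ok(d)]
    if not passing:
        raise RuntimeError("no delay setting passed training")
    return passing[len(passing) // 2]


# Simulated link that only works for delay settings 3 through 7.
chosen = train_strobe_delay(lambda d: 3 <= d <= 7, range(12))
```

In this simulated sweep the passing window is settings 3 to 7, so the sketch selects the middle setting, 5.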
Abstract:
A request hint is issued prior to or while identifying whether requested data and/or one or more instructions are in a first memory. A second memory is accessed to fetch data and/or one or more instructions in response to the request hint. The data and/or instruction(s) accessed from the second memory are stored in a buffer. If the requested data and/or instruction(s) are not in the first memory, the data and/or instruction(s) are returned from the buffer.
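The flow above can be modeled sequentially as in the sketch below. Real hardware would overlap the hint with the first-memory lookup; this model only shows the data movement. The class name `HintedMemory`, the dictionary-based memories, and the choice to discard the buffered copy on a hit are all assumptions made for illustration.

```python
class HintedMemory:
    """Toy model of a request hint with a second-memory buffer."""

    def __init__(self, first_memory, second_memory):
        self.first = first_memory      # e.g. a cache (dict: address -> value)
        self.second = second_memory    # e.g. main memory (dict: address -> value)
        self.buffer = {}               # holds data fetched because of the hint

    def read(self, address):
        # Request hint: start fetching from the second memory before the
        # first-memory lookup has resolved, and park the result in the buffer.
        self.buffer[address] = self.second[address]

        if address in self.first:
            # Hit in the first memory. Assumption: the buffered copy is
            # simply dropped, since the abstract does not cover this case.
            self.buffer.pop(address, None)
            return self.first[address]

        # Miss in the first memory: return the data from the buffer.
        return self.buffer.pop(address)


mem = HintedMemory({0x10: "cached"}, {0x10: "cached", 0x20: "from second memory"})
hit_value = mem.read(0x10)
miss_value = mem.read(0x20)
```

The benefit the abstract points at is latency: on a miss, the second-memory access has already been started by the hint instead of waiting for the miss to be detected.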
Abstract:
A prefetcher prefetches data for an instruction based on the distance between cache misses caused by the instruction. In an embodiment, the prefetcher includes a memory to store a prefetch table that contains one or more entries that include the distance between cache misses caused by an instruction. In a further embodiment, the addresses of the data elements to be prefetched are determined based on the distance between cache misses recorded in the prefetch table for the instruction.
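A minimal sketch of such a prefetch table follows. It assumes the table is keyed by the instruction's address, that each entry keeps the last miss address alongside the miss distance, and that a single prefetch is issued one distance ahead of the current miss. None of these details come from the abstract, and the class name `MissDistancePrefetcher` is hypothetical.

```python
class MissDistancePrefetcher:
    """Toy prefetch table keyed by instruction address."""

    def __init__(self):
        # instruction address -> (last miss address, last miss distance)
        self.table = {}

    def on_miss(self, instruction_addr, miss_addr):
        """Record a cache miss and return an address to prefetch, or None."""
        entry = self.table.get(instruction_addr)
        if entry is None:
            # First miss for this instruction: no distance is known yet.
            self.table[instruction_addr] = (miss_addr, None)
            return None

        last_miss_addr, _ = entry
        distance = miss_addr - last_miss_addr
        self.table[instruction_addr] = (miss_addr, distance)

        # Prefetch one miss distance beyond the current miss.
        return miss_addr + distance


prefetcher = MissDistancePrefetcher()
first = prefetcher.on_miss(0x400, 1000)
second = prefetcher.on_miss(0x400, 1064)
third = prefetcher.on_miss(0x400, 1128)
```

For an instruction that misses at addresses 1000, 1064, and 1128, the recorded distance is 64, so the second and third misses produce prefetch addresses 1128 and 1192.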