摘要:
A method and apparatus for including in a computer system, instructions for performing cache memory invalidate and cache memory flush operations. In one embodiment, the computer system comprises a cache memory having a plurality of cache lines each of which stores data, and a storage area to store a data operand. An execution unit is coupled to the storage area, and operates on data elements in the data operand to invalidate data in a predetermined portion of the plurality of cache lines in response to receiving a single instruction.
摘要:
The present invention discloses a method and apparatus method for efficient utilization of write-combining buffers for a sequence of non-temporal stores to scattered locations. The method comprises: converting the sequence of non-temporal stores to stores to intermediate buffers; and grouping the stores to intermediate buffers into consecutive non-temporal stores. The consecutive non-temporal stores correspond to adjacent memory locations in the write-combining buffers.
摘要:
The present invention discloses a method and apparatus for processing strips of data, each strip referencing a plurality of parameter sets stored in a memory. The method comprises: prefetching a plurality of parameter sets referenced in a first strip; performing an operation on each of the prefetched parameter sets; and concatenating a first strip and a second strip to eliminate a memory access latency in the second strip.
摘要:
A method and apparatus for parallel processing of graphics data are described. A number of color components are stored in a floating point format in at least one register of a set of 128-bit registers in a packed format. The color components in the floating point format are converted to numbers in an integer format. The numbers in the integer format are placed in at least one register of a set of 64-bit registers in the packed format. Color components are assembled for image pixels from the numbers in the integer format.
摘要:
The present invention is directed to a method and apparatus for processing normalized meshes. The normalized meshes are formed by N polygons which have M vertices. M vertex coordinates are stored in a vertex array corresponding to the M vertices of the N polygons. N polygon indices are stored in an index array. Each of the N polygon indices references a predetermined number of the M vertex coordinates. A first subset of the index array having N1 polygon indices is determined. A second subset of the vertex array is selected such that the second subset contains M1 vertex coordinates corresponding entirely to the N1 polygon indices in the first subset. The second subset defines a window having a small size relative to the vertex array. The M1 vertex coordinates in the second subset are processed to generate processed data. The processed data are then concurrently sent to a graphics processor in an on-line manner.
摘要:
A method and instruction for converting a number between a floating point format and an integer format are described. Numbers are stored in the integer format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the integer format is converted to at least one number in the floating point format. The numbers in the floating point format are placed in a register of a second set of architectural registers in a packed format.
摘要:
A method and instruction for converting a number between a floating point format and an integer format are described. Numbers are stored in the integer format in a register of a first set of architectural registers in a scalar format. At least one of the numbers in the scalar format is converted to a number in the floating point format. The number in the floating point format is placed in a register of a second set of architectural registers in a packed format.
摘要:
A method and instruction for converting a number from a floating point format to an integer format are described. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the floating point format is converted to at least one 16-bit number in the integer format. The 16-bit number in the integer format is placed in a register of a second set of architectural registers in the packed format.
摘要:
A method and instruction for converting a number between a floating point format and an integer format are described. Numbers are stored in the integer format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the integer format is converted to at least one number in the floating point format. The numbers in the floating point format are placed in a register of a second set of architectural registers in a packed format.
摘要:
A method and instruction for converting a number from a floating point format to an integer format are described. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the floating point format is converted to at least one 8-bit number in the integer format. The 8-bit number in the integer format is placed in a register of a second set of architectural registers in the packed format.