    Instruction methods for performing data formatting while moving data between memory and a vector register file
    1.
    Granted invention patent
    Status: Expired

    Publication number: US5812147A

    Publication date: 1998-09-22

    Application number: US716972

    Filing date: 1996-09-20

    IPC classification: G06F9/312 G06F9/315 G06F13/00

    Abstract: Instruction methods for moving data between memory and a vector register file while performing data formatting. The methods are processed by a processor having a vector register file and a memory unit. The methods are useful in the graphics art because they allow more efficient movement and processing of raster formatted graphics data. The vector register file has a number of vector registers (e.g., 32) that each contain multi-bits of storage (e.g., 128 bits). In one class of instructions, eight byte locations within memory are simultaneously loaded into eight separate 16 bit locations within a register of the register file. The data can be integer or fraction and signed or unsigned. The data can also be stored from the register file back to memory. In a second class of instructions, alternate locations of a memory quadword are selected and simultaneously loaded in the register file. In a third class, data is obtained across a word boundary by a first instruction that obtains a first part and a second instruction that obtains the remainder part crossing the boundary. In a last class of instruction transfers, a block (e.g., 8 16-bit × 8 16-bit) of data is loaded from memory, stored in the register file and stored back into memory causing a transposition of the data block (16 cycles). A block (e.g., 8 16-bit × 8 16-bit) of data is stored from the register file to memory, and loaded back into the register file causing a transposition of the data block (16 cycles).
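
The first instruction class and the block transposition described in the abstract can be illustrated with a minimal Python sketch. It is not the patented instruction set: the function names, the zero- or sign-extension of each byte into its 16-bit lane, and the use of the low byte on store are all assumptions made for illustration (the abstract does not say where a byte sits inside its lane, and the fraction formats are not modeled).

```python
def load_bytes_to_lanes(memory, addr, signed=False):
    """Widen eight consecutive bytes into eight 16-bit lanes (assumed integer format)."""
    lanes = []
    for byte in memory[addr:addr + 8]:
        if signed and byte >= 0x80:
            byte -= 0x100              # reinterpret the byte as a negative value
        lanes.append(byte & 0xFFFF)    # keep the 16-bit two's-complement pattern
    return lanes


def store_lanes_to_bytes(lanes):
    """Narrow eight 16-bit lanes back to eight bytes (assumed: low byte of each lane)."""
    return bytes(lane & 0xFF for lane in lanes)


def transpose_block(rows):
    """Net effect of the block transfer class: an 8 x 8 transposition."""
    return [list(column) for column in zip(*rows)]


memory = bytes([0x01, 0x7F, 0x80, 0xFF, 0x10, 0x20, 0x30, 0x40])
unsigned_lanes = load_bytes_to_lanes(memory, 0)
signed_lanes = load_bytes_to_lanes(memory, 0, signed=True)
block = [[row * 8 + col for col in range(8)] for row in range(8)]
transposed = transpose_block(block)
```

In the sketch the widening happens lane by lane in a loop; the abstract describes the hardware doing all eight lanes simultaneously.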

    Circuit to separate and combine color space component data of a video image
    2.
    Granted invention patent
    Status: Expired

    Publication number: US5835729A

    Publication date: 1998-11-10

    Application number: US713600

    Filing date: 1996-09-13

    IPC classification: H04N9/78 G06F17/00

    CPC classification: H04N9/78

    Abstract: A method and arrangement for separating interleaved luminance and chrominance color space component data in a single data stream with minimum CPU intervention is provided. The separating circuit receives as input a series of graphics/video image data composed of interleaved luminance and chrominance color space components at successive clock cycles. The separating circuit directs selected bytes of the graphics/video image data representing the luminance color space component to a first path wherein luminance component data received at two successive clock cycles are combined. Likewise, selected bytes of the graphics/video image data representing the chrominance color space component are directed to a second path wherein chrominance component data received at two successive clock cycles are combined. Then, the combined luminance and chrominance component data are output alternately. Conversely, a method and arrangement for interleaving separately stored luminance and chrominance color space component data into a single data stream is also provided.
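
As a rough software analogue of the separating and combining paths, the sketch below splits an interleaved stream into a luminance part and a chrominance part and then re-interleaves them. The byte order (luminance at even offsets, chrominance at odd offsets) and the function names are assumptions; the abstract does not state the layout, and the clock-cycle pairing of the hardware is not modeled.

```python
def separate(stream):
    """Split an interleaved stream into (luminance, chrominance) byte strings."""
    luminance = stream[0::2]    # assumed: even offsets carry luminance
    chrominance = stream[1::2]  # assumed: odd offsets carry chrominance
    return luminance, chrominance


def interleave(luminance, chrominance):
    """Merge separately stored components back into one interleaved stream."""
    out = bytearray()
    for y, c in zip(luminance, chrominance):
        out += bytes((y, c))
    return bytes(out)


stream = bytes([10, 200, 11, 201, 12, 202, 13, 203])
luma, chroma = separate(stream)
rebuilt = interleave(luma, chroma)
```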

    Alignment and ordering of vector elements for single instruction multiple data processing
    3.
    Granted invention patent
    Status: In force

    Publication number: US07793077B2

    Publication date: 2010-09-07

    Application number: US11702659

    Filing date: 2007-02-06

    IPC classification: G06F9/34

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.
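
The alignment and ordering steps in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: registers are modeled as 16-byte `bytes` objects, the register width and the names `extract_aligned` and `select_elements` are hypothetical, and elements are taken to be single bytes.

```python
REGISTER_BYTES = 16  # assumed register width for this sketch


def extract_aligned(first, second, start):
    """Return one register's worth of bytes beginning at byte `start` of
    `first` and continuing into `second`."""
    if len(first) != REGISTER_BYTES or len(second) != REGISTER_BYTES:
        raise ValueError("both registers must be REGISTER_BYTES long")
    if not 0 <= start < REGISTER_BYTES:
        raise ValueError("start byte must lie inside the first register")
    return (first + second)[start:start + REGISTER_BYTES]


def select_elements(first, second, order):
    """Copy a chosen subset of elements from both registers into a third
    register, in the order given by `order` (indices into first + second)."""
    combined = first + second
    return bytes(combined[i] for i in order)


first = bytes(range(16))
second = bytes(range(16, 32))
aligned = extract_aligned(first, second, 5)
reordered = select_elements(first, second, [0, 16, 1, 17])
```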

    Alignment and ordering of vector elements for single instruction multiple data processing
    4.
    Granted invention patent
    Status: In force

    Publication number: US06266758B1

    Publication date: 2001-07-24

    Application number: US09263798

    Filing date: 1999-03-05

    IPC classification: G06F15/00

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.

    Alignment and Ordering of Vector Elements for Single Instruction Multiple Data Processing
    5.
    Invention patent application
    Status: Under examination (published)

    Publication number: US20110055497A1

    Publication date: 2011-03-03

    Application number: US12875268

    Filing date: 2010-09-03

    IPC classification: G06F12/00

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.

    Alignment and ordering of vector elements for single instruction multiple data processing
    6.
    Granted invention patent
    Status: In force

    Publication number: US07197625B1

    Publication date: 2007-03-27

    Application number: US09662832

    Filing date: 2000-09-15

    IPC classification: G06F15/80 G06F9/312

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.

    Alignment and ordering of vector elements for single instruction multiple data processing
    7.
    Granted invention patent
    Status: Expired

    Publication number: US5933650A

    Publication date: 1999-08-03

    Application number: US947649

    Filing date: 1997-10-09

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements is selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.

    Providing extended precision in SIMD vector arithmetic operations
    8.
    Granted invention patent
    Status: In force

    Publication number: US08074058B2

    Publication date: 2011-12-06

    Application number: US12480414

    Filing date: 2009-06-08

    IPC classification: G06F15/00

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.
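
The accumulate-then-narrow flow in the abstract can be sketched as below. The choice of N = 16, signed elements, multiply-accumulate as the arithmetic instruction, round-to-nearest before shifting, and all function names are assumptions for illustration; Python integers stand in for an accumulator that is wider than N bits.

```python
N = 16  # assumed element width in bits


def multiply_accumulate(acc, a, b):
    """Lane-wise acc + a * b, kept at full precision in the accumulator."""
    return [s + x * y for s, x, y in zip(acc, a, b)]


def clamp_to_n_bits(value, n=N):
    """Saturate a wide value to the signed n-bit range."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(hi, value))


def read_out(acc, shift=0):
    """Transform accumulator lanes into N-bit elements: round, shift, clamp."""
    out = []
    for value in acc:
        if shift:
            value = (value + (1 << (shift - 1))) >> shift  # round, then shift
        out.append(clamp_to_n_bits(value))
    return out


acc = multiply_accumulate([0, 0], [30000, 100], [3, 2])  # [90000, 200]
clamped = read_out(acc)
scaled = read_out(acc, shift=2)
```

Here the first lane (90000) exceeds the signed 16-bit range, so it saturates to 32767 when read out without a shift, while shifting by two bits first brings it back into range.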

    Providing extended precision in SIMD vector arithmetic operations
    9.
    Granted invention patent
    Status: In force

    Publication number: US07546443B2

    Publication date: 2009-06-09

    Application number: US11337440

    Filing date: 2006-01-24

    IPC classification: G06F15/00

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.

    Direct memory access apparatus for transferring a block of data having discontinous addresses using an address calculating circuit
    10.
    Granted invention patent
    Status: Expired

    Publication number: US06108722A

    Publication date: 2000-08-22

    Application number: US713602

    Filing date: 1996-09-13

    IPC classification: G06F13/28 G06F13/14

    CPC classification: G06F13/28

    Abstract: A method and arrangement for a DMA transfer mode having multiple transactions is provided. The invention generates a set of transaction entries for a DMA transfer, each of which contains information related to the address and command instruction of a transaction. The transaction entries are stored in an address/cmd-output-FIFO. The invention negotiates for the control of the system bus. Upon gaining control of the bus, the commands and addresses related to each transaction are sequentially placed on the system bus. If the transaction is a read operation, data received back from the system bus is first stored in a data-in-FIFO before being sent to the desired destination. If the transaction is a write operation, the data to be transferred is first stored in a data-out-FIFO before being placed on the system bus in a timely manner for transfer to the desired destination. In either case, the number of data words transferred is monitored to determine when a transaction is complete. The number of transactions carried out is also monitored to determine when a DMA transfer is complete.
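
The bookkeeping described in the abstract (a queue of transaction entries, plus word and transaction counters) can be modeled with a short Python sketch. The fixed stride between block rows, the entry fields, and the function names are hypothetical; the abstract does not say how the addresses are calculated, and bus arbitration and the data FIFOs are not modeled.

```python
from collections import deque


def build_transactions(base, stride, rows, words_per_row, command):
    """Generate one transaction entry per row of a block whose rows are not
    contiguous in memory (assumed: rows are `stride` addresses apart)."""
    return deque(
        {"address": base + row * stride, "command": command, "words": words_per_row}
        for row in range(rows)
    )


def run_dma(fifo):
    """Drain the address/command FIFO, counting words and transactions."""
    transactions_done = 0
    words_moved = 0
    while fifo:
        entry = fifo.popleft()          # next entry placed on the bus
        words_moved += entry["words"]   # transaction completes after its words
        transactions_done += 1
    return transactions_done, words_moved


fifo = build_transactions(base=0x1000, stride=0x100, rows=3, words_per_row=4, command="read")
addresses = [entry["address"] for entry in fifo]
result = run_dma(fifo)
```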
