Alignment and ordering of vector elements for single instruction multiple data processing
    1.
    Type: Granted patent
    Status: In force

    Publication No.: US07793077B2

    Publication date: 2010-09-07

    Application No.: US11702659

    Filing date: 2007-02-06

    IPC classification: G06F9/34

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.

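    The alignment and ordering steps described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the 16-byte register width, the function names `align_vector` and `order_elements`, and the index-list form of the selection are assumptions made for illustration only.

```python
def align_vector(reg_a: bytes, reg_b: bytes, start: int) -> bytes:
    """Extract one register-width vector that begins at byte `start` of
    reg_a and continues into reg_b (hypothetical helper)."""
    width = len(reg_a)
    if len(reg_b) != width:
        raise ValueError("both registers must have the same width")
    if not 0 <= start < width:
        raise ValueError("start byte must lie inside the first register")
    combined = reg_a + reg_b           # view the two registers as one span
    return combined[start:start + width]


def order_elements(reg_a, reg_b, selection):
    """Build a third register from elements picked out of reg_a and reg_b.

    Indices 0..n-1 select from reg_a and n..2n-1 select from reg_b; this
    encoding is an assumption, not taken from the patent text."""
    combined = list(reg_a) + list(reg_b)
    return [combined[i] for i in selection]


# Two consecutive 16-byte loads from a 32-byte memory image (assumed width).
memory = bytes(range(32))
reg_a, reg_b = memory[0:16], memory[16:32]

# The wanted vector starts at byte 5, so it straddles both registers.
aligned = align_vector(reg_a, reg_b, 5)

# Interleave the low halves of two 4-element registers.
ordered = order_elements([10, 11, 12, 13], [20, 21, 22, 23], [0, 4, 1, 5])
print(list(aligned), ordered)
```

    Treating the two registers as one contiguous span keeps the extraction a single slice; real SIMD hardware would typically do the same job with one shift or permute instruction.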

    Alignment and Ordering of Vector Elements for Single Instruction Multiple Data Processing
    2.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110055497A1

    Publication date: 2011-03-03

    Application No.: US12875268

    Filing date: 2010-09-03

    IPC classification: G06F12/00

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.


    Alignment and ordering of vector elements for single instruction multiple data processing
    3.
    Type: Granted patent
    Status: In force

    Publication No.: US06266758B1

    Publication date: 2001-07-24

    Application No.: US09263798

    Filing date: 1999-03-05

    IPC classification: G06F15/00

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.


    Alignment and ordering of vector elements for single instruction multiple data processing
    4.
    Type: Granted patent
    Status: In force

    Publication No.: US07197625B1

    Publication date: 2007-03-27

    Application No.: US09662832

    Filing date: 2000-09-15

    IPC classification: G06F15/80 G06F9/312

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.


    Alignment and ordering of vector elements for single instruction multiple data processing
    5.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5933650A

    Publication date: 1999-08-03

    Application No.: US947649

    Filing date: 1997-10-09

    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register. Then, a subset of elements are selected from the first register and the second register. The elements from the subset are then replicated into the elements in the third register in a particular order suitable for subsequent SIMD vector processing.


    Providing extended precision in SIMD vector arithmetic operations
    6.
    Type: Granted patent
    Status: In force

    Publication No.: US08074058B2

    Publication date: 2011-12-06

    Application No.: US12480414

    Filing date: 2009-06-08

    IPC classification: G06F15/00

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.

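    The extended-precision flow described in the abstract above (operate on N-bit elements, keep the result in a wider accumulator, then round, shift, and clamp back to N bits) can be sketched in Python as follows. This is an illustrative sketch only: the choice of N = 16, signed elements, a multiply-accumulate operation, round-half-up rounding, and the names `vector_mac` and `read_accumulator` are all assumptions, not details taken from the patents listed here.

```python
N = 16  # element width in bits (assumed for this sketch)


def clamp(value: int, bits: int = N) -> int:
    """Saturate a signed integer to the range of a `bits`-wide element."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def vector_mac(acc, a, b):
    """Element-wise multiply-accumulate into the accumulator.

    Python integers are unbounded, so they stand in for the wide
    accumulator elements; no precision is lost at this stage."""
    return [s + x * y for s, x, y in zip(acc, a, b)]


def read_accumulator(acc, shift: int = 0):
    """Transform each accumulator element to N bits: round, shift, clamp."""
    out = []
    for s in acc:
        if shift > 0:
            s = (s + (1 << (shift - 1))) >> shift  # round half up, then shift
        out.append(clamp(s))
    return out


a = [30000, -30000, 100, 7]
b = [30000, 30000, 3, 1]

acc = vector_mac([0, 0, 0, 0], a, b)      # full-width products are kept
narrow = read_accumulator(acc)            # clamp only: large values saturate
scaled = read_accumulator(acc, shift=15)  # round and shift before clamping
print(acc, narrow, scaled)
```

    Deferring the narrowing step until the accumulator is read is what preserves the intermediate precision; with `shift=0` the two large products simply saturate, while a 15-bit shift brings them back into the 16-bit range.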

    Providing extended precision in SIMD vector arithmetic operations
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US07546443B2

    Publication date: 2009-06-09

    Application No.: US11337440

    Filing date: 2006-01-24

    IPC classification: G06F15/00

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.


    Method for providing extended precision in SIMD vector arithmetic operations
    8.
    Type: Granted patent
    Status: In force

    Publication No.: US07159100B2

    Publication date: 2007-01-02

    Application No.: US09223046

    Filing date: 1998-12-30

    IPC classification: G06F15/00

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into a first vector register and a second vector register, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, a first vector register and a second vector register are read from the register file. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and stored into the memory.


    Providing Extended Precision in SIMD Vector Arithmetic Operations
    9.
    Type: Patent application
    Status: In force

    Publication No.: US20090249039A1

    Publication date: 2009-10-01

    Application No.: US12480414

    Filing date: 2009-06-08

    IPC classification: G06F9/302

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into first and second vector registers, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, the first vector register and the second vector register are read from the register file. The present invention executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The resulting element of the execution is then written into the accumulator. Then, the resulting element is transformed into an N-bit width element and written into a third register for further operation or storage in memory. The transformation of the resulting element can include, for example, rounding, clamping, and/or shifting the element.


    Method for providing extended precision in SIMD vector arithmetic operations
    10.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US5864703A

    Publication date: 1999-01-26

    Application No.: US947648

    Filing date: 1997-10-09

    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into a first vector register and a second vector register, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, a first vector register and a second vector register are read from the register file. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and stored into the memory.
