DEINTERLEAVE STRIDED DATA ELEMENTS PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

    Publication No.: US20180246722A1

    Publication date: 2018-08-30

    Application No.: US15445577

    Filing date: 2017-02-28

    IPC classification: G06F9/30

    Abstract: A method performed by a processor includes receiving an instruction. The instruction indicates a source operand, a stride, at least one set of strided data element positions out of all sets of strided data element positions for the indicated stride, and at least one destination packed data register. The method also includes storing, in response to the instruction, a corresponding result packed data operand in a corresponding destination packed data register of the processor for each indicated set of strided data element positions. Each result packed data operand includes a plurality of data elements taken from the corresponding indicated set of strided data element positions of the source operand. The strided data element positions of each set are separated from one another by integer multiples of the indicated stride. Other methods, processors, systems, and machine-readable media are also disclosed.
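    The selection described in this abstract can be modelled in software. The sketch below is only an illustration under stated assumptions: the function name `deinterleave`, the use of Python lists in place of packed data registers, and the representation of each set by its starting offset are all hypothetical and do not come from the patent.

```python
def deinterleave(source, stride, selected_offsets):
    """Return one result list per selected set of strided positions.

    Set k holds the elements at positions k, k + stride, k + 2*stride, ...
    Lists stand in for packed data registers (an assumption of this sketch).
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    results = {}
    for k in selected_offsets:
        if not 0 <= k < stride:
            raise ValueError("each offset must be in range(stride)")
        # Positions in a set differ by integer multiples of the stride.
        results[k] = source[k::stride]
    return results


# Interleaved RGB samples: stride 3, keep only the R (0) and B (2) sets.
pixels = ["r0", "g0", "b0", "r1", "g1", "b1", "r2", "g2", "b2"]
planes = deinterleave(pixels, stride=3, selected_offsets=[0, 2])
```

    Here `planes[0]` collects the red samples and `planes[2]` the blue ones; the green set is simply not requested, mirroring "at least one set ... out of all sets".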

    Multi-element instruction with different read and write masks
    3.
    Type: granted patent
    Legal status: active

    Publication No.: US09489196B2

    Publication date: 2016-11-08

    Application No.: US13997998

    Filing date: 2011-12-23

    IPC classification: G06F7/76 G06F9/30

    Abstract: A method is described that includes reading a first read mask from a first register and reading a first vector operand from a second register or memory location. The read mask is applied against the first vector operand to produce a set of elements for operation, and an operation is performed on that set of elements. An output vector is created by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different from the first read mask, applying the write mask against the output vector to create a resultant vector, and writing the resultant vector to a destination register.
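    The sequence of steps in this abstract can be sketched as follows. This is a rough model, not the patented implementation: the function name is hypothetical, the operation is assumed to be a reduction such as `sum`, and destination lanes whose write-mask bit is 0 are assumed to keep their previous value (the abstract does not say whether they are merged or zeroed).

```python
def masked_reduce_broadcast(src, read_mask, write_mask, dest, op):
    """Model of: read mask -> operate -> replicate result -> write mask."""
    # Apply the read mask to pick the elements that take part in the operation.
    selected = [x for x, m in zip(src, read_mask) if m]
    # Perform the operation on the selected elements (assumed: a reduction).
    result = op(selected)
    # Build the output vector from multiple instances of the result.
    output = [result] * len(dest)
    # Apply the write mask; unselected lanes keep the old value (assumption).
    return [o if w else d for o, w, d in zip(output, write_mask, dest)]


resultant = masked_reduce_broadcast(
    src=[1, 2, 3, 4],
    read_mask=[1, 0, 1, 0],
    write_mask=[0, 1, 1, 0],
    dest=[9, 9, 9, 9],
    op=sum,
)
```

    With these inputs the read mask selects 1 and 3, the reduction yields 4, and the write mask places that value only in lanes 1 and 2.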

    METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY
    4.
    Type: patent application
    Legal status: active

    Publication No.: US20140189309A1

    Publication date: 2014-07-03

    Application No.: US13731008

    Filing date: 2012-12-29

    IPC classification: G06F9/30

    Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register having a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of which stores a count of the number of most significant contiguous bits set to zero in the corresponding data field of the register. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in the corresponding data fields of the destination register. Vector leading zero count instructions can be used to generate permute controls, and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
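    The per-field count described here can be illustrated with a short sketch. The function name `vector_lzcnt` and the 32-bit default field width are assumptions made for the example; a list of integers stands in for the register's data fields.

```python
def vector_lzcnt(elements, width=32):
    """Count the most significant contiguous zero bits of each data field."""
    counts = []
    for value in elements:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in the field width")
        # bit_length() is the position of the highest set bit (0 for value 0),
        # so the number of leading zeros is the remainder of the field.
        counts.append(width - value.bit_length())
    return counts


counts = vector_lzcnt([0, 1, 0x80000000, 0x00FF0000])
```

    An all-zero field yields the full field width, and a field with its top bit set yields 0.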

    LOOP VECTORIZATION METHODS AND APPARATUS
    5.
    Type: patent application
    Legal status: active

    Publication No.: US20140095850A1

    Publication date: 2014-04-03

    Application No.: US13994549

    Filing date: 2012-09-28

    IPC classification: G06F9/38

    Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop. Generating the first control mask includes setting a bit of the first control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the set of iterations of the loop according to the first control mask.
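    The two steps named in this abstract, mask generation and index compression, can be sketched as below. The helper names are hypothetical, and the choice of 1 for "execute" and 0 for "bypass" is an assumption; the abstract only speaks of a first and a second value.

```python
def control_mask(values, condition):
    """One mask bit per iteration: 1 = execute the operation, 0 = bypass."""
    return [1 if condition(v) else 0 for v in values]


def compress_indexes(indexes, mask):
    """Pack the indexes of the active iterations contiguously, in order."""
    return [i for i, m in zip(indexes, mask) if m]


# Loop body runs only for positive inputs.
data = [5, -2, 7, 0, -9, 3]
mask = control_mask(data, lambda v: v > 0)
active = compress_indexes(range(len(data)), mask)
```

    The compressed index list lets later vector operations work on densely packed active iterations rather than on partly idle lanes.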

    Instruction for element offset calculation in a multi-dimensional array
    6.
    Type: granted patent
    Legal status: active

    Publication No.: US09507593B2

    Publication date: 2016-11-29

    Application No.: US13976004

    Filing date: 2011-12-23

    IPC classification: G06F9/30 G06F9/355 G06F9/38

    Abstract: An apparatus is described having functional unit logic circuitry. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifies the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.
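    One common way to compute such an offset is shown below. It is a sketch under assumptions the abstract does not state: row-major layout (last dimension varies fastest), an offset counted in elements rather than bytes, and a hypothetical function name `element_offset`.

```python
def element_offset(dim_sizes, coords):
    """Offset of `coords` from the origin, in elements, row-major (assumed)."""
    if len(dim_sizes) != len(coords):
        raise ValueError("one coordinate is required per dimension")
    offset = 0
    for size, coord in zip(dim_sizes, coords):
        if not 0 <= coord < size:
            raise ValueError("coordinate outside its dimension")
        # Horner-style accumulation: scale by this dimension, then add.
        offset = offset * size + coord
    return offset


# A 4 x 5 x 6 array, segment at (1, 2, 3): 1*30 + 2*6 + 3.
offset = element_offset([4, 5, 6], [1, 2, 3])
```

    Multiplying the result by the element size in bytes would give a byte offset relative to the origin segment's address.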

    Instruction for shifting bits left with pulling ones into less significant bits
    7.
    Type: granted patent
    Legal status: active

    Publication No.: US09122475B2

    Publication date: 2015-09-01

    Application No.: US13630131

    Filing date: 2012-09-28

    IPC classification: G06F9/30 G06F15/80

    Abstract: A mask generating instruction is executed by a processor to improve efficiency of vector operations on an array of data elements. The processor includes vector registers, one of which stores data elements of an array. The processor further includes execution circuitry to receive a mask generating instruction that specifies at least a first operand and a second operand. Responsive to the mask generating instruction, the execution circuitry is to shift bits of the first operand to the left by a number of times defined in the second operand, and pull in a one bit from the right each time a most significant bit of the first operand is shifted out from the left, to generate a result. Each bit in the result corresponds to one of the data elements of the array.
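    The shift described in this abstract can be modelled with ordinary integer arithmetic. The sketch below assumes a 16-bit mask width and reads the abstract as pulling in a one on every single-bit shift; the function name is hypothetical.

```python
def shift_left_pull_ones(value, count, width=16):
    """Shift `value` left `count` times, filling vacated low bits with ones."""
    lane_mask = (1 << width) - 1
    # Shifting by the full width or more leaves only pulled-in ones.
    count = min(count, width)
    # (1 << count) - 1 is `count` ones in the least significant positions.
    return ((value << count) | ((1 << count) - 1)) & lane_mask


result = shift_left_pull_ones(0b1010, 3)
```

    A plain left shift would fill the low bits with zeros; here the low `count` bits of the resulting mask are set instead.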

    READ AND WRITE MASKS UPDATE INSTRUCTION FOR VECTORIZATION OF RECURSIVE COMPUTATIONS OVER INTERDEPENDENT DATA
    8.
    Type: patent application
    Legal status: active

    Publication No.: US20140095837A1

    Publication date: 2014-04-03

    Application No.: US13630247

    Filing date: 2012-09-28

    IPC classification: G06F9/30

    Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.

    VECTOR MOVE INSTRUCTION CONTROLLED BY READ AND WRITE MASKS
    9.
    Type: patent application
    Legal status: active

    Publication No.: US20140095828A1

    Publication date: 2014-04-03

    Application No.: US13630118

    Filing date: 2012-09-28

    IPC classification: G06F15/76

    CPC classification: G06F15/8084 G06F9/3885

    Abstract: A processor executes a vector move instruction to move data elements from a second vector register to a first vector register under the control of a first mask register and a second mask register. A register file within the processor includes the first vector register, the second vector register, the first mask register and the second mask register. In response to the vector move instruction, execution circuitry in the processor is to replace a given number of target data elements in the first vector register with the given number of source data elements in the second vector register. Each source data element corresponds to a mask bit in the second mask register having a second bit value, and each target data element corresponds to a mask bit in the first mask register having a first bit value.
