Apparatus and method for performing permute operations
    38.
    Granted invention patent (status: in force)

    Publication No.: US09513918B2

    Publication date: 2016-12-06

    Application No.: US13995974

    Filing date: 2011-12-22

    IPC classification: G06F9/30 G06F9/38

    Abstract: An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking is not implemented for a particular data element, then selecting data elements from a first source operand and a second source operand, based on index values stored in the destination operand, to be copied to data element positions within the destination operand, wherein any one of the data elements from either the first source operand or the second source operand may be copied to any one of the data element positions within the destination operand; and if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.
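
The method in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented implementation: the function name `permute2_masked` is hypothetical, a mask value of 1 is assumed to mean "not masked", the designated masking operation is assumed to be either zeroing or keeping the old destination value, and indices are assumed to wrap modulo the combined source length.

```python
def permute2_masked(dest, src1, src2, mask, zeroing=False):
    """Model of a two-source permute with per-element masking.

    `dest` holds the index values on entry; the function returns the new
    destination contents. All names and conventions here are assumptions
    made for illustration only.
    """
    n = len(dest)
    if not (len(src1) == len(src2) == len(mask) == n):
        raise ValueError("all operands must have the same element count")

    # Any element of either source may land in any destination position,
    # so treat the two sources as one lookup table of 2*n elements.
    table = list(src1) + list(src2)

    result = []
    for i in range(n):
        if mask[i]:
            # Assumption: mask value 1 means the element is not masked.
            # Assumption: indices wrap modulo the table length.
            result.append(table[dest[i] % (2 * n)])
        elif zeroing:
            # Assumed masking operation 1: write zero.
            result.append(0)
        else:
            # Assumed masking operation 2: keep the old destination value.
            result.append(dest[i])
    return result


src1 = [10, 11, 12, 13]
src2 = [20, 21, 22, 23]
indices = [7, 0, 4, 3]
mask = [1, 1, 0, 1]
print(permute2_masked(indices, src1, src2, mask))                # [23, 10, 4, 13]
print(permute2_masked(indices, src1, src2, mask, zeroing=True))  # [23, 10, 0, 13]
```

In the usage example, index 7 selects the last element of the second source and index 0 selects the first element of the first source, while the third element is masked and therefore either keeps its old value or is zeroed.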

    Instruction execution that broadcasts and masks data values at different levels of granularity
    39.
    Granted invention patent (status: in force)

    Publication No.: US09424327B2

    Publication date: 2016-08-23

    Application No.: US13976433

    Filing date: 2011-12-23

    IPC classification: G06F7/00 G06F17/30 G06F9/30

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and to replicate the second data structure when executing the second instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.
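
The replicate-then-mask behavior described in the abstract above can be sketched in software as follows. This is an illustrative model only: the function name `broadcast_masked`, the 256-bit register width, the 64-bit and 32-bit element sizes, and the choice of zeroing for masked elements are all assumptions that do not come from the abstract.

```python
def broadcast_masked(structure, elem_bits, mask, register_bits=256):
    """Replicate a packed structure across a register, then mask per element.

    Assumptions for illustration: the register width and element sizes are
    parameters chosen by the caller, one mask entry covers one element, and
    masked elements (mask value 0) are zeroed.
    """
    lanes = register_bits // elem_bits
    if lanes % len(structure) != 0:
        raise ValueError("structure must divide the register evenly")
    if len(mask) != lanes:
        raise ValueError("mask needs one entry per element")

    # Replication step: repeat the packed structure until the register is full.
    replicated = list(structure) * (lanes // len(structure))

    # Masking step: the mask granularity equals the element size, so halving
    # the element size doubles the number of mask entries.
    return [value if keep else 0 for value, keep in zip(replicated, mask)]


# "First instruction": two 64-bit values, 4 elements, 4 mask entries.
print(broadcast_masked([1, 2], 64, [1, 0, 1, 1]))
# [1, 0, 1, 2]

# "Second instruction": two 32-bit values, 8 elements, 8 mask entries.
print(broadcast_masked([3, 4], 32, [1, 1, 0, 0, 1, 0, 0, 1]))
# [3, 4, 0, 0, 3, 0, 0, 4]
```

With the same register width, the second call uses elements half the size of the first, so its mask has twice as many entries, which matches the abstract's statement that the second granularity is twice as fine as the first.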

    INSTRUCTION AND LOGIC TO PERFORM A CENTRIFUGE OPERATION
    40.
    Patent application (status: in force)

    Publication No.: US20160179539A1

    Publication date: 2016-06-23

    Application No.: US14580069

    Filing date: 2014-12-22

    IPC classification: G06F9/30

    Abstract: A processing device implements a set of instructions to perform a centrifuge operation using vector or general-purpose registers. In one embodiment, the centrifuge operation separates bits in a source register to opposing regions of a destination register based on a control mask, where each source register bit with a corresponding control mask value of one is written to one region in a destination register, while source register bits with a corresponding control mask value of zero are written to an opposing region of the destination register.
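
A bit-level model may make the centrifuge operation in the abstract above easier to follow. This is a sketch under stated assumptions rather than the claimed logic: the function name `centrifuge` is hypothetical, bits whose mask value is one are assumed to be packed into the low-order region, bits whose mask value is zero into the high-order region, and the relative order of bits within each group is assumed to be preserved.

```python
def centrifuge(src, mask, width=8):
    """Separate the bits of `src` into two regions according to `mask`.

    Assumptions for illustration: mask-one bits go to the low-order end,
    mask-zero bits go to the high-order end, and each group keeps its
    original relative order.
    """
    ones_group = []   # source bits whose control mask bit is 1
    zeros_group = []  # source bits whose control mask bit is 0

    # Walk the register from the least significant bit upward.
    for i in range(width):
        bit = (src >> i) & 1
        if (mask >> i) & 1:
            ones_group.append(bit)
        else:
            zeros_group.append(bit)

    # Low-order region first, then the opposing high-order region.
    result = 0
    for position, bit in enumerate(ones_group + zeros_group):
        result |= bit << position
    return result


print(format(centrifuge(0b10110010, 0b11001100), "08b"))  # 11101000
```

In the example, the source bits at mask-one positions (bits 2, 3, 6 and 7) are 0, 0, 0, 1 and fill the low four bits, while the bits at mask-zero positions (bits 0, 1, 4 and 5) are 0, 1, 1, 1 and fill the high four bits.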
