Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a same set of per-lane control bits
    42.
    Granted invention patent
    Status: In force

    Publication No.: US08914613B2

    Publication date: 2014-12-16

    Application No.: US13219418

    Filing date: 2011-08-26

    IPC classification: G06F15/16 G06F9/30 G06F9/38

    Abstract: In-lane vector shuffle operations are described. In one embodiment, a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
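The in-lane behaviour described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented implementation: the lane size of four elements, the two-bit selector per destination element, the rule that the lower half of each destination lane reads from the first source and the upper half from the second, and the function names `shuffle_in_lane` and `shuffle_in_lane_2src` are all hypothetical choices made for illustration.

```python
def _selector_bits(lane_elems):
    # Width of one selector field; assumes lane_elems is a power of two.
    return (lane_elems - 1).bit_length()


def shuffle_in_lane(src, control, lane_elems=4):
    """Single-source model: the same control bits are reused for every lane."""
    bits = _selector_bits(lane_elems)
    mask = (1 << bits) - 1
    dst = []
    for base in range(0, len(src), lane_elems):
        for i in range(lane_elems):
            sel = (control >> (i * bits)) & mask  # selector for destination element i
            dst.append(src[base + sel])           # the source never crosses a lane boundary
    return dst


def shuffle_in_lane_2src(src1, src2, control, lane_elems=4):
    """Two-source model (assumed split): lower half of each destination lane
    reads from src1, upper half reads from src2, using the same control bits."""
    bits = _selector_bits(lane_elems)
    mask = (1 << bits) - 1
    half = lane_elems // 2
    dst = []
    for base in range(0, len(src1), lane_elems):
        for i in range(lane_elems):
            sel = (control >> (i * bits)) & mask
            source = src1 if i < half else src2
            dst.append(source[base + sel])
    return dst
```

With an eight-element vector and the control value `0x1B`, the single-source model reverses the elements inside each four-element lane while leaving the lanes themselves in place, which is the defining property of an in-lane shuffle.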


    APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS
    44.
    Invention patent application
    Status: In force

    Publication No.: US20130290687A1

    Publication date: 2013-10-31

    Application No.: US13976993

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
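The two stages named in this abstract, element routing followed by a masking layer at matching granularity, can be modelled in software as follows. This is a hedged sketch: the abstract does not state the three bit widths, so the element width is a parameter (`elem_bytes`), and zeroing masked-off elements is an assumption made here for illustration. The function names `route` and `apply_mask` are hypothetical.

```python
def route(src, indices, elem_bytes):
    """Routing stage: output element i is copied from input element indices[i]."""
    n = len(src) // elem_bytes
    out = bytearray()
    for i in range(n):
        j = indices[i] % n  # any input element location may source an output
        out += src[j * elem_bytes:(j + 1) * elem_bytes]
    return bytes(out)


def apply_mask(routed, mask, elem_bytes):
    """Masking stage: one mask bit per element, at the same granularity as routing.

    Assumption: masked-off elements are zeroed. Other policies (for example,
    keeping the previous destination contents) are equally plausible.
    """
    n = len(routed) // elem_bytes
    out = bytearray()
    for i in range(n):
        chunk = routed[i * elem_bytes:(i + 1) * elem_bytes]
        out += chunk if (mask >> i) & 1 else bytes(elem_bytes)
    return bytes(out)
```

Calling the same two functions with different `elem_bytes` values stands in for the different instructions: the routing and the mask both operate on whole elements of the chosen width.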


    In-Lane Vector Shuffle Instructions
    45.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130212360A1

    Publication date: 2013-08-15

    Application No.: US13838048

    Filing date: 2013-03-15

    IPC classification: G06F9/30

    Abstract: In-lane vector shuffle operations are described. In one embodiment, a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.


    IN-LANE VECTOR SHUFFLE INSTRUCTIONS
    46.
    Invention patent application
    Status: In force

    Publication No.: US20090172358A1

    Publication date: 2009-07-02

    Application No.: US11967211

    Filing date: 2007-12-30

    IPC classification: G06F9/30

    Abstract: In-lane vector shuffle operations are described. In one embodiment, a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.


    Apparatus and method for redundant zero micro-operation removal
    47.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07213136B2

    Publication date: 2007-05-01

    Application No.: US10631628

    Filing date: 2003-07-30

    IPC classification: G06F9/24

    Abstract: A method and apparatus for redundant zero micro-operation removal. In one embodiment, the method includes the identification of a predetermined macro-instruction. Once identified, a value associated with a source register operand of the identified macro-instruction is determined. Once determined, the identified macro-instruction is decoded into a first micro-operation and a second micro-operation if the determined value is not set. Otherwise, the identified macro-instruction is decoded into a single micro-operation if the determined value is set. Accordingly, the method described prevents the generation of redundant micro-operations that use valuable resources, such as allocation slots, as well as execution units within the processor core.
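The decode decision in this abstract can be sketched as a small function. Everything concrete here is hypothetical and chosen only for illustration: the opcode name `EXAMPLE_OP`, the micro-operation names `uop_main` and `uop_extra`, and the `flagged_sources` set that stands in for "a value associated with a source register operand". The abstract does not say what the value represents or what the two micro-operations do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacroInstruction:
    opcode: str
    dest: str
    src: str


# Hypothetical opcode standing in for the "predetermined macro-instruction".
PREDETERMINED = "EXAMPLE_OP"


def decode(instr, flagged_sources):
    """Return the micro-operations emitted for one macro-instruction.

    flagged_sources is a hypothetical set of source registers whose
    associated value is currently "set".
    """
    main_uop = f"uop_main {instr.dest}, {instr.src}"
    if instr.opcode != PREDETERMINED:
        return [main_uop]
    if instr.src in flagged_sources:
        # Value set: the second micro-operation would be redundant, so skip it.
        return [main_uop]
    # Value not set: emit both micro-operations.
    return [main_uop, f"uop_extra {instr.dest}, {instr.src}"]
```

The saving described in the abstract corresponds to the shorter list: one fewer micro-operation to allocate and execute whenever the value is set.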


    Apparatus and method for two micro-operation flow using source override
    48.
    Invention patent application
    Status: Lapsed

    Publication No.: US20050027967A1

    Publication date: 2005-02-03

    Application No.: US10631629

    Filing date: 2003-07-30

    IPC classification: G06F9/30 G06F9/318 G06F9/38

    Abstract: A method and apparatus for a two micro-operation flow using source override. In one embodiment, the method includes the identification of a macro-instruction having one or more streaming single instruction, multiple data (SIMD) extension type operands. Once received, the macro-instruction is decoded into a first micro-operation (uOP) and a second uOP. Once decoded, a signal is asserted to disable source operand override logic if the first micro-operation updates a logical destination register that matches a logical source register of the micro-operation. Otherwise, the mutual source override is active and executed by a register alias table (RAT) when uOPs with matching logical source and destination registers are detected in the same clock cycle. In doing so, macro-instructions having 128-bit operands may be processed using, for example, two uOPs (one for the lower half and one for the upper half) in a 64-bit implementation, while preserving the atomicity of the original instruction.
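The final sentence of this abstract, processing a 128-bit operand as two 64-bit micro-operations, can be modelled arithmetically. This sketch covers only the lower-half and upper-half split; it does not model the register alias table or the source override logic. The helper names `split128` and `execute_as_two_uops` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def split128(x):
    """Split a 128-bit integer into (lower 64 bits, upper 64 bits)."""
    return x & MASK64, (x >> 64) & MASK64


def execute_as_two_uops(op, a, b):
    """Apply a 64-bit operation to each half separately, then recombine.

    Stands in for one uOP on the lower half and one on the upper half.
    """
    a_lo, a_hi = split128(a)
    b_lo, b_hi = split128(b)
    lo = op(a_lo, b_lo) & MASK64  # first uOP: lower half
    hi = op(a_hi, b_hi) & MASK64  # second uOP: upper half
    return (hi << 64) | lo
```

For an operation that treats the halves independently, such as a bitwise XOR or a packed add of two 64-bit elements, the recombined result matches what a single 128-bit-wide execution would produce.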


    Apparatus and method for redundant zero micro-operation removal
    49.
    Invention patent application
    Status: Lapsed

    Publication No.: US20050027964A1

    Publication date: 2005-02-03

    Application No.: US10631628

    Filing date: 2003-07-30

    Abstract: A method and apparatus for redundant zero micro-operation removal. In one embodiment, the method includes the identification of a predetermined macro-instruction. Once identified, a value associated with a source register operand of the identified macro-instruction is determined. Once determined, the identified macro-instruction is decoded into a first micro-operation and a second micro-operation if the determined value is not set. Otherwise, the identified macro-instruction is decoded into a single micro-operation if the determined value is set. Accordingly, the method described prevents the generation of redundant micro-operations that use valuable resources, such as allocation slots, as well as execution units within the processor core.
