APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

    Publication No.: US20170300332A1

    Publication date: 2017-10-19

    Application No.: US15476356

    Filing date: 2017-03-31

    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.
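
The insert-then-mask behavior described in this abstract can be sketched in Python. This is a minimal illustration, not the patented circuitry: the function name `masked_insert`, the list-of-elements vector model, and the merge/zero masking semantics are all assumptions made for the example. One mask bit covers one list element, so the element width the caller chooses stands in for the masking granularity.

```python
def masked_insert(dest, src, group, section, mask, zeroing=False):
    """Insert `group` into one non-overlapping section of `src`, then mask.

    Hypothetical model: vectors are lists of elements, and bit i of `mask`
    controls element i of the result.
    """
    n = len(group)
    if n == 0 or len(dest) != len(src) or len(src) % n != 0:
        raise ValueError("dest and src must match and be a multiple of the group length")
    if not 0 <= section < len(src) // n:
        raise ValueError("section index out of range")

    # Unmasked result: src with the selected section overwritten by the group.
    unmasked = list(src)
    unmasked[section * n:(section + 1) * n] = group

    # Assumed masking layer: a set bit takes the new element, a clear bit
    # keeps the old dest element (merging) or writes zero (zeroing).
    return [
        value if (mask >> i) & 1 else (0 if zeroing else dest[i])
        for i, value in enumerate(unmasked)
    ]


src = [10, 11, 12, 13, 14, 15, 16, 17]
dest = [90, 91, 92, 93, 94, 95, 96, 97]
merged = masked_insert(dest, src, [1, 2], section=1, mask=0b00001101)
zeroed = masked_insert(dest, src, [1, 2], section=1, mask=0b00001101, zeroing=True)
```

Under this model, the same inserted group could be passed as four narrow elements or as two wide ones, which changes how many mask bits cover it. That is one way to picture the two granularities the abstract mentions.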

    APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

    Publication No.: US20180074825A1

    Publication date: 2018-03-15

    Application No.: US15809721

    Filing date: 2017-11-10

    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.

    METHOD AND APPARATUS FOR PERFORMING A VECTOR BIT REVERSAL AND CROSSING

    Publication No.: US20180032334A1

    Publication date: 2018-02-01

    Application No.: US15729566

    Filing date: 2017-10-10

    Abstract: An apparatus and method for performing a vector bit reversal and crossing. For example, one embodiment of a processor comprises: a first source vector register to store a first plurality of source bit groups, wherein a size for the bit groups is to be specified in an immediate of an instruction; a second source vector to store a second plurality of source bit groups; vector bit reversal and crossing logic to determine a bit group size from the immediate and to responsively reverse positions of contiguous bit groups within the first source vector register to generate a set of reversed bit groups, wherein the vector bit reversal and crossing logic is to additionally interleave the set of reversed bit groups with the second plurality of bit groups; and a destination vector register to store the reversed bit groups interleaved with the first plurality of bit groups.
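
One possible reading of the reversal-and-crossing step is sketched below in Python. Several points are assumptions, since the abstract leaves them open: "reverse positions of contiguous bit groups" is taken to mean swapping each adjacent pair of groups, the group size is passed directly in bits rather than decoded from an immediate, and the interleaved result is returned as a list of groups instead of being packed into a fixed-width destination. The names `split_groups` and `reverse_and_cross` are hypothetical.

```python
def split_groups(value, group_bits, width):
    """Split a width-bit integer into bit groups, least significant first."""
    field = (1 << group_bits) - 1
    return [(value >> shift) & field for shift in range(0, width, group_bits)]


def reverse_and_cross(src1, src2, group_bits, width=64):
    """Swap adjacent bit groups of src1, then interleave with src2's groups."""
    if group_bits <= 0 or width % (2 * group_bits) != 0:
        raise ValueError("width must hold an even number of bit groups")

    first = split_groups(src1, group_bits, width)
    second = split_groups(src2, group_bits, width)

    # Assumed "reversal": exchange each adjacent pair of groups in src1.
    for i in range(0, len(first), 2):
        first[i], first[i + 1] = first[i + 1], first[i]

    # Assumed "crossing": alternate reversed src1 groups with src2 groups.
    crossed = []
    for a, b in zip(first, second):
        crossed += [a, b]
    return crossed


crossed = reverse_and_cross(0xB4, 0x1B, group_bits=2, width=8)
```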

    METHOD AND APPARATUS FOR PERFORMING A VECTOR BIT SHUFFLE
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160188532A1

    Publication date: 2016-06-30

    Application No.: US14583636

    Filing date: 2014-12-27

    Abstract: An apparatus and method for performing a vector bit shuffle. For example, one embodiment of a processor comprises: a first vector register to store a plurality of source data elements; a second vector register to store a plurality of control elements, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination mask register and to identify a bit from each of the source data elements to be copied to each of the particular bit positions; and vector bit shuffle logic to read each bit field from the second vector register to identify a bit from each of the source data elements and to responsively copy the bit from each of the source data elements to each of the corresponding bit positions in the destination mask register.
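
The control-driven bit selection in this abstract can be illustrated with a short Python sketch. The sizes are assumptions chosen for the example and are not stated in the abstract: 64-bit source and control elements, eight 8-bit fields per control element, and the low 6 bits of each field used as the bit index. The function name `vector_bit_shuffle` is hypothetical.

```python
def vector_bit_shuffle(sources, controls):
    """Build a mask in which each control field picks one bit of its source.

    Assumed layout: element i of `controls` holds eight 8-bit fields, and
    field j selects the source bit copied to mask bit i * 8 + j.
    """
    if len(sources) != len(controls):
        raise ValueError("need one control element per source element")

    mask = 0
    for i, (src, ctl) in enumerate(zip(sources, controls)):
        for j in range(8):
            index = (ctl >> (8 * j)) & 0x3F   # low 6 bits index a bit 0..63
            bit = (src >> index) & 1
            mask |= bit << (i * 8 + j)
    return mask


control = int.from_bytes(bytes([0, 63, 1, 62, 0, 0, 63, 63]), "little")
mask = vector_bit_shuffle([0x8000000000000001], [control])
```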

    METHOD AND APPARATUS FOR PERFORMING A VECTOR BIT GATHER
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20160188335A1

    Publication date: 2016-06-30

    Application No.: US14583639

    Filing date: 2014-12-27

    CPC classification number: G06F9/30036 G06F9/30018 G06F9/30032 G06F9/30098

    Abstract: An apparatus and method for performing a vector bit gather. For example, one embodiment of a processor comprises: a first vector register to store one or more source data elements; a second vector register to store one or more control elements, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination vector register and to identify a bit from the one or more source data elements to be copied to each of the particular bit positions; and vector bit gather logic to read each bit field from the second vector register to identify a bit from the one or more source data elements and to responsively copy the bit from each of the one or more source data elements to each of the corresponding bit positions in the destination vector register.
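
A bit gather of the kind described here can be sketched as follows. This is a simplified model under stated assumptions: a single source element held as a Python integer, control bit fields supplied as a list of indices (one per destination bit position), and indices wrapped to the source width. The name `vector_bit_gather` and the `source_bits` parameter are hypothetical.

```python
def vector_bit_gather(source, bit_fields, source_bits=64):
    """Copy the source bit named by each field into the matching dest bit."""
    dest = 0
    for position, field in enumerate(bit_fields):
        index = field % source_bits          # assumed: indices wrap around
        dest |= ((source >> index) & 1) << position
    return dest


gathered = vector_bit_gather(0b1010, [1, 3, 0, 2])
```

The difference from the bit shuffle entry, as far as the abstracts go, is the destination: here the selected bits land in a vector register rather than a mask register.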

    INSTRUCTION EXECUTION THAT BROADCASTS AND MASKS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY

    Publication No.: US20220215117A1

    Publication date: 2022-07-07

    Application No.: US17677958

    Filing date: 2022-02-22

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.
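
The replicate-then-mask flow in this abstract can be sketched in Python. The model is an assumption for illustration: packed data structures are lists of values, replication tiles the list a given number of times, and one mask bit governs one value, with masked-off positions taking the old destination value or zero. The name `broadcast_masked` is hypothetical.

```python
def broadcast_masked(structure, copies, mask, dest=None):
    """Replicate a packed structure, then apply a per-value write mask."""
    replicated = list(structure) * copies
    if dest is None:
        dest = [0] * len(replicated)         # assumed default: zero masking
    if len(dest) != len(replicated):
        raise ValueError("dest must match the replicated length")
    return [
        value if (mask >> i) & 1 else old
        for i, (value, old) in enumerate(zip(replicated, dest))
    ]


# Wider values: 4 copies of a 2-value structure, 8 mask bits.
coarse = broadcast_masked([1, 2], copies=4, mask=0b10100101)
# Values half as wide fill the same space with 16 values, so 16 mask bits.
fine = broadcast_masked([1, 2], copies=8, mask=0xFFFF)
```

If both results occupy the same total width, halving the value size doubles the number of mask bits, which matches the abstract's statement that the second granularity is twice as fine as the first.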

    INSTRUCTION EXECUTION THAT BROADCASTS AND MASKS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY

    Publication No.: US20190095643A1

    Publication date: 2019-03-28

    Application No.: US16141283

    Filing date: 2018-09-25

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.

    METHOD AND APPARATUS FOR PERFORMING A VECTOR BIT REVERSAL
    Type: Invention application
    Status: Granted (in force)

    Publication No.: US20160179522A1

    Publication date: 2016-06-23

    Application No.: US14581883

    Filing date: 2014-12-23

    CPC classification number: G06F9/30018 G06F9/30032 G06F9/30036

    Abstract: An apparatus and method for performing a vector bit reversal. For example, one embodiment of a processor comprises: a source vector register to store a plurality of source bit groups, wherein a size for the bit groups is to be specified in an immediate of an instruction; vector bit reversal logic to determine a bit group size from the immediate and to responsively reverse positions of contiguous bit groups within the source vector register to generate a set of reversed bit groups; and a destination vector register to store the reversed bit groups.
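
One way to picture the group-wise reversal is the Python sketch below. It assumes that reversing "positions of contiguous bit groups" means swapping each adjacent pair of groups, and that the group size is given directly in bits rather than decoded from an immediate. The abstract confirms neither detail. The names `swap_adjacent_groups` and `reverse_bits` are hypothetical.

```python
def swap_adjacent_groups(value, group_bits, width=64):
    """Swap each adjacent pair of group_bits-wide groups in a width-bit value."""
    if group_bits <= 0 or width % (2 * group_bits) != 0:
        raise ValueError("width must hold an even number of bit groups")
    field = (1 << group_bits) - 1
    out = 0
    for i in range(width // group_bits):
        group = (value >> (i * group_bits)) & field
        out |= group << ((i ^ 1) * group_bits)   # i ^ 1 is the pair partner
    return out


def reverse_bits(value, width=64):
    """Full bit reversal built from swaps at group sizes 1, 2, 4, ...

    Assumes `width` is a power of two.
    """
    group_bits = 1
    while group_bits < width:
        value = swap_adjacent_groups(value, group_bits, width)
        group_bits *= 2
    return value


swapped = swap_adjacent_groups(0b10110100, group_bits=2, width=8)
reversed_byte = reverse_bits(0b00000001, width=8)
```

Under this reading, a single operation with a selectable group size is enough to compose a complete bit reversal, by applying it once per power-of-two group size.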
