APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS

    公开(公告)号:US20180081689A1

    公开(公告)日:2018-03-22

    申请号:US15809818

    申请日:2017-11-10

    Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions, the first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections have a same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of multiple second non-overlapping sections have a same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and second granularity.

    APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

    公开(公告)号:US20180074825A1

    公开(公告)日:2018-03-15

    申请号:US15809721

    申请日:2017-11-10

    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

    COMMON ARCHITECTURAL STATE PRESENTATION FOR PROCESSOR HAVING PROCESSING CORES OF DIFFERENT TYPES
    4.
    发明申请
    COMMON ARCHITECTURAL STATE PRESENTATION FOR PROCESSOR HAVING PROCESSING CORES OF DIFFERENT TYPES 有权
    具有处理不同类型的处理器的通用建筑状态表示

    公开(公告)号:US20160299761A1

    公开(公告)日:2016-10-13

    申请号:US15181374

    申请日:2016-06-13

    Abstract: Methods and apparatuses relating to a common architectural state presentation for a processor having cores of different types are described. In one embodiment, a processor includes a first core, a second core, wherein the first core comprises a unique architectural state and a common architectural state with the second core, and circuitry to migrate a thread from said first core to said second core, said circuitry to migrate the common architectural state from the first core to the second core, and migrate the unique architectural state to a storage external from the second core

    Abstract translation: 描述与具有不同类型的核的处理器的公共架构状态呈现有关的方法和装置。 在一个实施例中,处理器包括第一核心,第二核心,其中所述第一核心包括独特的架构状态和与所述第二核心的共同架构状态,以及将线程从所述第一核心迁移到所述第二核心的电路, 将公共架构状态从第一核心迁移到第二核心的电路,并将独特架构状态迁移到从第二核心外部的存储器

    APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

    公开(公告)号:US20170300332A1

    公开(公告)日:2017-10-19

    申请号:US15476356

    申请日:2017-03-31

    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

    EFFICIENT ZERO-BASED DECOMPRESSION
    6.
    发明申请

    公开(公告)号:US20170300326A1

    公开(公告)日:2017-10-19

    申请号:US15438712

    申请日:2017-02-21

    CPC classification number: G06F9/30018 G06F9/30036 H03M7/46

    Abstract: A processor core including a hardware decode unit to decode vector instructions for decompressing a run length encoded (RLE) set of source data elements and an execution unit to execute the decoded instructions. The execution unit generates a first mask by comparing set of source data elements with a set of zeros and then counts the trailing zeros in the mask. A second mask is made based on the count of trailing zeros. The execution unit then copies the set of source data elements to a buffer using the second mask and then reads the number of RLE zeros from the set of source data elements. The buffer is shifted and copied to a result and the set of source data elements is shifted to the right. If more valid data elements are in the set of source data elements this is repeated until all valid data is processed.

    MULTI-ELEMENT INSTRUCTION WITH DIFFERENT READ AND WRITE MASKS
    7.
    发明申请
    MULTI-ELEMENT INSTRUCTION WITH DIFFERENT READ AND WRITE MASKS 审中-公开
    具有不同读取和写入掩码的多元素指令

    公开(公告)号:US20170052783A1

    公开(公告)日:2017-02-23

    申请号:US15346531

    申请日:2016-11-08

    Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation of the set elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.

    Abstract translation: 描述了一种包括从第一寄存器读取第一读取掩码的方法。 该方法还包括从第二寄存器或存储器位置读取第一向量操作数。 该方法还包括对第一向量操作数应用读取掩码以产生用于操作的一组元素。 该方法还包括执行设定元件的操作。 该方法还包括通过产生操作结果的多个实例来创建输出向量。 该方法还包括从第三寄存器读取第一写掩码,第一写掩码不同于第一读掩码。 该方法还包括针对输出向量应用写掩码以产生合成矢量。 该方法还包括将结果矢量写入目的地寄存器。

    INSTRUCTION EXECUTION THAT BROADCASTS AND MASKS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY

    公开(公告)号:US20220215117A1

    公开(公告)日:2022-07-07

    申请号:US17677958

    申请日:2022-02-22

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.

    INSTRUCTION EXECUTION THAT BROADCASTS AND MASKS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY

    公开(公告)号:US20190095643A1

    公开(公告)日:2019-03-28

    申请号:US16141283

    申请日:2018-09-25

    Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.

Patent Agency Ranking