COLLAPSING OF MULTIPLE NESTED LOOPS, METHODS, AND INSTRUCTIONS

    Publication No.: US20210279061A1

    Publication date: 2021-09-09

    Application No.: US17323409

    Filing date: 2021-05-18

    Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.
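
    The counter update described in the abstract can be illustrated in software. The following is a minimal sketch, not the patented instruction: it assumes the loop counters are packed outermost-first in one tuple, that every bound is positive, and that an overflow out of the outermost counter signals completion. The names update_counters and collapsed_iteration are hypothetical.

```python
def update_counters(counters, bounds, amount=1):
    """Advance packed loop counters (outermost first) by `amount` iterations.

    Assumption: counters behave like mixed-radix digits, so the innermost
    counter is incremented and any overflow carries into the next outer one.
    Returns the new counters and a flag that is True once the outermost
    counter has overflowed (the whole loop nest is finished).
    """
    new = list(counters)
    carry = amount
    for dim in range(len(new) - 1, -1, -1):  # innermost to outermost
        total = new[dim] + carry
        new[dim] = total % bounds[dim]
        carry = total // bounds[dim]
    return tuple(new), carry > 0


def collapsed_iteration(bounds):
    """Visit every index of a loop nest with a single collapsed loop."""
    counters = (0,) * len(bounds)
    visited = []
    done = False
    while not done:
        visited.append(counters)
        counters, done = update_counters(counters, bounds)
    return visited
```

    For bounds (2, 3), the single while loop visits the same six index pairs, in the same order, as two nested for loops would.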

    APPARATUSES, METHODS, AND SYSTEMS FOR ELEMENT SORTING OF VECTORS

    Publication No.: US20190146792A1

    Publication date: 2019-05-16

    Application No.: US16249870

    Filing date: 2019-01-16

    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
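
    A software sketch may help make the comparison-matrix idea concrete. It is an illustration under stated assumptions, not the claimed hardware: the abstract does not say which operations are used, so this sketch assumes a strict less-than comparison above the main diagonal and, as the "different operation" below the diagonal, the logical negation of the mirrored entry. Summing each row then yields a unique output position per element. The function names are hypothetical.

```python
def comparison_matrix(vec):
    """Build an n x n matrix where m[i][j] == 1 means vec[j] sorts before vec[i]."""
    n = len(vec)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Above the main diagonal: an actual comparison (assumed strict <).
            m[i][j] = int(vec[j] < vec[i])
            # Below the main diagonal: assumed to be the negation of the
            # mirrored entry, which also breaks ties between equal elements.
            m[j][i] = 1 - m[i][j]
    return m


def sort_by_rank(vec):
    """Sort in ascending order by placing each element at its row sum."""
    out = [None] * len(vec)
    for i, row in enumerate(comparison_matrix(vec)):
        out[sum(row)] = vec[i]
    return out
```

    Because only the entries above the diagonal require a comparison, the number of comparisons is n(n-1)/2 rather than n squared.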

    VERSATILE PACKED DATA COMPARISON PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20150186141A1

    Publication date: 2015-07-02

    Application No.: US14142849

    Filing date: 2013-12-29

    Abstract: A processor including a decode unit to decode a versatile packed data compare instruction to indicate a first source packed data operand to include a first plurality of data elements, a second source packed data operand to include a second plurality of corresponding data elements. The instruction to indicate a source comparison operation indication operand to include comparison operation indicators each to indicate a potentially different comparison operation for a different corresponding pair of data elements from the first and second source operands. An execution unit, in response to the instruction, to store a result in a destination storage location indicated by the instruction. Result to include result indicators each to correspond to a different one of the comparison operation indicators. Each result indicator to indicate a result of a comparison operation, indicated by the corresponding comparison operation indicator, performed on the corresponding pair of data elements.
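
    The per-element selection of a comparison operation can be sketched as follows. This is an illustration only: the numeric encoding of the comparison operation indicators in COMPARE_OPS is invented for the example and is not taken from the patent, and versatile_compare is a hypothetical name.

```python
import operator

# Hypothetical encoding of comparison operation indicators (an assumption).
COMPARE_OPS = {
    0: operator.eq,
    1: operator.lt,
    2: operator.le,
    3: operator.ne,
    4: operator.ge,
    5: operator.gt,
}


def versatile_compare(src1, src2, op_indicators):
    """Apply a potentially different comparison to each pair of elements.

    Returns one result indicator (1 or 0) per comparison operation indicator.
    """
    if not (len(src1) == len(src2) == len(op_indicators)):
        raise ValueError("all three operands must have the same element count")
    return [
        int(COMPARE_OPS[op](a, b))
        for a, b, op in zip(src1, src2, op_indicators)
    ]
```

    For example, versatile_compare([1, 5, 3, 7], [2, 5, 1, 7], [1, 0, 1, 3]) evaluates 1 < 2, 5 == 5, 3 < 1, and 7 != 7 in a single call.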

    INSTRUCTION FOR ELEMENT OFFSET CALCULATION IN A MULTI-DIMENSIONAL ARRAY

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20170075691A1

    Publication date: 2017-03-16

    Application No.: US15363785

    Filing date: 2016-11-29

    Abstract: An apparatus is described having functional unit logic circuitry. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifying the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.
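
    The offset calculation can be sketched in a few lines. The abstract does not state the memory layout, so this sketch assumes row-major order with the outermost dimension listed first; the element_bytes parameter and the function name element_offset are likewise assumptions made for illustration.

```python
def element_offset(dim_sizes, coords, element_bytes=1):
    """Offset of the segment at `coords` relative to the origin segment.

    dim_sizes: size of each dimension (the first vector operand).
    coords:    coordinates of the segment (the second vector operand).
    Assumes row-major layout, outermost dimension first.
    """
    if len(dim_sizes) != len(coords):
        raise ValueError("one coordinate is required per dimension")
    offset = 0
    for size, coord in zip(dim_sizes, coords):
        if not 0 <= coord < size:
            raise IndexError("coordinate outside its dimension")
        # Horner-style accumulation: scale by this dimension, then add.
        offset = offset * size + coord
    return offset * element_bytes
```

    For a 4 x 5 x 6 array, the segment at coordinates (1, 2, 3) lies (1 * 5 + 2) * 6 + 3 = 45 elements past the origin.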

    COLLAPSING OF MULTIPLE NESTED LOOPS, METHODS, AND INSTRUCTIONS

    Publication No.: US20180373538A1

    Publication date: 2018-12-27

    Application No.: US16120983

    Filing date: 2018-09-04

    Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

    MULTI-ELEMENT INSTRUCTION WITH DIFFERENT READ AND WRITE MASKS

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20170052783A1

    Publication date: 2017-02-23

    Application No.: US15346531

    Filing date: 2016-11-08

    Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation of the set elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.
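
    The sequence of steps in the method can be sketched as a single function. Two details are assumptions, since the abstract leaves them open: the operation defaults to a sum over the selected elements, and destination elements whose write-mask bit is clear keep their previous value. The name masked_reduce_broadcast is hypothetical.

```python
def masked_reduce_broadcast(src, read_mask, write_mask, dest, op=sum):
    """Apply a read mask, reduce, replicate the result, then apply a write mask.

    Assumptions: `op` reduces a list to one value (sum by default), and
    destination elements with a clear write-mask bit keep their old value.
    """
    # Read mask selects the set of elements to operate on.
    selected = [x for x, keep in zip(src, read_mask) if keep]
    result = op(selected)
    # Output vector holds multiple instances of the operation's result.
    broadcast = [result] * len(dest)
    # Write mask decides which output elements reach the destination.
    return [b if w else d for b, w, d in zip(broadcast, write_mask, dest)]
```

    With src [1, 2, 3, 4], read mask [1, 0, 1, 0], and write mask [0, 1, 1, 0], elements 1 and 3 are summed and the result 4 is written only to the two middle destination positions.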

    INSTRUCTION SET FOR ELIMINATING MISALIGNED MEMORY ACCESSES DURING PROCESSING OF AN ARRAY HAVING MISALIGNED DATA ROWS

    Type: Invention application

    Status: Granted

    Publication No.: US20160011870A1

    Publication date: 2016-01-14

    Application No.: US14327534

    Filing date: 2014-07-09

    CPC classification number: G06F9/30036 G06F9/30032

    Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand. The instruction is not capable of routing non-contiguous groups of elements from the input vectors to the instruction's resultant vector. A software pipeline that uses the instruction is also described.
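
    The routing behavior can be approximated in software with list slicing. This sketch rests on assumptions the abstract does not confirm: the "first end" is taken to be the low-index end, both inputs have the same length, and the third operand is a single shift count that defines both contiguous groups. The name align_concat is hypothetical.

```python
def align_concat(a, b, shift):
    """Combine the tail of `a` with the head of `b` into one full vector.

    The group a[shift:] (from the high end of `a`) is routed to the low end
    of the result, and the group b[:shift] (from the low end of `b`) is
    routed to the high end. Only contiguous groups can be selected.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("input vectors must have the same length")
    if not 0 <= shift <= n:
        raise ValueError("shift must be between 0 and the vector length")
    return a[shift:] + b[:shift]
```

    A row that starts one element past an aligned boundary can then be assembled from two aligned loads: align_concat([0, 1, 2, 3], [4, 5, 6, 7], 1) gives the elements 1 through 4.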

    COLLAPSING OF MULTIPLE NESTED LOOPS, METHODS, AND INSTRUCTIONS

    Publication No.: US20190129721A1

    Publication date: 2019-05-02

    Application No.: US16233955

    Filing date: 2018-12-27

    Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

    APPARATUSES, METHODS, AND SYSTEMS FOR MULTIPLE SOURCE BLEND OPERATIONS

    Publication No.: US20180088945A1

    Publication date: 2018-03-29

    Application No.: US15274849

    Filing date: 2016-09-23

    CPC classification number: G06F9/30036 G06F9/30021

    Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.
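
    Selecting the middle value from three pairwise comparison vectors can be sketched as follows. The abstract does not specify the comparison used or the selection logic, so the greater-than comparisons and the decision rule below are assumptions chosen for the illustration, and middle_of_three is a hypothetical name.

```python
def middle_of_three(v1, v2, v3):
    """Return, per element position, the middle value of three input vectors."""
    # Three pairwise comparison vectors (assumed to be greater-than tests).
    gt12 = [a > b for a, b in zip(v1, v2)]
    gt13 = [a > c for a, c in zip(v1, v3)]
    gt23 = [b > c for b, c in zip(v2, v3)]

    out = []
    for a, b, c, ab, ac, bc in zip(v1, v2, v3, gt12, gt13, gt23):
        if ab != ac:
            out.append(a)  # a exceeds exactly one of b and c
        elif ab == bc:
            out.append(b)  # a is an extreme and b lies between a and c
        else:
            out.append(c)  # a is an extreme and c lies between a and b
    return out
```

    The selection uses only the three comparison results, so no further comparisons between the inputs are needed once the comparison vectors exist.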
