Patent search ap:"Andrey Naraikin", page 1

1.

Invention application
METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY (status: under examination, published)

Publication No.: US20180196672A1

Publication date: 2018-07-12

Application No.: US15912498

Filing date: 2018-03-05

Applicant: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

Inventor: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

IPC classification: G06F9/30, G06F9/38

CPC classification: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3834

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
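As an illustration of the per-field count described in this abstract, the following is a minimal Python sketch, not the patented implementation. The function names lzcnt and vector_lzcnt and the 32-bit default field width are assumptions made for the example; the abstract does not fix a field width.

```python
def lzcnt(value: int, width: int = 32) -> int:
    """Count the contiguous zero bits starting at the most significant bit.

    `width` is the assumed size of one data field in bits.
    """
    value &= (1 << width) - 1          # keep only the bits of one field
    return width - value.bit_length()  # an all-zero field yields `width`


def vector_lzcnt(fields: list[int], width: int = 32) -> list[int]:
    """Apply lzcnt to every data field, modelling a destination register."""
    return [lzcnt(v, width) for v in fields]


source_register = [0x00000001, 0x00008000, 0x80000000, 0x00000000]
print(vector_lzcnt(source_register))  # [31, 16, 0, 32]
```

Each output element corresponds to the source field at the same position, mirroring the "corresponding data fields" wording of the abstract.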

2.

Invention application
METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY (status: in force)

Publication No.: US20140189309A1

Publication date: 2014-07-03

Application No.: US13731008

Filing date: 2012-12-29

Applicant: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

Inventor: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

IPC classification: G06F9/30

CPC classification: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3834

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

3.

Invention application
SYSTEM, APPARATUS AND METHOD FOR LOOP REMAINDER MASK INSTRUCTION (status: under examination, published)

Publication No.: US20140189296A1

Publication date: 2014-07-03

Application No.: US13993323

Filing date: 2011-12-14

Applicant: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Andrey Naraikin, Suleyman Sair, Asaf Hargil, Miland B. Girkar, Bret T. Toll, Mark J. Charney

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Andrey Naraikin, Suleyman Sair, Asaf Hargil, Miland B. Girkar, Bret T. Toll, Mark J. Charney

IPC classification: G06F9/38

CPC classification: G06F9/3887, G06F8/4441, G06F9/30018, G06F9/30036, G06F9/30065, G06F9/30072, G06F9/3013, G06F9/325, G06F9/3818, G06F9/3824

Abstract: A loop remainder mask instruction indicates a current iteration count of a loop as a first operand, an iteration limit of a loop as a second operand, and a destination. The loop contains iterations and each iteration includes a data element of the array. A processor receives the loop remainder mask instruction, decodes the instruction for execution, and stores a result of the execution in the destination. The result indicates a number of data elements of the array past an end of a preceding portion of the array that are to be handled separately from the preceding portion, the end of the preceding portion being where the current iteration count is recorded.
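One way to picture the result of such an instruction is sketched below in Python. This is an interpretation under stated assumptions, not the claimed encoding: the abstract says only that the result "indicates a number" of remaining elements, so representing it as a bit mask with one bit per vector lane, the name loop_remainder_mask, and the vector_length parameter are all assumptions made for the example.

```python
def loop_remainder_mask(current: int, limit: int, vector_length: int = 8) -> int:
    """Return a bit mask with one set bit per remaining loop iteration.

    current       -- current iteration count (first operand)
    limit         -- iteration limit (second operand)
    vector_length -- assumed number of lanes in one vector operation
    """
    remaining = max(0, limit - current)       # elements past the processed portion
    active = min(remaining, vector_length)    # at most one vector's worth
    return (1 << active) - 1                  # low `active` bits set


# A 19-element array processed 8 lanes at a time leaves 3 elements after 16.
print(format(loop_remainder_mask(16, 19), "08b"))  # 00000111
```

With this encoding, the final partial vector of a loop could be executed under the mask instead of falling back to a scalar remainder loop.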

4.

Invention application
LOOP VECTORIZATION METHODS AND APPARATUS (status: in force)

Publication No.: US20140095850A1

Publication date: 2014-04-03

Application No.: US13994549

Filing date: 2012-09-28

Applicant: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes

Inventor: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes

IPC classification: G06F9/38

CPC classification: G06F9/38, G06F8/4441, G06F8/452, G06F9/30018, G06F9/30036, G06F15/8084

Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.
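The two steps named in this abstract, building a control mask from the loop condition and compressing the iteration indexes under that mask, can be sketched in Python as follows. The helper names control_mask and compress_indexes are hypothetical, and choosing 1 as the "first value" and 0 as the "second value" is an assumption; the abstract does not specify the values.

```python
from typing import Callable, Iterable


def control_mask(values: Iterable[int], condition: Callable[[int], bool]) -> list[int]:
    """Set a mask bit to 1 where the loop body should run, 0 where it is bypassed."""
    return [1 if condition(v) else 0 for v in values]


def compress_indexes(indexes: Iterable[int], mask: list[int]) -> list[int]:
    """Pack the indexes whose mask bit is 1 into contiguous positions."""
    return [i for i, bit in zip(indexes, mask) if bit]


data = [5, -2, 7, 0, -9, 3]
mask = control_mask(data, lambda v: v > 0)   # loop condition: v > 0
print(mask)                                  # [1, 0, 1, 0, 0, 1]
print(compress_indexes(range(len(data)), mask))  # [0, 2, 5]
```

After compression, the surviving indexes are dense, so later vector operations can work on full vectors of iterations that actually execute the loop body.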

5.

Invention application
METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY (status: under examination, published)

Publication No.: US20180196671A1

Publication date: 2018-07-12

Application No.: US15912486

Filing date: 2018-03-05

Applicant: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

Inventor: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

IPC classification: G06F9/30, G06F9/38

CPC classification: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3834

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

6.

Granted invention patent
Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op) (status: in force)

Publication No.: US09218182B2

Publication date: 2015-12-22

Application No.: US13539116

Filing date: 2012-06-29

Applicant: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin

Inventor: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin

IPC classification: G06F3/00, G06F9/30

CPC classification: G06F9/30036, G06F9/3001, G06F9/30029, G06F9/30032, G06F9/30098, G06F9/30145, G06F9/3016

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.

7.

Invention application
UNIQUE PACKED DATA ELEMENT IDENTIFICATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS (status: under examination, published)

Publication No.: US20140351567A1

Publication date: 2014-11-27

Application No.: US13977686

Filing date: 2011-12-30

Applicant: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall, Sergey Shalnov

Inventor: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall, Sergey Shalnov

IPC classification: G06F9/30

CPC classification: G06F9/30018, G06F9/30021, G06F9/30036, G06F9/3004, G06F9/4552

Abstract: A method of an aspect includes receiving a unique packed data element identification instruction. The unique packed data element identification instruction indicates a source packed data having a plurality of packed data elements and indicates a destination storage location. A unique packed data element identification result is stored in the destination storage location in response to the unique packed data element identification instruction. The unique packed data element identification result indicates which of the plurality of the packed data elements are unique in the source packed data. Other methods, apparatus, systems, and instructions are disclosed.
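A possible reading of the identification result is sketched below in Python. The abstract does not define "unique" precisely, so this sketch assumes an element is flagged when no earlier element of the source holds the same value (first occurrence wins), and it assumes the result is a bit mask with one bit per element. The name unique_element_mask is hypothetical.

```python
def unique_element_mask(elements: list[int]) -> int:
    """Return a bit mask whose bit i is set when elements[i] is a first occurrence.

    Assumption: "unique" means "not equal to any earlier element"; the
    abstract leaves the exact rule and the result encoding open.
    """
    seen: set[int] = set()
    mask = 0
    for position, value in enumerate(elements):
        if value not in seen:
            mask |= 1 << position   # flag the first occurrence of this value
            seen.add(value)
    return mask


packed = [3, 7, 3, 1, 7, 7, 2, 1]
print(format(unique_element_mask(packed), "08b"))  # 01001011
```

Bit 0 of the printed string is the rightmost character, so positions 0, 1, 3 and 6 are the flagged elements in this example.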

8.

Invention application
Vectorization Of Collapsed Multi-Nested Loops (status: under examination, published)

Publication No.: US20140188961A1

Publication date: 2014-07-03

Application No.: US13728439

Filing date: 2012-12-27

Applicant: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall

Inventor: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F17/11

Abstract: In an embodiment a method of vectorizing a collapsed multi-nested loop includes executing, in a vector unit of a processor, the collapsed loop to obtain a vector of offsets, including for each of a plurality of iterations, calculating a scalar offset into a multi-dimensional data structure, storing the scalar offset in a data element of a first vector register, and updating a loop counter value of a multi-dimensional loop counter vector. In turn, a plurality of data elements are loaded from the multi-dimensional data structure using a base value and indexes from the vector of offsets, at least one computation is performed on the loaded plurality of data elements to obtain a plurality of results, and the plurality of results are stored into the multi-dimensional data structure using the base value and the indexes from the vector of offsets. Other embodiments are described and claimed.
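The offset-generation step and the following load, compute, store sequence can be illustrated with a short Python sketch. It is a scalar model under stated assumptions, not the claimed vector implementation: the function name collapsed_offsets, the row-major strides, the example loop bounds, and the doubling computation are all chosen for illustration only.

```python
def collapsed_offsets(counter, bounds, strides, vector_length):
    """Produce `vector_length` scalar offsets and the updated loop counter.

    counter -- current value of the multi-dimensional loop counter
    bounds  -- trip count of each nested loop, outermost first
    strides -- element stride of each dimension in the data structure
    """
    counter = list(counter)
    offsets = []
    for _ in range(vector_length):
        # Scalar offset of the current iteration into the data structure.
        offsets.append(sum(c * s for c, s in zip(counter, strides)))
        # Advance the counter like an odometer, innermost dimension first.
        for dim in reversed(range(len(counter))):
            counter[dim] += 1
            if counter[dim] < bounds[dim]:
                break
            counter[dim] = 0
    return offsets, counter


# A 3 x 3 loop nest over a row-major array with 8 elements per row.
data = list(range(32))
base = 0
offsets, counter = collapsed_offsets([0, 0], [3, 3], [8, 1], 8)
print(offsets, counter)  # [0, 1, 2, 8, 9, 10, 16, 17] [2, 2]

loaded = [data[base + o] for o in offsets]   # gather via base + offsets
results = [x * 2 for x in loaded]            # example computation
for o, r in zip(offsets, results):           # scatter back via same offsets
    data[base + o] = r
```

The returned counter lets the next call resume at iteration (2, 2); masking off lanes past the end of the loop nest is omitted from this sketch.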

9.

Invention application
INSTRUCTION FOR SHIFTING BITS LEFT WITH PULLING ONES INTO LESS SIGNIFICANT BITS (status: in force)

Publication No.: US20140095830A1

Publication date: 2014-04-03

Application No.: US13630131

Filing date: 2012-09-28

Applicant: Mikhail Plotnikov, Igor Ermolaev, Andrey Naraikin, Robert Valentine

Inventor: Mikhail Plotnikov, Igor Ermolaev, Andrey Naraikin, Robert Valentine

IPC classification: G06F9/315

CPC classification: G06F9/30032, G06F9/30018, G06F9/30036, G06F9/30065, G06F9/30072, G06F9/325

Abstract: A mask generating instruction is executed by a processor to improve efficiency of vector operations on an array of data elements. The processor includes vector registers, one of which stores data elements of an array. The processor further includes execution circuitry to receive a mask generating instruction that specifies at least a first operand and a second operand. Responsive to the mask generating instruction, the execution circuitry is to shift bits of the first operand to the left by a number of times defined in the second operand, and pull in a bit of one from the right each time a most significant bit of the first operand is shifted out from the left to generate a result. Each bit in the result corresponds to one of the data elements of the array.
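The shift described in this abstract can be modelled with ordinary integer arithmetic, as in the Python sketch below. The function name shift_left_pull_ones, the width parameter (the assumed number of mask bits), and the clamping of the shift count to the width are assumptions made for the example.

```python
def shift_left_pull_ones(value: int, count: int, width: int = 16) -> int:
    """Shift `value` left by `count` bits, filling vacated low bits with ones.

    value -- first operand (the mask being shifted)
    count -- second operand (number of single-bit shifts)
    width -- assumed mask width in bits; bits shifted past it are dropped
    """
    full = (1 << width) - 1
    count = min(count, width)            # shifting further changes nothing
    pulled_in = (1 << count) - 1         # `count` ones entering from the right
    return ((value << count) | pulled_in) & full


print(format(shift_left_pull_ones(0b00000000, 3, width=8), "08b"))  # 00000111
print(format(shift_left_pull_ones(0b10100000, 2, width=8), "08b"))  # 10000011
```

Starting from an all-zero operand, a shift by n yields a mask with the n lowest bits set, which is one way such an instruction could mark the first n data elements of an array as active.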

10.

Invention application
METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE PERMUTE CONTROLS WITH LEADING ZERO COUNT FUNCTIONALITY (status: under examination, published)

Publication No.: US20180196670A1

Publication date: 2018-07-12

Application No.: US15912468

Filing date: 2018-03-05

Applicant: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

Inventor: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine

IPC classification: G06F9/30, G06F9/38

CPC classification: G06F9/30145, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3834

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
