Patent search ap:("Rama Kishan V. Malladi" OR "Elmoustapha Ould-Ahmed-Vall") AND inv:"Elmoustapha Ould-Ahmed-Vall" Page 1

1.

Granted patent
Systems, apparatuses, and methods for arithmetic recurrence
Status: In force

Publication No.: US10120680B2

Publication date: 2018-11-06

Application No.: US15396184

Filing date: 2016-12-30

Applicant: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall

Inventor: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F9/30

Abstract: Embodiments of systems, apparatuses, and methods for broadcast arithmetic in a processor are described. For example, execution circuitry executes a decoded instruction to broadcast a data value from a least significant packed data element position of a first packed data source operand to a plurality of arithmetic circuits and for each packed data element position of a second packed data source operand, other than a least significant packed data element position, perform the arithmetic operation defined by the instruction on a data value from that packed data element position of the second packed data source operand and all data values from packed data element positions of the second packed data source operand that are of lesser position significance to the broadcast data value from the least significant packed data element position of the first packed data source operand, and stores a result of each arithmetic operation into a packed data element position of the packed data destination operand that corresponds to a most significant packed data element position of the second packed data source operand.

2.

Granted patent
Processors, methods, systems, and instructions to generate sequences of consecutive integers in numerical order
Status: In force

Publication No.: US10565283B2

Publication date: 2020-02-18

Application No.: US13976766

Filing date: 2011-12-22

Applicant: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein

Inventor: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30, G06F17/10

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four consecutive non-negative integers in numerical order. In an aspect, the instruction does not indicate a source packed data operand having a plurality of packed data elements in an architecturally-visible storage location. Other methods, apparatus, systems, and instructions are disclosed.
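
The result described in this abstract can be modeled in a few lines. The sketch below is only a software illustration, not the patented hardware method: the function name, the `start` parameter, and the choice of ascending order are assumptions that do not appear in the record.

```python
from typing import List


def consecutive_integers(num_elements: int, start: int = 0) -> List[int]:
    """Model a destination filled with consecutive non-negative integers.

    Hypothetical helper: `num_elements` stands in for the number of packed
    data elements in the destination, and `start` is an assumed parameter.
    No source vector is read, matching the abstract's statement that the
    instruction needs no source packed data operand.
    """
    if num_elements < 4:
        raise ValueError("the abstract describes at least four integers")
    if start < 0:
        raise ValueError("the integers must be non-negative")
    # Ascending order is an assumption; the abstract only says "numerical order".
    return list(range(start, start + num_elements))


print(consecutive_integers(8))
```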

3.

Patent application
Systems, Methods, and Apparatuses for Improving Vector Throughput
Status: Under examination (published)

Publication No.: US20170192789A1

Publication date: 2017-07-06

Application No.: US14984157

Filing date: 2015-12-30

Applicant: Rama Kishnan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev

Inventor: Rama Kishnan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev

IPC classification: G06F9/38, G06F9/30, G06F15/80

CPC classification: G06F9/384, G06F9/30036, G06F9/30109, G06F9/30112

Abstract: Detailed herein are systems, methods, and apparatuses for improving vector throughput. For example, an apparatus comprising a plurality of aliasable registers, wherein each of the plurality of aliasable registers is partitioned into a plurality of lanes and each lane is aliasable as a distinct register; and execution circuitry to execute instructions using data from the plurality of aliasable registers as input and output operands is described.

4.

Patent application
Systems, Apparatuses, and Methods for Lane-Based Strided Gather
Status: Under examination (published)

Publication No.: US20170192784A1

Publication date: 2017-07-06

Application No.: US14984233

Filing date: 2015-12-30

Applicant: Elmoustapha Ould-Ahmed-Vall

Inventor: Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F9/30

CPC classification: G06F9/3016, G06F9/30036, G06F9/30043, G06F9/3013, G06F9/30192, G06F9/3455

Abstract: Embodiments of systems, apparatuses, and methods for lane-based strided gather are disclosed. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for indices of addresses to memory, and a packed data destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from memory using the indices of the instruction, and for each type, store the extracted data elements in one or more lanes of a packed data destination register dedicated to that type, wherein relative data elements between types are strided data elements apart.
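
One plausible reading of this abstract is an array-of-structures to structure-of-arrays gather. The sketch below models that reading in software. It assumes that each index points at the first element of an interleaved record in a flat memory list, that a record holds one element per type, and that each type gets one lane as wide as the number of indices. None of these details, nor the function name, come from the record itself.

```python
from typing import List, Sequence


def lane_strided_gather(memory: Sequence[int],
                        indices: Sequence[int],
                        num_types: int) -> List[int]:
    """Gather interleaved records into per-type lanes (hypothetical model).

    Assumption: record k starts at memory[indices[k]] and stores its
    num_types elements contiguously.
    """
    lane_width = len(indices)
    dest = [0] * (num_types * lane_width)
    for k, base in enumerate(indices):
        for t in range(num_types):
            # Element of type t from record k lands in lane t, slot k, so
            # the same record's elements sit lane_width positions apart.
            dest[t * lane_width + k] = memory[base + t]
    return dest


# Four (x, y, z) records stored back to back in memory.
memory = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]
print(lane_strided_gather(memory, indices=[0, 3, 6, 9], num_types=3))
```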

5.

Granted patent
Apparatus and method of improved permute instructions
Status: In force

Publication No.: US09658850B2

Publication date: 2017-05-23

Application No.: US13976993

Filing date: 2011-12-23

Applicant: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30

CPC classification: G06F9/30029, G06F9/30018, G06F9/30032, G06F9/30036

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
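
The routing-then-masking behavior described in this abstract can be sketched in software as follows. This is an illustration under assumptions, not the claimed circuitry: the function name is hypothetical, merge-style masking (masked-off positions keep the old destination value) is assumed, and the three element bit widths are not modeled.

```python
from typing import List, Sequence


def masked_permute(src: Sequence[int],
                   indices: Sequence[int],
                   mask: Sequence[bool],
                   dest: Sequence[int]) -> List[int]:
    """Route src elements to output positions, then apply a per-element mask.

    Assumption: where mask[i] is false, the previous destination element is
    kept. The abstract only says that a masking layer masks the routed result.
    """
    if not (len(indices) == len(mask) == len(dest)):
        raise ValueError("indices, mask and dest must have the same length")
    out = []
    for idx, keep, old in zip(indices, mask, dest):
        if not 0 <= idx < len(src):
            raise IndexError("index %d is outside the source vector" % idx)
        # Any input element may source any output element, repeats included.
        out.append(src[idx] if keep else old)
    return out


print(masked_permute([10, 20, 30, 40], [3, 3, 0, 1],
                     [True, False, True, True], [0, 0, 0, 0]))
```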

6.

Granted patent
Apparatus and method of improved insert instructions
Status: In force

Publication No.: US09619236B2

Publication date: 2017-04-11

Application No.: US13976992

Filing date: 2011-12-23

Applicant: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30

CPC classification: G06F9/30181, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3013, G06F9/30167, G06F9/3802

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

7.

Granted patent
Systems, apparatuses, and methods for performing a double blocked sum of absolute differences
Status: In force

Publication No.: US09582464B2

Publication date: 2017-02-28

Application No.: US13992229

Filing date: 2011-12-23

Applicant: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

Inventor: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

IPC classification: G06F7/38, G06F9/302, G06F15/78, G06F9/30, G06F7/544, G06F9/38, G06F7/50

CPC classification: G06F9/3001, G06F7/50, G06F7/544, G06F9/30036, G06F9/3836, G06F9/3877, G06F15/78, G06F2207/5442

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.


8.

Granted patent
Systems, apparatuses, and methods for performing delta decoding on packed data elements
Status: In force

Publication No.: US09557998B2

Publication date: 2017-01-31

Application No.: US13997662

Filing date: 2011-12-28

Applicant: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm, Tracy Garrett Drysdale

Inventor: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm, Tracy Garrett Drysdale

IPC classification: G06F9/30, H04N19/42

CPC classification: G06F9/30145, G06F9/3001, G06F9/30014, G06F9/30018, G06F9/30036, G06F9/30105, G06F9/30109, G06F9/30112, G06F9/3013, H04N19/42

Abstract: Systems, apparatuses, and methods for performing delta decoding on packed data elements of a source and storing the results in packed data elements of a destination using a single packed delta decode instruction are described. A processor may include a decoder to decode an instruction, and execution unit to execute the decoded instruction to calculate for each packed data element position of a source operand, other than a first packed data element position, a value that comprises a packed data element of that packed data element position and all packed data elements of packed data element positions that are of lesser significance, store a first packed data element from the first packed data element position of the source operand into a corresponding first packed data element position of a destination operand, and for each calculated value, store the value into a corresponding packed data element position of the destination operand.
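
Delta decoding as outlined in this abstract amounts to a running (prefix) sum over the source elements. The sketch below shows that computation in software. It assumes the combining operation is addition and ignores fixed-width overflow, neither of which the abstract spells out, and the function name is hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def delta_decode(src: Sequence[int]) -> List[int]:
    """Reconstruct values from deltas with a running sum.

    dst[0] is copied from src[0]; every later dst[i] combines src[i] with
    all elements at less significant positions. Addition is assumed here.
    """
    # accumulate yields src[0], src[0]+src[1], src[0]+src[1]+src[2], ...
    return list(accumulate(src))


# A first value of 5 followed by the deltas 1, 1 and 3.
print(delta_decode([5, 1, 1, 3]))
```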


9.

Granted patent
Instruction and logic to provide vector horizontal majority voting functionality
Status: In force

Publication No.: US09448794B2

Publication date: 2016-09-20

Application No.: US13977735

Filing date: 2011-11-30

Applicant: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount

Inventor: Elmoustapha Ould-Ahmed-Vall, Kshitij A. Doshi, Suleyman Sair, Charles R. Yount

IPC classification: G06F11/00, G06F9/30, G06F11/14, G06F11/10

CPC classification: G06F9/30036, G06F7/22, G06F7/544, G06F9/30018, G06F9/30021, G06F9/30101, G06F9/30145, G06F9/3016, G06F11/1048, G06F11/1479

Abstract: Instructions and logic provide vector horizontal majority voting functionality. Some embodiments, responsive to an instruction specifying: a destination operand, a size of the vector elements, a source operand, and a mask corresponding to a portion of the vector element data fields in the source operand; read a number of values from data fields of the specified size in the source operand, corresponding to the mask specified by the instruction and store a result value to that number of corresponding data fields in the destination operand, the result value computed from the majority of values read from the number of data fields of the source operand.
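
The masked vote described in this abstract can be sketched as below. This is a software illustration under assumptions: whole elements are treated as votes (a bitwise vote would be equally consistent with the abstract), masked-off destination fields are left unchanged, and ties fall to whichever value `Counter.most_common` returns first. The function name is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def horizontal_majority(src: Sequence[int],
                        mask: Sequence[bool],
                        dest: Sequence[int]) -> List[int]:
    """Write the majority value of the masked src fields to the masked dest fields."""
    selected = [value for value, use in zip(src, mask) if use]
    if not selected:
        # Nothing selected: leave the destination as it was (assumption).
        return list(dest)
    # Element-wise vote; tie handling is an assumption, not from the abstract.
    winner, _count = Counter(selected).most_common(1)[0]
    return [winner if use else old for use, old in zip(mask, dest)]


src = [7, 7, 9, 7, 3, 3, 3, 3]
mask = [True, True, True, True, False, False, False, False]
print(horizontal_majority(src, mask, [0] * 8))
```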


10.

Granted patent
Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op)
Status: In force

Publication No.: US09218182B2

Publication date: 2015-12-22

Application No.: US13539116

Filing date: 2012-06-29

Applicant: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin

Inventor: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin

IPC classification: G06F3/00, G06F9/30

CPC classification: G06F9/30036, G06F9/3001, G06F9/30029, G06F9/30032, G06F9/30098, G06F9/30145, G06F9/3016

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.

