Patent search: ap:"ElMoustapha Ould-Ahmed-Vall" (page 1)

1.

Invention application
APPARATUS AND METHOD FOR PROCESSING STRUCTURE OF ARRAYS (SOA) AND ARRAY OF STRUCTURES (AOS) DATA (status: pending, published)

Publication number: US20200097298A1

Publication date: 2020-03-26

Application number: US16140294

Filing date: 2018-09-24

Applicants: CHRISTOPHER J. HUGHES, BRET TOLL, ALEXANDER HEINECKE, DAN BAUM, ELMOUSTAPHA OULD-AHMED-VALL, RAANAN SADE, ROBERT VALENTINE, MARK CHARNEY

Inventors: CHRISTOPHER J. HUGHES, BRET TOLL, ALEXANDER HEINECKE, DAN BAUM, ELMOUSTAPHA OULD-AHMED-VALL, RAANAN SADE, ROBERT VALENTINE, MARK CHARNEY

IPC classification: G06F9/38, G06F9/30, G06F15/80

Abstract: An apparatus and method for processing array of structures (AoS) and structure of arrays (SoA) data. For example, one embodiment of a processor comprises: a destination tile register to store data elements in a structure of arrays (SoA) format; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch an array of structures (AoS) gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the AoS gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register, to read data elements from the system memory addresses in an AoS format, and to load the data elements to the destination tile register in an SoA format.
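
The data movement in this abstract can be modeled in software. The following is a minimal Python sketch, not the patented circuitry and not any real instruction: it derives one address per index, reads each structure from a flat AoS memory, and writes the fields into a field-major (SoA) tile. The function name `aos_gather`, the flat `memory` list, the `base` address, and the fixed `fields_per_struct` layout are all assumptions made for illustration.

```python
from typing import List, Sequence


def aos_gather(memory: Sequence[float], base: int, indices: Sequence[int],
               fields_per_struct: int) -> List[List[float]]:
    """Gather structures stored in AoS order and return them in SoA order.

    Row f of the returned tile holds field f of every gathered structure.
    """
    tile = [[0.0] * len(indices) for _ in range(fields_per_struct)]
    for col, idx in enumerate(indices):
        # Address of the structure selected by this index (assumed layout).
        addr = base + idx * fields_per_struct
        for f in range(fields_per_struct):
            tile[f][col] = memory[addr + f]
    return tile


# Three (x, y, z) points stored back to back in AoS order.
memory = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
tile = aos_gather(memory, base=0, indices=[2, 0], fields_per_struct=3)
```

In this example `tile[0]` holds the x fields of structures 2 and 0, `tile[1]` the y fields, and `tile[2]` the z fields.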

2.

Granted patent
Processors, methods, systems, and instructions to generate sequences of consecutive integers in numerical order (status: in force)

Publication number: US10565283B2

Publication date: 2020-02-18

Application number: US13976766

Filing date: 2011-12-22

Applicants: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein

Inventors: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30, G06F17/10

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four consecutive non-negative integers in numerical order. In an aspect, the instruction does not indicate a source packed data operand having a plurality of packed data elements in an architecturally-visible storage location. Other methods, apparatus, systems, and instructions are disclosed.
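
The result described here can be illustrated with a short Python sketch. It only models the value written to the destination; the function name `consecutive_integers` and the optional `start` parameter are hypothetical and do not come from the patent. Note that the function takes no input vector, mirroring the abstract's point that no source packed data operand is indicated.

```python
from typing import List


def consecutive_integers(num_elements: int, start: int = 0) -> List[int]:
    """Return num_elements consecutive non-negative integers in ascending order."""
    if num_elements < 4:
        # The abstract speaks of at least four integers in the result.
        raise ValueError("num_elements must be at least 4")
    if start < 0:
        raise ValueError("start must be non-negative")
    return list(range(start, start + num_elements))


destination = consecutive_integers(8)
```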

3.

Invention application
APPARATUS AND METHOD FOR PROCESSING RECIPROCAL SQUARE ROOT OPERATIONS (status: pending, published)

Publication number: US20190196790A1

Publication date: 2019-06-27

Application number: US15850673

Filing date: 2017-12-21

Applicants: Cristina ANDERSON, Elmoustapha OULD-AHMED-VALL, Marius CORNEA-HASEGAN, Robert VALENTINE, Mark CHARNEY, Jesus CORBAL, Venkateswara MADDURI

Inventors: Cristina ANDERSON, Elmoustapha OULD-AHMED-VALL, Marius CORNEA-HASEGAN, Robert VALENTINE, Mark CHARNEY, Jesus CORBAL, Venkateswara MADDURI

IPC classification: G06F7/552, G06F9/30

Abstract: An apparatus and method for performing a reciprocal square root. For example one embodiment of a processor comprises: a decoder to decode a reciprocal square root instruction to generate a decoded reciprocal square root instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal square root execution circuitry to execute the decoded reciprocal square root instruction, the reciprocal square root execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal square root execution circuitry to generate a reciprocal square root of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.
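
The table-plus-coefficients idea in this abstract can be sketched in Python. This is an illustrative model under stated assumptions, not the patented design: the input is modeled as a 23-bit fraction `frac` representing x = 1 + frac / 2**23 in [1, 2), the top 4 fraction bits form the index, and each coefficient set holds the first three Taylor terms of x**-0.5 at the interval midpoint. The bit widths, the quadratic form, and all names (`TABLE`, `rsqrt_approx`, and so on) are assumptions; exponent handling is omitted.

```python
FRAC_BITS = 23   # fraction width of the modeled input (assumption)
INDEX_BITS = 4   # high fraction bits used as the table index (assumption)
LOW_BITS = FRAC_BITS - INDEX_BITS


def build_table():
    """One (c0, c1, c2) set per interval: Taylor terms of x**-0.5 at the midpoint."""
    table = []
    for i in range(1 << INDEX_BITS):
        mid = 1.0 + (i + 0.5) / (1 << INDEX_BITS)
        table.append((mid ** -0.5, -0.5 * mid ** -1.5, 0.375 * mid ** -2.5))
    return table


TABLE = build_table()


def rsqrt_approx(frac: int) -> float:
    """Approximate 1/sqrt(x) for x = 1 + frac / 2**FRAC_BITS, with x in [1, 2)."""
    index = frac >> LOW_BITS              # first portion: selects a coefficient set
    low = frac & ((1 << LOW_BITS) - 1)    # second portion: position inside the interval
    c0, c1, c2 = TABLE[index]
    # Offset of x from the midpoint of the selected interval.
    d = low / (1 << FRAC_BITS) - 0.5 / (1 << INDEX_BITS)
    # Combine the coefficients with the second portion (Horner form).
    return c0 + d * (c1 + d * c2)
```

With 16 intervals the quadratic fit stays within roughly 1e-5 of the exact value on [1, 2); a larger table or a higher-degree polynomial would trade storage for accuracy.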

4.

Invention application
APPARATUS AND METHOD FOR PROCESSING FRACTIONAL RECIPROCAL OPERATIONS (status: pending, published)

Publication number: US20190196789A1

Publication date: 2019-06-27

Application number: US15850636

Filing date: 2017-12-21

Applicants: Cristina ANDERSON, Elmoustapha OULD-AHMED-VALL, Marius CORNEA-HASEGAN, Robert VALENTINE, Mark CHARNEY, Jesus CORBAL, Venkateswara MADDURI

Inventors: Cristina ANDERSON, Elmoustapha OULD-AHMED-VALL, Marius CORNEA-HASEGAN, Robert VALENTINE, Mark CHARNEY, Jesus CORBAL, Venkateswara MADDURI

IPC classification: G06F7/552, G06F9/30

Abstract: An apparatus and method for performing a reciprocal. For example one embodiment of a processor comprises: a decoder to decode a reciprocal instruction to generate a decoded reciprocal instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal execution circuitry to execute the decoded reciprocal instruction, the reciprocal execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal execution circuitry to generate a reciprocal of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.

5.

Invention application
Systems, Methods, and Apparatuses for Improving Vector Throughput (status: pending, published)

Publication number: US20170192789A1

Publication date: 2017-07-06

Application number: US14984157

Filing date: 2015-12-30

Applicants: Rama Kishnan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev

Inventors: Rama Kishnan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev

IPC classification: G06F9/38, G06F9/30, G06F15/80

CPC classification: G06F9/384, G06F9/30036, G06F9/30109, G06F9/30112

Abstract: Detailed herein are systems, methods, and apparatuses for improving vector throughput. For example, an apparatus comprising a plurality of aliasable registers, wherein each of the plurality of aliasable registers is partitioned into a plurality of lanes and each lane is aliasable as a distinct register; and execution circuitry to execute instructions using data from the plurality of aliasable registers as input and output operands is described.

6.

Invention application
Systems, Apparatuses, and Methods for Lane-Based Strided Gather (status: pending, published)

Publication number: US20170192784A1

Publication date: 2017-07-06

Application number: US14984233

Filing date: 2015-12-30

Applicant: Elmoustapha Ould-Ahmed-Vall

Inventor: Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F9/30

CPC classification: G06F9/3016, G06F9/30036, G06F9/30043, G06F9/3013, G06F9/30192, G06F9/3455

Abstract: Embodiments of systems, apparatuses, and methods for lane-based strided gather are disclosed. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields for indices of addresses to memory, and a packed data destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from memory using the indices of the instruction, and for each type, store the extracted data elements in one or more lanes of a packed data destination register dedicated to that type, wherein relative data elements between types are strided data elements apart.

7.

Granted patent
Apparatus and method of improved permute instructions (status: in force)

Publication number: US09658850B2

Publication date: 2017-05-23

Application number: US13976993

Filing date: 2011-12-23

Applicants: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30

CPC classification: G06F9/30029, G06F9/30018, G06F9/30032, G06F9/30036

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

8.

Granted patent
Apparatus and method of improved insert instructions (status: in force)

Publication number: US09619236B2

Publication date: 2017-04-11

Application number: US13976992

Filing date: 2011-12-23

Applicants: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30

CPC classification: G06F9/30181, G06F9/30018, G06F9/30032, G06F9/30036, G06F9/3013, G06F9/30167, G06F9/3802

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

9.

Granted patent
Systems, apparatuses, and methods for performing a double blocked sum of absolute differences (status: in force)

Publication number: US09582464B2

Publication date: 2017-02-28

Application number: US13992229

Filing date: 2011-12-23

Applicants: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

IPC classification: G06F7/38, G06F9/302, G06F15/78, G06F9/30, G06F7/544, G06F9/38, G06F7/50

CPC classification: G06F9/3001, G06F7/50, G06F7/544, G06F9/30036, G06F9/3836, G06F9/3877, G06F15/78, G06F2207/5442

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

10.

Granted patent
Systems, apparatuses, and methods for performing delta decoding on packed data elements (status: in force)

Publication number: US09557998B2

Publication date: 2017-01-31

Application number: US13997662

Filing date: 2011-12-28

Applicants: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm, Tracy Garrett Drysdale

Inventors: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm, Tracy Garrett Drysdale

IPC classification: G06F9/30, H04N19/42

CPC classification: G06F9/30145, G06F9/3001, G06F9/30014, G06F9/30018, G06F9/30036, G06F9/30105, G06F9/30109, G06F9/30112, G06F9/3013, H04N19/42

Abstract: Systems, apparatuses, and methods for performing delta decoding on packed data elements of a source and storing the results in packed data elements of a destination using a single packed delta decode instruction are described. A processor may include a decoder to decode an instruction, and execution unit to execute the decoded instruction to calculate for each packed data element position of a source operand, other than a first packed data element position, a value that comprises a packed data element of that packed data element position and all packed data elements of packed data element positions that are of lesser significance, store a first packed data element from the first packed data element position of the source operand into a corresponding first packed data element position of a destination operand, and for each calculated value, store the value into a corresponding packed data element position of the destination operand.
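
The computation in this abstract can be read as a running total over the source elements. The Python sketch below assumes that the value combining an element with all less significant elements is their sum, which is the usual meaning of delta decoding but is not spelled out in the abstract. The function name `delta_decode` is hypothetical, and element width and overflow behavior are not modeled.

```python
from itertools import accumulate
from typing import List, Sequence


def delta_decode(source: Sequence[int]) -> List[int]:
    """Decode a delta-encoded sequence.

    Position 0 is copied unchanged; every later position receives the sum of
    its own element and all elements at lower positions.
    """
    return list(accumulate(source))


# A base value followed by three differences.
decoded = delta_decode([10, 1, 2, -3])
```

For this input the first element, 10, is copied through and the remaining positions become 11, 13, and 10.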
