Patent search: ap:("Intel Corporation") AND inv:"Amit Gradstein", page 9

81.

Granted invention patent
Efficient implementation of complex vector fused multiply add and complex vector multiply (In force)

Publication No.: US10521226B2

Publication date: 2019-12-31

Application No.: US15941531

Filing date: 2018-03-30

Applicant: Intel Corporation

Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay

IPC classification: G06F9/30, G06F17/16, G06F9/38

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
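
Illustrative note (not part of the patent record): the two-step scheme in this abstract can be sketched in software as below. The sketch assumes complex vectors are stored as flat lists with the real part at even indices and the imaginary part at odd indices; the function name `complex_vector_fma` is hypothetical. Python rounds the multiply and the add separately, so this models only the data movement and signs, not the single rounding of a hardware FMA.

```python
def complex_vector_fma(a, b, c):
    """Return a*b + c for interleaved (real, imag) vectors a, b, c."""
    n = len(a)

    # Step 1: duplicate the even (real) elements of the multiplicand,
    # then compute temp = a * b_even + c lane by lane.
    b_even = [b[i - (i % 2)] for i in range(n)]
    temp = [a[i] * b_even[i] + c[i] for i in range(n)]

    # Step 2: duplicate the odd (imaginary) elements of the multiplicand,
    # swap even/odd elements of the multiplier, and negate the product
    # in even lanes before adding the temporary vector.
    b_odd = [b[i - (i % 2) + 1] for i in range(n)]
    a_swapped = [a[i ^ 1] for i in range(n)]
    result = []
    for i in range(n):
        product = a_swapped[i] * b_odd[i]
        if i % 2 == 0:
            product = -product
        result.append(product + temp[i])
    return result
```

For example, with a = (1+2i, 3+4i), b = (5+6i, 7+8i) and c = (1+1i, 0), the even lanes end up holding ar*br - ai*bi + cr and the odd lanes ar*bi + ai*br + ci, which is the usual complex multiply-add.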

82.

Granted invention patent
Systems, apparatuses, and methods for performing a double blocked sum of absolute differences (In force)

Publication No.: US10303471B2

Publication date: 2019-05-28

Application No.: US15445741

Filing date: 2017-02-28

Applicant: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

IPC classification: G06F9/302, G06F7/544, G06F15/78, G06F9/30, G06F9/38, G06F7/50

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

83.

Granted invention patent
Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset (In force)

Publication No.: US10223111B2

Publication date: 2019-03-05

Application No.: US15721796

Filing date: 2017-09-30

Applicant: Intel Corporation

Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30, G06F9/345

Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
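
Illustrative note (not part of the patent record): the result described in this abstract is an arithmetic sequence. A minimal sketch follows, assuming a non-negative stride so that the offset is the smallest element; the function name `stride_sequence` and the `count` parameter are hypothetical stand-ins for the destination's element count.

```python
def stride_sequence(stride, offset, count=4):
    """Return `count` integers starting at `offset`, spaced by `stride`."""
    if count < 4:
        raise ValueError("the abstract describes at least four integers")
    return [offset + i * stride for i in range(count)]
```

For instance, `stride_sequence(3, 2, 8)` yields the eight values 2, 5, 8, ... up to 23.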

84.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCT OPERATIONS (Under examination, published)

Publication No.: US20190042541A1

Publication date: 2019-02-07

Application No.: US15859271

Filing date: 2017-12-29

Applicant: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman

IPC classification: G06F17/16, G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.
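
Illustrative note (not part of the patent record): one plausible reading of "a dot product of signed words accumulated in a quadword" is sketched below for a single row. The layout is an assumption: each 64-bit accumulator element is paired with four signed 16-bit words from each source. The name `dot_accumulate` is hypothetical, and 64-bit overflow behavior is not modeled because Python integers are unbounded.

```python
WORDS_PER_QUADWORD = 4  # assumption: four 16-bit words per 64-bit element


def dot_accumulate(acc, a_words, b_words):
    """Add a 4-word signed dot product to each quadword accumulator."""
    for w in list(a_words) + list(b_words):
        if not -32768 <= w <= 32767:
            raise ValueError("inputs must fit in a signed 16-bit word")
    out = []
    for q, acc_q in enumerate(acc):
        lo = q * WORDS_PER_QUADWORD
        hi = lo + WORDS_PER_QUADWORD
        dot = sum(x * y for x, y in zip(a_words[lo:hi], b_words[lo:hi]))
        out.append(acc_q + dot)
    return out
```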

85.

Granted invention patent
Floating point (FP) add low instructions functional unit (In force)

Publication No.: US09996319B2

Publication date: 2018-06-12

Application No.: US14998366

Filing date: 2015-12-23

Applicant: Intel Corporation

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber

IPC classification: G06F7/48, G06F7/485

CPC classification: G06F7/485

Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
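
Illustrative note (not part of the patent record): the sequence of steps in this abstract amounts to returning the rounding error of a floating-point addition. The sketch below emulates it in software with exact rational arithmetic, assuming finite inputs and no overflow; the function name `add_low` is hypothetical and nothing here reflects the actual functional unit.

```python
from fractions import Fraction


def add_low(x, y):
    """Return (ADD value, ADD low value) for two finite floats."""
    add_value = x + y                             # rounded sum
    exact_sum = Fraction(x) + Fraction(y)         # unrounded sum
    difference = exact_sum - Fraction(add_value)  # what rounding discarded
    return add_value, float(difference)           # round the difference
```

With round-to-nearest addition the discarded part is itself representable, so the ADD value plus the ADD low value reproduces the exact sum of the inputs.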

86.

Invention application
Instruction and Logic for Early Underflow Detection and Rounder Bypass (Under examination, published)

Publication No.: US20180088940A1

Publication date: 2018-03-29

Application No.: US15280324

Filing date: 2016-09-29

Applicant: Intel Corporation

Inventors: Simon Rubanovich, Thierry Pons, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30

CPC classification: G06F9/30014, G06F7/00, G06F7/483, G06F7/5443

Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
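
Illustrative note (not part of the patent record): the abstract does not say how the three exponents are combined, so the sketch below is only a conservative software analogue for double precision, not the patented circuitry. It uses the bound |a*b + c| < 2^(max(ea + eb, ec) + 1), where the exponents come from `math.frexp`. The function name `result_below_min_normal` is hypothetical.

```python
import math

MIN_NORMAL_POWER = -1022  # smallest normal double is 2**-1022


def result_below_min_normal(a, b, c):
    """True if the exponents alone prove |a*b + c| < 2**-1022.

    A True result means the value is subnormal or zero; False means the
    exponents are inconclusive (for example, because of cancellation).
    """
    neg_inf = float("-inf")
    # frexp returns (m, e) with 0.5 <= |m| < 1, so |x| < 2**e.
    product_exp = math.frexp(a)[1] + math.frexp(b)[1] if a and b else neg_inf
    addend_exp = math.frexp(c)[1] if c else neg_inf
    bound = max(product_exp, addend_exp) + 1
    return bound <= MIN_NORMAL_POWER
```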

87.

Granted invention patent
Floating point scaling processors, methods, systems, and instructions (In force)

Publication No.: US09921807B2

Publication date: 2018-03-20

Application No.: US15262609

Filing date: 2016-09-12

Applicant: Intel Corporation

Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan

IPC classification: G06G7/48, G06F7/483, G06F9/30

CPC classification: G06F7/483, G06F9/30014, G06F9/30036

Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
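
Illustrative note (not part of the patent record): the per-element operation in this abstract can be sketched as below. Two details are assumptions, since the abstract does not fix them: the base is taken to be 2, and the "integer representative" of a first-source element is taken to be its floor. The function name `scale_elements` is hypothetical.

```python
import math


def scale_elements(first_source, second_source):
    """Return second[i] * 2**floor(first[i]) for each element pair."""
    return [
        math.ldexp(y, math.floor(x))  # y * 2**floor(x), computed exactly
        for x, y in zip(first_source, second_source)
    ]
```

For example, a first-source element of 3.7 scales its partner by 2**3, and -1.2 scales its partner by 2**-2.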

88.

Invention application
FLOATING POINT (FP) ADD LOW INSTRUCTIONS FUNCTIONAL UNIT (In force)

Publication No.: US20170185377A1

Publication date: 2017-06-29

Application No.: US14998366

Filing date: 2015-12-23

Applicant: Intel Corporation

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber

IPC classification: G06F7/485

CPC classification: G06F7/485

Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.

89.

Invention application
METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE VECTOR PACKED TUPLE CROSS-COMPARISON FUNCTIONALITY (Under examination, published)

Publication No.: US20160188336A1

Publication date: 2016-06-30

Application No.: US14588247

Filing date: 2014-12-31

Applicant: Intel Corporation

Inventors: Robert Valentine, Christopher J. Hughes, Mark J. Charney, Zeev Sperber, Amit Gradstein, Simon Rubanovich, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil

IPC classification: G06F9/30

CPC classification: G06F9/30036, G06F9/30018, G06F9/30021, G06F9/3834

Abstract: Instructions and logic provide SIMD vector packed tuple cross-comparison functionality. Some processor embodiments include first and second registers with a variable plurality of data fields, each of the data fields to store an element of a first data type. The processor executes a SIMD instruction for vector packed tuple cross-comparison in some embodiments, which for each data field of a portion of data fields in a tuple of the first register, compares its corresponding element with every element of a corresponding portion of data fields in a tuple of the second register and sets a mask bit corresponding to each element of the second register portion, in a bit-mask corresponding to each unmasked element of the corresponding first register portion, according to the corresponding comparison. In some embodiments bit-masks are shifted by corresponding elements in data fields of a third register. The comparison type is indicated by an immediate operand.
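
Illustrative note (not part of the patent record): the core cross-comparison in this abstract can be sketched as below. Each element of a tuple in the first register is compared with every element of the corresponding tuple in the second register, producing one bit-mask per first-register element. The immediate encoding in `COMPARISONS` and the name `tuple_cross_compare` are hypothetical, and the element masking and third-register shifts mentioned in the abstract are left out.

```python
import operator

# Hypothetical immediate encoding; the abstract does not define one.
COMPARISONS = {0: operator.eq, 1: operator.lt, 2: operator.le}


def tuple_cross_compare(first, second, tuple_size, imm):
    """Return one bit-mask per element of `first`.

    Bit j of a mask is set when the element compares true against
    element j of the matching tuple in `second`.
    """
    compare = COMPARISONS[imm]
    masks = []
    for base in range(0, len(first), tuple_size):
        second_tuple = second[base:base + tuple_size]
        for a in first[base:base + tuple_size]:
            mask = 0
            for j, b in enumerate(second_tuple):
                if compare(a, b):
                    mask |= 1 << j
            masks.append(mask)
    return masks
```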


90.

Invention application
VECTOR MASK DRIVEN CLOCK GATING FOR POWER EFFICIENCY OF A PROCESSOR (Under examination, published)

Publication No.: US20150220345A1

Publication date: 2015-08-06

Application No.: US13997791

Filing date: 2012-12-19

Applicant: INTEL CORPORATION

Inventors: Jesus Corbal, Dennis R. Bradford, Jonathan C. Hall, Thomas D. Fletcher, Brian J. Hickmann, Dror Markovich, Amit Gradstein

IPC classification: G06F9/38, G06F9/30

CPC classification: G06F9/3836, G06F1/3243, G06F1/329, G06F9/3001, G06F9/30036, Y02D10/152, Y02D10/24

Abstract: A processor includes an instruction schedule and dispatch (schedule/dispatch) unit to receive a single instruction multiple data (SIMD) instruction to perform an operation on multiple data elements stored in a storage location indicated by a first source operand. The instruction schedule/dispatch unit is to determine a first of the data elements that will not be operated to generate a result written to a destination operand based on a second source operand. The processor further includes multiple processing elements coupled to the instruction schedule/dispatch unit to process the data elements of the SIMD instruction in a vector manner, and a power management unit coupled to the instruction schedule/dispatch unit to reduce power consumption of a first of the processing elements configured to process the first data element.
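
Illustrative note (not part of the patent record): the idea of skipping lanes whose results are never written can be modeled in software as below. The sketch assumes the second source operand is a per-lane write mask and that masked-off lanes keep the old destination value; both are assumptions, as are the names `masked_vector_add` and `gated_lanes`. It only reports which lanes a power manager could gate and does not model any hardware.

```python
def masked_vector_add(a, b, dest, mask):
    """Add a and b in lanes whose mask bit is 1; list the idle lanes."""
    result = list(dest)
    gated_lanes = []
    for lane in range(len(a)):
        if (mask >> lane) & 1:
            result[lane] = a[lane] + b[lane]  # lane is active
        else:
            gated_lanes.append(lane)          # lane does no work
    return result, gated_lanes
```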

