Patent search ap:("Intel Corporation") AND inv:"Warren E. Ferguson" Page 1

1.

Granted invention patent
Systems, apparatuses, and methods for chained fused multiply add (in force)

Publication number: US10853065B2

Publication date: 2020-12-01

Application number: US16169456

Application date: 2018-10-24

Applicant: Intel Corporation

Inventor: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson

IPC: G06F9/30, G06F7/544, G06F7/483

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
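
The iteration described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function name `chained_fma`, the use of Python lists as stand-ins for packed registers, and the one-scalar-per-source pairing are assumptions made for illustration only.

```python
from typing import List, Sequence


def chained_fma(
    initial: Sequence[float],
    sources: Sequence[Sequence[float]],
    scalar_sub_elements: Sequence[float],
) -> List[float]:
    """Hypothetical model of a chained packed multiply-accumulate.

    initial             -- starting value of the destination (one value per lane)
    sources             -- the packed source operands, one list per iteration
    scalar_sub_elements -- sub-elements of the scalar value, one per iteration
    """
    if len(sources) != len(scalar_sub_elements):
        raise ValueError("need one scalar sub-element per packed source")

    acc = list(initial)  # the first iteration adds to the initial value
    for src, s in zip(sources, scalar_sub_elements):
        if len(src) != len(acc):
            raise ValueError("all packed operands must have the same lane count")
        # Each lane: acc = src * s + acc. Later iterations reuse the previous
        # result. Note: plain Python arithmetic rounds twice here, whereas a
        # hardware fused multiply-add rounds once.
        acc = [x * s + a for x, a in zip(src, acc)]
    return acc


result = chained_fma(
    initial=[1.0, 2.0],
    sources=[[1.0, 1.0], [2.0, 3.0]],
    scalar_sub_elements=[10.0, 100.0],
)
print(result)  # [211.0, 312.0]
```

With two sources and two scalar sub-elements, the first iteration yields [11.0, 12.0] and the second adds 2.0 * 100.0 and 3.0 * 100.0 to those lanes.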

2.

Invention patent application
APPARATUS AND METHOD FOR PERFORMING A CHECK TO OPTIMIZE INSTRUCTION FLOW (in force)

Publication number: US20160179515A1

Publication date: 2016-06-23

Application number: US14581815

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: Jesus Corbal San Adrian, Robert N. Hanek, Warren E. Ferguson, Taraneh Bahrami, Avi A. Tevet, Dennis R. Bradford, Michael Ferry, Jingwei Zhang

IPC: G06F9/30

CPC classification number: G06F9/3001, G06F9/30014, G06F9/30076, G06F9/30145, G06F9/30181, G06F9/3861, G06F9/4552, G06F9/45525

Abstract: An apparatus and method for performing a check on inputs to a mathematical instruction and selecting a default sequence efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: an arithmetic logic unit (ALU) to perform a plurality of mathematical operations using one or more source operands; instruction check logic to evaluate the source operands for a current mathematical instruction and to determine, based on the evaluation, whether to execute a default sequence of operations including executing the current mathematical instruction by the ALU or to jump to an alternate sequence of operations adapted to provide a result for the mathematical instruction having particular types of source operands more efficiently than the default sequence of operations.
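
The check-then-dispatch idea in the abstract can be sketched in software as follows. The abstract does not say which operand types trigger the alternate sequence or which instruction is involved, so the choice of multiplication, the "special operand" test (zero, infinity, NaN), and all function names below are assumptions for illustration only.

```python
import math
from typing import Tuple


def is_special(x: float) -> bool:
    """Assumed check: treat zero, infinity, and NaN as 'particular' operand types."""
    return x == 0.0 or math.isinf(x) or math.isnan(x)


def default_multiply(a: float, b: float) -> float:
    """Default sequence: execute the mathematical instruction itself."""
    return a * b


def alternate_multiply(a: float, b: float) -> float:
    """Alternate sequence: derive the result from operand classes alone.

    Only called when at least one operand is zero, infinite, or NaN.
    """
    if math.isnan(a) or math.isnan(b):
        return math.nan
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    if a == 0.0 or b == 0.0:
        if math.isinf(a) or math.isinf(b):
            return math.nan  # zero times infinity is invalid
        return math.copysign(0.0, sign)
    return math.copysign(math.inf, sign)  # infinity times a nonzero finite value


def execute_multiply(a: float, b: float) -> Tuple[str, float]:
    """Evaluate the source operands, then pick the default or alternate sequence."""
    if is_special(a) or is_special(b):
        return "alternate", alternate_multiply(a, b)
    return "default", default_multiply(a, b)


print(execute_multiply(2.0, 3.0))        # ('default', 6.0)
print(execute_multiply(math.inf, -2.0))  # ('alternate', -inf)
```

Returning the chosen path alongside the result is only a convenience for inspecting the dispatch decision; it is not part of what the abstract describes.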

3.

Granted invention patent
Systems, apparatuses, and methods for chained fused multiply add (in force)

Publication number: US12073214B2

Publication date: 2024-08-27

Application number: US17952001

Application date: 2022-09-23

Applicant: Intel Corporation

Inventor: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson

IPC: G06F9/30, G06F7/483, G06F7/544, G06F9/38

CPC classification number: G06F9/3001, G06F7/483, G06F7/5443, G06F9/30036, G06F9/30109, G06F9/30112, G06F9/3893

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

4.

Granted invention patent
Systems, apparatuses, and methods for chained fused multiply add (in force)

Publication number: US11487541B2

Publication date: 2022-11-01

Application number: US17107134

Application date: 2020-11-30

Applicant: Intel Corporation

Inventor: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson

IPC: G06F9/30, G06F7/544, G06F7/483, G06F9/38

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

5.

Granted invention patent
Systems, apparatuses, and methods for chained fused multiply add (in force)

Publication number: US10146535B2

Publication date: 2018-12-04

Application number: US15299420

Application date: 2016-10-20

Applicant: Intel Corporation

Inventor: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson

IPC: G06F9/30, G06F7/544

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

6.

Granted invention patent
Apparatus and method for performing a check to optimize instruction flow (in force)

Publication number: US09696992B2

Publication date: 2017-07-04

Application number: US14581815

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: Jesus Corbal San Adrian, Robert N. Hanek, Warren E. Ferguson, Taraneh Bahrami, Avi A. Tevet, Dennis R. Bradford, Michael Ferry, Jingwei Zhang

IPC: G06F7/38, G06F9/30, G06F9/38, G06F9/455

CPC classification number: G06F9/3001, G06F9/30014, G06F9/30076, G06F9/30145, G06F9/30181, G06F9/3861, G06F9/4552, G06F9/45525

Abstract: An apparatus and method for performing a check on inputs to a mathematical instruction and selecting a default sequence efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: an arithmetic logic unit (ALU) to perform a plurality of mathematical operations using one or more source operands; instruction check logic to evaluate the source operands for a current mathematical instruction and to determine, based on the evaluation, whether to execute a default sequence of operations including executing the current mathematical instruction by the ALU or to jump to an alternate sequence of operations adapted to provide a result for the mathematical instruction having particular types of source operands more efficiently than the default sequence of operations.
