Patent search ap:("Intel Corporation") AND inv:"Sagi Meller" Page 1

1.

发明授权
Accelerator systems and methods for matrix operations 有权

公开(公告)号：US10942738B2

公开(公告)日：2021-03-09

申请号：US16368973

申请日：2019-03-29

Applicant: Intel Corporation

Inventor： Zeev Sperber , Amit Gradstein , Simon Rubanovich , Igor Yanover , Gavri Berger , Eyal Hadas , Saeed Kharouf , Ron Schneider , Sagi Meller , Jose Yallouz

IPC: G06F9/30 , G06F9/38

Abstract: The present disclosure is directed to systems and methods for performing one or more operations on a two dimensional tile register using an accelerator that includes a tiled matrix multiplication unit (TMU). The processor circuitry includes reservation station (RS) circuitry to communicatively couple the processor circuitry to the TMU. The RS circuitry coordinates the operations performed by the TMU. TMU dispatch queue (TDQ) circuitry in the TMU maintains the operations received from the RS circuitry in the order that the operations are received from the RS circuitry. Since the duration of each operation is not known prior to execution by the TMU, the RS circuitry maintains shadow dispatch queue (RS-TDQ) circuitry that mirrors the operations in the TDQ circuitry. Communication between the RS circuitry 134 and the TMU provides the RS circuitry with notification of successfully executed operations and allows the RS circuitry to cancel operations where the operations are associated with branch mispredictions and/or non-retired speculatively executed instructions.

2.

发明授权
Apparatuses, methods, and systems for instructions of a matrix operations accelerator 有权

公开(公告)号：US12204605B2

公开(公告)日：2025-01-21

申请号：US18360793

申请日：2023-07-27

Applicant: Intel Corporation

Inventor： Amit Gradstein , Simon Rubanovich , Sagi Meller , Saeed Kharouf , Gavri Berger , Zeev Sperber , Jose Yallouz , Ron Schneider

IPC: G06F17/16 , G06F9/30 , G06F9/38 , G06F9/34

Abstract: Systems, methods, and apparatuses relating to a matrix operations accelerator are described. In one embodiment, a processor includes a matrix operations accelerator circuit that includes a two-dimensional grid of fused multiply accumulate circuits that is switchable to a scheduling mode for execution of a decoded single instruction where the matrix operations accelerator circuit loads a first buffer of the two-dimensional grid of fused multiply accumulate circuits from a first plurality of registers that represents a first input two-dimensional matrix, checks if a second buffer of the two-dimensional grid of fused multiply accumulate circuits stores an immediately prior input two-dimension matrix that is the same as a second input two-dimensional matrix from a second plurality of registers that represents the first input two-dimensional matrix, and when the second buffer of the two-dimensional grid of fused multiply accumulate circuits stores the immediately prior input two-dimension matrix, from execution of a previous instruction, that is the same as the second input two-dimensional matrix: prevents reclamation of the second buffer between execution of the previous instruction and the decoded single instruction, performs an operation on the first input two-dimensional matrix from the first buffer and the immediately prior input two-dimension matrix from the second buffer to produce a resultant, and stores the resultant in resultant storage, and when the second buffer of the two-dimensional grid of fused multiply accumulate circuits does not store the immediately prior input two-dimension matrix, from execution of the previous instruction, that is the same as the second input two-dimensional matrix: loads the second input two-dimensional matrix into the second buffer of the two-dimensional grid of fused multiply accumulate circuits, performs the operation on the first input two-dimensional matrix from the first buffer and the second input two-dimension matrix from the second buffer to produce a resultant, and stores the resultant in the resultant storage.

3.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR TRANSPOSE INSTRUCTIONS OF A MATRIX OPERATIONS ACCELERATOR 审中-公开

公开(公告)号：US20200310803A1

公开(公告)日：2020-10-01

申请号：US16370894

申请日：2019-03-30

Applicant: Intel Corporation

Inventor： Amit Gradstein , Simon Rubanovich , Sagi Meller , Zeev Sperber , Jose Yallouz , Robert Valentine

IPC: G06F9/30 , G06F9/38

Abstract: Systems, methods, and apparatuses relating to a matrix operations accelerator are described. In one embodiment, a processor includes a matrix operations accelerator circuit that includes a two-dimensional grid of fused multiply accumulate circuits; a first plurality of registers that represents an input two-dimensional matrix coupled to the matrix operations accelerator circuit; a decoder, of a core coupled to the matrix operations accelerator circuit, to decode an instruction into a decoded instruction; and an execution circuit of the core to execute the decoded instruction to cause the two-dimensional grid of fused multiply accumulate circuits to form a transpose of the input two-dimensional matrix when the matrix operations accelerator circuit is in a transpose mode.

4.

发明授权
Apparatus and method for power virus protection in a processor 有权

公开(公告)号：US12099597B2

公开(公告)日：2024-09-24

申请号：US18384996

申请日：2023-10-30

Applicant: Intel Corporation

Inventor： Alexander Gendler , Sagi Meller , Gavri Berger , Igor Yanover

IPC: G06F21/00 , G06F1/32 , G06F9/38 , G06F21/54 , G06F21/56

CPC classification number: G06F21/54 , G06F1/32 , G06F9/3802 , G06F21/561 , G06F2221/034

Abstract: An apparatus and method for intelligent power virus protection in a processor. For example, one embodiment of a processor comprises: first circuitry including an instruction fetch circuit to fetch instructions, each instruction comprising an instruction type and an associated width comprising a number of bits associated with source and/or destination operand values associated with the instruction; detection circuitry to detect one or more instructions of a particular type and/or width; evaluation circuitry to evaluate an impact of power virus protection (PVP) circuitry when executing the one or more instructions based on the detected instruction types and/or widths; and control circuitry, based on the evaluation, to configure the PVP circuitry in accordance with the evaluation performed by the evaluation circuitry.

5.

发明授权
Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator 有权

公开(公告)号：US10990397B2

公开(公告)日：2021-04-27

申请号：US16370894

申请日：2019-03-30

Applicant: Intel Corporation

Inventor： Amit Gradstein , Simon Rubanovich , Sagi Meller , Zeev Sperber , Jose Yallouz , Robert Valentine

IPC: G06F9/30 , G06F9/38 , G06F17/16

Abstract: Systems, methods, and apparatuses relating to a matrix operations accelerator are described. In one embodiment, a processor includes a matrix operations accelerator circuit that includes a two-dimensional grid of fused multiply accumulate circuits; a first plurality of registers that represents an input two-dimensional matrix coupled to the matrix operations accelerator circuit; a decoder, of a core coupled to the matrix operations accelerator circuit, to decode an instruction into a decoded instruction; and an execution circuit of the core to execute the decoded instruction to cause the two-dimensional grid of fused multiply accumulate circuits to form a transpose of the input two-dimensional matrix when the matrix operations accelerator circuit is in a transpose mode.

6.

发明授权
Apparatus and method for power virus protection in a processor 有权

公开(公告)号：US11809549B2

公开(公告)日：2023-11-07

申请号：US16728843

申请日：2019-12-27

Applicant: Intel Corporation

Inventor： Alexander Gendler , Sagi Meller , Gavri Berger , Igor Yanover

IPC: G06F21/00 , G06F21/54 , G06F1/32 , G06F21/56 , G06F9/38

CPC classification number: G06F21/54 , G06F1/32 , G06F9/3802 , G06F21/561 , G06F2221/034

Abstract: An apparatus and method for intelligent power virus protection in a processor. For example, one embodiment of a processor comprises: first circuitry including an instruction fetch circuit to fetch instructions, each instruction comprising an instruction type and an associated width comprising a number of bits associated with source and/or destination operand values associated with the instruction; detection circuitry to detect one or more instructions of a particular type and/or width; evaluation circuitry to evaluate an impact of power virus protection (PVP) circuitry when executing the one or more instructions based on the detected instruction types and/or widths; and control circuitry, based on the evaluation, to configure the PVP circuitry in accordance with the evaluation performed by the evaluation circuitry.

7.

发明授权
Apparatuses, methods, and systems for instructions of a matrix operations accelerator 有权

公开(公告)号：US11714875B2

公开(公告)日：2023-08-01

申请号：US16729361

申请日：2019-12-28

Applicant: Intel Corporation

Inventor： Amit Gradstein , Simon Rubanovich , Sagi Meller , Saeed Kharouf , Gavri Berger , Zeev Sperber , Jose Yallouz , Ron Schneider

IPC: G06F17/16 , G06F9/30 , G06F9/38 , G06F9/34

CPC classification number: G06F17/16 , G06F9/3001 , G06F9/30036 , G06F9/3851 , G06F9/34

Abstract: Systems, methods, and apparatuses relating to a matrix operations accelerator are described. In one embodiment, a processor includes a matrix operations accelerator circuit that includes a two-dimensional grid of fused multiply accumulate circuits that is switchable to a scheduling mode for execution of a decoded single instruction where the matrix operations accelerator circuit loads a first buffer of the two-dimensional grid of fused multiply accumulate circuits from a first plurality of registers that represents a first input two-dimensional matrix, checks if a second buffer of the two-dimensional grid of fused multiply accumulate circuits stores an immediately prior input two-dimension matrix that is the same as a second input two-dimensional matrix from a second plurality of registers that represents the first input two-dimensional matrix, and when the second buffer of the two-dimensional grid of fused multiply accumulate circuits stores the immediately prior input two-dimension matrix, from execution of a previous instruction, that is the same as the second input two-dimensional matrix: prevents reclamation of the second buffer between execution of the previous instruction and the decoded single instruction, performs an operation on the first input two-dimensional matrix from the first buffer and the immediately prior input two-dimension matrix from the second buffer to produce a resultant, and stores the resultant in resultant storage, and when the second buffer of the two-dimensional grid of fused multiply accumulate circuits does not store the immediately prior input two-dimension matrix, from execution of the previous instruction, that is the same as the second input two-dimensional matrix: loads the second input two-dimensional matrix into the second buffer of the two-dimensional grid of fused multiply accumulate circuits, performs the operation on the first input two-dimensional matrix from the first buffer and the second input two-dimension matrix from the second buffer to produce a resultant, and stores the resultant in the resultant storage.

8.

发明申请
ACCELERATOR SYSTEMS AND METHODS FOR MATRIX OPERATIONS 审中-公开

公开(公告)号：US20200310794A1

公开(公告)日：2020-10-01

申请号：US16368973

申请日：2019-03-29

Applicant: Intel Corporation

Inventor： ZEEV SPERBER , Amit Gradstein , Simon Rubanovich , Igor Yanover , Gavri Berger , Eyal Hadas , Saeed Kharouf , Ron Schneider , Sagi Meller , Jose Yallouz

IPC: G06F9/30 , G06F9/38

Abstract: The present disclosure is directed to systems and methods for performing one or more operations on a two dimensional tile register using an accelerator that includes a tiled matrix multiplication unit (TMU). The processor circuitry includes reservation station (RS) circuitry to communicatively couple the processor circuitry to the TMU. The RS circuitry coordinates the operations performed by the TMU. TMU dispatch queue (TDQ) circuitry in the TMU maintains the operations received from the RS circuitry in the order that the operations are received from the RS circuitry. Since the duration of each operation is not known prior to execution by the TMU, the RS circuitry maintains shadow dispatch queue (RS-TDQ) circuitry that mirrors the operations in the TDQ circuitry. Communication between the RS circuitry 134 and the TMU provides the RS circuitry with notification of successfully executed operations and allows the RS circuitry to cancel operations where the operations are associated with branch mispredictions and/or non-retired speculatively executed instructions.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification