Patent search ap:("Arm Limited") AND inv:"Dibakar GOPE" Page 1

1.

Invention publication
NEURAL PROCESSING UNIT FOR ATTENTION-BASED INFERENCE (Status: under examination, published)

Publication number: US20240028877A1

Publication date: 2024-01-25

Application number: US17870038

Filing date: 2022-07-21

Applicant: Arm Limited

Inventors: Shounak DATTA, Dibakar GOPE, Jesse Garrett BEU, Mark John O'CONNOR

IPC: G06N3/063

CPC classification number: G06N3/063

Abstract: There is provided a neural processing unit for calculating an attention matrix during machine learning inference. The neural processing unit is configured to calculate: a first score matrix based on differences between a query matrix and a key matrix; a second score matrix based on differences between the key matrix and a learned key matrix; a similarity matrix based on a combination of the first score matrix and second score matrix; and an attention matrix comprising applying a normalisation function to the similarity matrix. Also provided is an apparatus comprising at least one said neural processing unit and at least one memory, the memory configured to pass, on demand, a learned key matrix to the neural processing unit. Also provided is a computer program product having computer readable program code stored thereon which, when executed by said neural processing unit, causes the unit to perform said calculations.
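The four calculations named in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the method claimed in the patent: the abstract does not say how "differences" are measured, how the two score matrices are combined, or which normalisation function is used. The sketch assumes a negative L1 (sum of absolute differences) distance, combination by addition, a learned key matrix with the same shape as the key matrix, and a row-wise softmax. The names `neg_l1`, `softmax`, `attention_matrix` and `K_learned` are hypothetical.

```python
import math


def neg_l1(a, b):
    """Negative sum of absolute differences (assumed difference measure)."""
    return -sum(abs(x - y) for x, y in zip(a, b))


def softmax(row):
    """Numerically stable softmax (assumed normalisation function)."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(Q, K, K_learned):
    """Q: n x d queries, K: m x d keys, K_learned: m x d learned keys (assumed shape)."""
    # First score matrix: differences between each query row and each key row.
    first = [[neg_l1(q, k) for k in K] for q in Q]
    # Second score: differences between each key row and its learned counterpart,
    # giving one value per key that is reused for every query (an assumption).
    second = [neg_l1(k, kl) for k, kl in zip(K, K_learned)]
    # Similarity matrix: combination of the two scores (assumed to be a sum).
    similarity = [[s + second[j] for j, s in enumerate(row)] for row in first]
    # Attention matrix: normalisation applied to each row of the similarity matrix.
    return [softmax(row) for row in similarity]


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K_learned = [[1.0, 0.0], [0.0, 0.5], [0.5, 1.0]]
A = attention_matrix(Q, K, K_learned)
```

Each row of `A` sums to 1, and each query places its largest weight on the key closest to it. Because the assumed scores use only subtractions, absolute values and additions, no multiplications are needed before the normalisation step; whether the patent relies on that property is not stated in the abstract.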

2.

Invention application
MIXED-ELEMENT-SIZE INSTRUCTION (Status: granted)

Publication number: US20210389948A1

Publication date: 2021-12-16

Application number: US16897483

Filing date: 2020-06-10

Applicant: Arm Limited

Inventors: Jesse Garrett BEU, Dibakar GOPE, David Hennah MANSELL

IPC: G06F9/30

Abstract: A mixed-element-size instruction is described, which specifies a first operand and a second operand stored in registers. In response to the mixed-element-size instruction, an instruction decoder controls processing circuitry to perform an arithmetic/logical operation on two or more first data elements of the first operand and two or more second data elements of the second operand, where the first data elements have a larger data element size than the second data elements. This is particularly useful for machine learning applications to improve processing throughput and memory bandwidth utilisation.
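A software emulation may help make the idea of operands with different element sizes concrete. The sketch below is an assumption-laden illustration, not the instruction defined in the patent: the abstract does not name the operation or the element widths. Here the operation is assumed to be a multiply-accumulate (dot product), the first operand is assumed to hold four signed 8-bit elements and the second four signed 4-bit elements. The helpers `pack`, `unpack` and `mixed_size_dot` are hypothetical names.

```python
def pack(elems, elem_bits):
    """Pack signed integers into one register value, element 0 in the low bits."""
    mask = (1 << elem_bits) - 1
    value = 0
    for i, e in enumerate(elems):
        value |= (e & mask) << (i * elem_bits)
    return value


def unpack(value, elem_bits, count):
    """Extract `count` signed two's-complement elements of `elem_bits` each."""
    mask = (1 << elem_bits) - 1
    out = []
    for i in range(count):
        e = (value >> (i * elem_bits)) & mask
        if e >= 1 << (elem_bits - 1):  # sign bit set, so convert to negative
            e -= 1 << elem_bits
        out.append(e)
    return out


def mixed_size_dot(op1, op2, count=4, bits1=8, bits2=4):
    """Assumed operation: dot product of larger (bits1) and smaller (bits2) elements."""
    first = unpack(op1, bits1, count)   # e.g. 8-bit activations
    second = unpack(op2, bits2, count)  # e.g. 4-bit quantised weights
    return sum(x * y for x, y in zip(first, second))


activations = pack([10, -3, 7, 100], 8)  # occupies 32 bits
weights = pack([1, -2, 3, -8], 4)        # occupies only 16 bits
result = mixed_size_dot(activations, weights)
```

In this example the second operand needs half the storage of the first for the same number of elements, which is one way to read the abstract's remark about memory bandwidth utilisation.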
