-
1.
Publication number: US20240169018A1
Publication date: 2024-05-23
Application number: US17989448
Application date: 2022-11-17
Applicant: QUALCOMM Incorporated
Inventor: Hamza OMAR, Engin IPEK, Bohuslav RYCHLIK, Luca MARONCELLI
CPC classification number: G06F17/16, G06F7/5443
Abstract: An apparatus, including: a memory; a matrix multiplier engine, comprising: an array of multiplier-accumulate units (MAUs) comprising: a first set of accumulators; and a second set of accumulators; and a controller configured to concurrently: cause a first set of resultant values in the first set of accumulators to be transferred to the memory pursuant to a first set of store instructions, wherein the first set of resultant values was generated pursuant to a first set of multiply-accumulate (MAC) operations performed by the set of multipliers and the first set of accumulators; and cause the set of multipliers and the second set of accumulators to perform a second set of MAC operations.
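The overlap described in this abstract resembles a double-buffered ("ping-pong") accumulator scheme. The following is a minimal software sketch of that idea, not the claimed hardware: the function name `matmul_ping_pong`, the `tile` parameter, and the `banks`/`pending` bookkeeping are all illustrative assumptions, and the two steps that hardware would run concurrently are executed one after the other here.

```python
def matmul_ping_pong(a, b, tile=2):
    """Tiled matrix multiply using two alternating accumulator banks.

    Hypothetical model: while one bank accumulates the current output tile,
    the other bank's finished tile is stored to memory. In this sketch the
    two steps run sequentially; hardware would overlap them in time.
    """
    m, k, n = len(a), len(b), len(b[0])
    memory = [[0] * n for _ in range(m)]  # stands in for "the memory"
    banks = [[[0] * tile for _ in range(tile)] for _ in range(2)]
    pending = None  # (bank, row0, col0, rows, cols) awaiting a store
    active = 0      # index of the bank currently accumulating

    def store(bank, i0, j0, rows, cols):
        # Models the store instructions that move results to memory.
        for r in range(rows):
            for c in range(cols):
                memory[i0 + r][j0 + c] = banks[bank][r][c]

    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            rows, cols = min(tile, m - i0), min(tile, n - j0)
            acc = banks[active]
            # Step A: multiply-accumulate into the active bank.
            for r in range(rows):
                for c in range(cols):
                    acc[r][c] = 0
                    for p in range(k):
                        acc[r][c] += a[i0 + r][p] * b[p][j0 + c]
            # Step B: store the previous tile held in the other bank.
            if pending is not None:
                store(*pending)
            pending = (active, i0, j0, rows, cols)
            active ^= 1  # swap the roles of the two banks
    if pending is not None:
        store(*pending)  # drain the last finished tile
    return memory
```

Because a bank is only overwritten after its previous tile has been stored, the result matches an ordinary matrix product regardless of the tile size chosen.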
-
Publication number: US20240338555A1
Publication date: 2024-10-10
Application number: US18298029
Application date: 2023-04-10
Applicant: QUALCOMM Incorporated
Inventor: Elina KAMENETSKAYA, Amir MOMENI, Hamza OMAR, Engin IPEK, Alexei Vladimirovich BOURD, Zifeng LI
IPC: G06N3/063
CPC classification number: G06N3/063
Abstract: Aspects of the disclosure are directed to concurrent tensor processing with multiple processing engines. In accordance with one aspect, an apparatus including a common memory unit; a first processing engine coupled to the common memory unit, wherein the first processing engine is configured to access a portion of an input tensor and a portion of a kernel tensor from the common memory unit; and a second processing engine coupled to the common memory unit, wherein the first processing engine is further configured to send the portion of the input tensor and the portion of the kernel tensor to the second processing engine and wherein the second processing engine is configured to generate a portion of an output tensor based on the portion of the input tensor and on the portion of the kernel tensor.
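The data flow in this abstract can be pictured with a small software model. This is a sketch under stated assumptions rather than the disclosed design: the class names `CommonMemory`, `FirstEngine`, and `SecondEngine`, the `dispatch` and `compute` methods, and the choice of a 1-D sliding dot product as the tensor operation are all hypothetical.

```python
class CommonMemory:
    """Hypothetical shared store holding the full input and kernel tensors."""

    def __init__(self, input_tensor, kernel_tensor):
        self.input_tensor = input_tensor    # 1-D list of samples
        self.kernel_tensor = kernel_tensor  # list of equal-width 1-D filters


class SecondEngine:
    """Produces a portion of the output tensor from the portions it is sent."""

    def compute(self, input_part, kernel_part):
        width = len(kernel_part[0])
        steps = len(input_part) - width + 1
        # One output row per filter: a sliding dot product over the input part.
        return [
            [sum(x * w for x, w in zip(input_part[s:s + width], filt))
             for s in range(steps)]
            for filt in kernel_part
        ]


class FirstEngine:
    """Reads tensor portions from common memory and forwards them."""

    def __init__(self, memory, peer):
        self.memory = memory
        self.peer = peer

    def dispatch(self, in_start, in_stop, k_start, k_stop):
        # Access a portion of each tensor from the common memory unit.
        input_part = self.memory.input_tensor[in_start:in_stop]
        kernel_part = self.memory.kernel_tensor[k_start:k_stop]
        # Send both portions to the second engine, which generates
        # the corresponding portion of the output tensor.
        return self.peer.compute(input_part, kernel_part)


memory = CommonMemory([1, 2, 3, 4, 5], [[1, 0], [0, 1], [1, 1]])
first = FirstEngine(memory, SecondEngine())
output_part = first.dispatch(0, 3, 1, 3)  # first 3 samples, last 2 filters
```

In this model only the first engine touches the common memory for the fetch; the second engine works purely on what it receives, which is one way to read the division of labor in the abstract.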
-