Publication No.: US20250060938A1
Publication Date: 2025-02-20
Application No.: US18449381
Filing Date: 2023-08-14
Applicant: NVIDIA Corporation
Inventor: Jack CHOQUETTE, Po-An TSAI, Alexander L. MINKIN, Manan PATEL, Neal Clayton CRAGO, Daniel STIFFLER, Kefeng DUAN, Yu-Jung CHEN, Jing LI, Qian WANG, Ronny KRASHINSKY, Jun YANG, Feng XIE
Abstract: Systems and methods for efficient convolution based on matrix multiply and add (MMA) are described. An example processor having a plurality of processing lanes is configured to perform convolution of a matrix of activation elements and a filter matrix in accordance with a configurable series of instructions including a plurality of MMA instructions and shift instructions while reusing activation elements already loaded to the datapath or associated memory over a plurality of MMA operations. Associated methods are also described.
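The general idea in the abstract, a convolution built from repeated matrix multiply-and-add steps with a shift between them so that already-loaded activations are reused, can be illustrated with a minimal software sketch. This is not the claimed hardware or instruction sequence. It assumes a 1D "valid" convolution (cross-correlation form) with one small matrix per filter tap, and the names `mma`, `shift`, and `conv1d_mma` are hypothetical helpers introduced only for illustration.

```python
def mma(acc, a, b):
    """Return acc + a @ b for small list-of-lists matrices (illustrative MMA step)."""
    inner, cols = len(b), len(b[0])
    return [
        [acc[i][j] + sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def shift(window_start):
    """Advance the window over the already-loaded activations by one row.

    Hypothetical stand-in for a shift instruction: no new activations are loaded.
    """
    return window_start + 1


def conv1d_mma(activations, filters):
    """1D valid convolution as one MMA per filter tap.

    activations: L rows, each with C_in values (loaded once, then only re-windowed).
    filters:     R taps, each a C_in x C_out matrix.
    Returns an (L - R + 1) x C_out matrix.
    """
    taps = len(filters)
    out_len = len(activations) - taps + 1
    c_out = len(filters[0][0])
    acc = [[0] * c_out for _ in range(out_len)]
    start = 0
    for tap in range(taps):
        # Same activation buffer every iteration; only the window position changes.
        window = activations[start:start + out_len]
        acc = mma(acc, window, filters[tap])
        start = shift(start)
    return acc


def conv1d_reference(activations, filters):
    """Direct triple-loop convolution, used only to check the sketch above."""
    taps, c_in, c_out = len(filters), len(filters[0]), len(filters[0][0])
    out_len = len(activations) - taps + 1
    return [
        [
            sum(
                activations[i + r][c] * filters[r][c][o]
                for r in range(taps)
                for c in range(c_in)
            )
            for o in range(c_out)
        ]
        for i in range(out_len)
    ]


if __name__ == "__main__":
    acts = [[1, 2], [3, 4], [5, 6], [7, 8]]   # L = 4, C_in = 2
    filt = [[[1], [0]], [[0], [1]]]           # R = 2 taps, C_out = 1
    print(conv1d_mma(acts, filt))             # [[5], [9], [13]]
```

In this sketch each activation row is read from the loaded buffer by several MMA steps, which is the reuse the abstract refers to; how the actual processor lanes, datapath, and shift instructions realize it is described in the application itself, not here.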