Publication number: US20230140934A1
Publication date: 2023-05-11
Application number: US17689660
Filing date: 2022-03-08
Applicant: NVIDIA Corporation
Inventor: Chao Li, Jing Li, Alan Kaatz, Ronny Meir Krashinsky, Albert Xu
CPC classification number: G06F9/544, G06F9/52, G06F9/4843
Abstract: Apparatuses, systems, and techniques to perform a matrix multiplication using parallel processing. In at least one embodiment, a matrix multiplication is divided into a set of tiles, with each tile processed with a prolog task, a calculation task, and an epilog task. The prolog tasks are performed by a dedicated set of threads, with the remaining tasks performed in an interleaved manner using two or more thread groups.
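The division of work described in the abstract can be illustrated with a small CPU-side analogy. This is a minimal sketch and not the patented GPU implementation: it assumes one dedicated producer thread for the prolog tasks (staging each tile's operands) and two consumer threads standing in for the thread groups that take alternating tiles for the calculation and epilog tasks. The names `matmul_tiled`, `TILE`, `prolog_worker`, and `consumer_worker` are hypothetical and chosen only for illustration.

```python
import queue
import threading

TILE = 2  # hypothetical tile edge length, chosen for illustration


def matmul_tiled(a, b, num_groups=2):
    """Multiply a (m x k) by b (k x n), both given as lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    # One bounded queue per consumer group; the producer feeds them round-robin.
    queues = [queue.Queue(maxsize=2) for _ in range(num_groups)]

    def prolog_worker():
        # Prolog task (dedicated thread): stage the operands of each output tile.
        tile_index = 0
        for i0 in range(0, m, TILE):
            for j0 in range(0, n, TILE):
                rows = [a[i] for i in range(i0, min(i0 + TILE, m))]
                cols = [[b[p][j] for p in range(k)]
                        for j in range(j0, min(j0 + TILE, n))]
                queues[tile_index % num_groups].put((i0, j0, rows, cols))
                tile_index += 1
        for q in queues:
            q.put(None)  # sentinel: no more tiles for this group

    def consumer_worker(q):
        while True:
            item = q.get()
            if item is None:
                break
            i0, j0, rows, cols = item
            # Calculation task: dot products for every element of the tile.
            acc = [[sum(x * y for x, y in zip(row, col)) for col in cols]
                   for row in rows]
            # Epilog task: write the finished tile back to the output matrix.
            # Tiles cover disjoint elements of c, so no lock is needed here.
            for di, acc_row in enumerate(acc):
                for dj, value in enumerate(acc_row):
                    c[i0 + di][j0 + dj] = value

    threads = [threading.Thread(target=prolog_worker)]
    threads += [threading.Thread(target=consumer_worker, args=(q,))
                for q in queues]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c
```

Because the consumers receive alternating tiles, one group can be in its epilog while the other is still calculating, which is the interleaving the abstract refers to. On a GPU the same roles would presumably map to groups of threads within a block, synchronized through shared memory rather than queues; that mapping is an assumption and is not shown here.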