Patent search ap:("ADVANCED MICRO DEVICES INC.") AND inv:"Prerit Dak"

1.

Granted invention patent
Multi-accelerator compute dispatch (in force)

Publication number: US11790590B2

Publication date: 2023-10-17

Application number: US17218421

Filing date: 2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

IPC: G06T15/00, G06F9/54, G06T15/80

CPC classification number: G06T15/005, G06F9/545, G06T15/80

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
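
The flow in this abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Chiplet` class, the round-robin assignment policy, and the `notify_client` callback are all hypothetical names introduced here, and real chiplets would run concurrently rather than one after another.

```python
from dataclasses import dataclass, field


@dataclass
class Chiplet:
    """Hypothetical stand-in for one chiplet; not a real driver or hardware API."""
    chiplet_id: int
    # Ids of chiplets this one has been told are finished with the current packet.
    done_peers: set = field(default_factory=set)

    def run(self, workgroups, kernel):
        # Execute every workgroup assigned to this chiplet.
        return {wg: kernel(wg) for wg in workgroups}


def dispatch_packet(kernel, num_workgroups, chiplets):
    """Run one kernel dispatch packet; return (results, packet_done)."""
    for c in chiplets:
        c.done_peers = set()
    results = {}
    n = len(chiplets)
    for i, c in enumerate(chiplets):
        # Assumed policy: round-robin assignment of workgroup ids.
        results.update(c.run(range(i, num_workgroups, n), kernel))
        # This chiplet finished its share: record that on every chiplet,
        # itself included, to model the completion notification.
        for peer in chiplets:
            peer.done_peers.add(c.chiplet_id)
    all_ids = {c.chiplet_id for c in chiplets}
    # The packet is complete once every chiplet has heard from all chiplets.
    packet_done = all(c.done_peers == all_ids for c in chiplets)
    return results, packet_done


def run_queue(packets, chiplets, notify_client):
    """Process packets in order, notifying the client after each one completes."""
    for packet_id, (kernel, num_workgroups) in enumerate(packets):
        results, packet_done = dispatch_packet(kernel, num_workgroups, chiplets)
        if not packet_done:
            raise RuntimeError("packet did not complete")
        notify_client(packet_id, results)
```

Calling `run_queue` with a list of `(kernel, num_workgroups)` pairs processes the packets strictly in order, so the client hears about packet 0 before any workgroup of packet 1 starts, which mirrors the "proceeding to a subsequent kernel dispatch packet" step.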

2.

Invention application
MULTI-ACCELERATOR COMPUTE DISPATCH (in force)

Publication number: US20220319089A1

Publication date: 2022-10-06

Application number: US17218421

Filing date: 2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

IPC: G06T15/00, G06T15/80, G06F9/54

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

3.

Published invention application
MULTI-ACCELERATOR COMPUTE DISPATCH (under examination, published)

Publication number: US20240029336A1

Publication date: 2024-01-25

Application number: US18480466

Filing date: 2023-10-03

Applicant: Advanced Micro Devices, Inc.

Inventor: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

IPC: G06T15/00, G06F9/54, G06T15/80

CPC classification number: G06T15/005, G06F9/545, G06T15/80

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

4.

Granted invention patent
Multi-accelerator compute dispatch (in force)

Publication number: US12165252B2

Publication date: 2024-12-10

Application number: US18480466

Filing date: 2023-10-03

Applicant: Advanced Micro Devices, Inc.

Inventor: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

IPC: G06T15/00, G06F9/54, G06T15/80

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

5.

Granted invention patent
Fused convolution and batch normalization for neural networks (in force)

Publication number: US11573765B2

Publication date: 2023-02-07

Application number: US16219154

Filing date: 2018-12-13

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Milind N. Nemlekar, Prerit Dak

IPC: G06N3/04, G06F5/01, G06F17/16, G06N3/08

Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
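
The fusion described in this abstract can be sketched in plain Python. This is an illustration under assumptions, not the patented method: the convolution is modeled as a matrix product computed one row tile at a time, the "reduction" is taken to mean summing each tile's columns, and the "update" is taken to mean folding those partial sums into running per-channel statistics. The abstract does not define either phase in detail, and the names `matmul_tile`, `fused_conv_batchnorm`, `tile_rows`, and `eps` are hypothetical.

```python
def matmul_tile(a, b, r0, r1):
    """Rows r0..r1-1 of a @ b, standing in for one output-register tile."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(r0, r1)
    ]


def fused_conv_batchnorm(a, b, tile_rows=2, eps=1e-5):
    """Compute a @ b tile by tile, gathering batch-norm statistics per tile."""
    n_rows, n_cols = len(a), len(b[0])
    col_sum = [0.0] * n_cols   # running per-channel sum
    col_sq = [0.0] * n_cols    # running per-channel sum of squares
    out = []
    for r0 in range(0, n_rows, tile_rows):
        tile = matmul_tile(a, b, r0, min(r0 + tile_rows, n_rows))
        # Fused step: statistics are accumulated while the tile is still
        # "in the output register", so the output is not re-read later.
        for row in tile:
            for j, v in enumerate(row):
                col_sum[j] += v
                col_sq[j] += v * v
        out.extend(tile)
    mean = [s / n_rows for s in col_sum]
    # Clamp at zero to guard against small negative values from rounding.
    var = [max(q / n_rows - m * m, 0.0) for q, m in zip(col_sq, mean)]
    normalized = [
        [(v - mean[j]) / (var[j] + eps) ** 0.5 for j, v in enumerate(row)]
        for row in out
    ]
    return normalized, mean, var
```

The point of the design is that the mean and variance are available as soon as the last tile is produced, without a separate pass over the full output matrix to compute them.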
