Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Fei Wang"

1.

Invention patent application
AUTO GENERATION AND TUNING TOOL FOR CONVOLUTION KERNELS (under examination, published)

Publication number: US20200302285A1

Publication date: 2020-09-24

Application number: US16367093

Filing date: 2019-03-27

Applicant: Advanced Micro Devices, Inc.

Inventor: Fei Wang, Jian Yang

IPC: G06N3/08, G06N20/10, G06T5/20, G06T5/50

Abstract: Systems, apparatuses, and methods for implementing an auto generation and tuning tool for convolution kernels are disclosed. A processor executes multiple tuning runs of a given layer of a neural network while using a different set of operating parameter values for each tuning run. The operating parameters can include one or more of input dataset fetch group size, output channel group size, and other parameters. The processor captures performance data for each tuning run and then after all tuning runs have finished, the processor determines which set of operating parameter values resulted in a better performance for the given neural network layer. The processor uses these operating parameter values for subsequent iterations of the given layer. The processor also performs the same techniques for other layers to determine which set of operating parameter values to use for each layer so as to maximize performance of the neural network.

2.

Granted invention patent
Auto generation and tuning tool for convolution kernels (in force)

Publication number: US11983624B2

Publication date: 2024-05-14

Application number: US16367093

Filing date: 2019-03-27

Applicant: Advanced Micro Devices, Inc.

Inventor: Fei Wang, Jian Yang

IPC: G06T5/50, G06N3/08, G06N20/10, G06T5/20

CPC classification number: G06N3/08, G06N20/10, G06T5/20, G06T5/50, G06T2207/20084

Abstract: Systems, apparatuses, and methods for implementing an auto generation and tuning tool for convolution kernels are disclosed. A processor executes multiple tuning runs of a given layer of a neural network while using a different set of operating parameter values for each tuning run. The operating parameters can include one or more of input dataset fetch group size, output channel group size, and other parameters. The processor captures performance data for each tuning run and then after all tuning runs have finished, the processor determines which set of operating parameter values resulted in a better performance for the given neural network layer. The processor uses these operating parameter values for subsequent iterations of the given layer. The processor also performs the same techniques for other layers to determine which set of operating parameter values to use for each layer so as to maximize performance of the neural network.
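
The tuning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name here (`tune_layer`, `tune_network`, `make_fake_layer`, the parameter keys) is hypothetical, the exhaustive sweep over all parameter combinations is an assumption (the abstract does not say how the sets of values are chosen), and a synthetic cost function stands in for actually timing a convolution kernel.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]
# A "layer" here is any callable that runs once with the given operating
# parameters and returns a cost (for example elapsed seconds; lower is better).
LayerRun = Callable[[Params], float]


def tune_layer(
    run_layer: LayerRun, search_space: Dict[str, List[int]]
) -> Tuple[Params, List[Tuple[Params, float]]]:
    """Execute one tuning run per parameter combination, then pick the best."""
    names = list(search_space)
    results: List[Tuple[Params, float]] = []
    for values in product(*(search_space[n] for n in names)):
        params = dict(zip(names, values))
        cost = run_layer(params)  # capture performance data for this run
        results.append((params, cost))
    # Only after all runs have finished, select the lowest-cost parameter set.
    best_params, _ = min(results, key=lambda r: r[1])
    return best_params, results


def tune_network(
    layers: Dict[str, LayerRun], search_space: Dict[str, List[int]]
) -> Dict[str, Params]:
    """Repeat the per-layer tuning for every layer of the network."""
    return {name: tune_layer(run, search_space)[0] for name, run in layers.items()}


def make_fake_layer(best_fetch: int, best_out: int) -> LayerRun:
    """Synthetic stand-in for a real kernel: cost grows with the distance
    from a made-up optimum. A real tool would time the kernel instead."""
    def run(params: Params) -> float:
        return (
            1.0
            + abs(params["input_fetch_group_size"] - best_fetch)
            + abs(params["output_channel_group_size"] - best_out)
        )
    return run


# Hypothetical search space; the two keys echo the parameters named in the abstract.
SEARCH_SPACE = {
    "input_fetch_group_size": [1, 2, 4, 8],
    "output_channel_group_size": [8, 16, 32],
}

layers = {"conv1": make_fake_layer(4, 16), "conv2": make_fake_layer(8, 32)}
chosen = tune_network(layers, SEARCH_SPACE)
for layer_name, params in chosen.items():
    print(layer_name, params)
```

The returned dictionary maps each layer to the parameter set that would be reused for subsequent iterations of that layer. Passing the layer as a callable keeps the selection logic separate from the measurement, so the synthetic cost could be swapped for a real timing harness without changing `tune_layer`.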
