-
Publication number: US20220300578A1
Publication date: 2022-09-22
Application number: US17834427
Filing date: 2022-06-07
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20210103433A1
Publication date: 2021-04-08
Application number: US16591306
Filing date: 2019-10-02
Applicant: Nvidia Corporation
Inventor: Andrew Kerr, Mike Murphy, Mostafa Hagog, Julien Demouth, John Tran
Abstract: Apparatuses, systems, and techniques are presented to compile code. In at least one embodiment, one or more compilers are to compile one or more compiled portions of code with one or more intermediate representations of one or more portions of code.
-
Publication number: US12282526B2
Publication date: 2025-04-22
Application number: US18620228
Filing date: 2024-03-28
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20210406342A1
Publication date: 2021-12-30
Application number: US17471126
Filing date: 2021-09-09
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20240086491A1
Publication date: 2024-03-14
Application number: US18515062
Filing date: 2023-11-20
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
CPC classification number: G06F17/16, G06F9/3001, G06F9/30145, G06N3/08, G06N5/046
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20240256633A1
Publication date: 2024-08-01
Application number: US18620228
Filing date: 2024-03-28
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
CPC classification number: G06F17/16, G06F9/3001, G06F9/30145, G06N3/08, G06N5/046
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20220179703A1
Publication date: 2022-06-09
Application number: US17113993
Filing date: 2020-12-07
Applicant: NVIDIA Corporation
Inventor: Kevin Vincent, Yang Xu, Scott A. Yokim, Mostafa Hagog, Lingfeng Zhang, Seth Erickson Walters, Anerudhan Gopal
Abstract: Apparatuses, systems, and techniques to improve neural network computations. In at least one embodiment, a deep neural network library receives computation descriptors from one or more users and generates an optimized execution plan comprising one or more optimized operations to facilitate neural network computing.
-
Publication number: US20210256092A1
Publication date: 2021-08-19
Application number: US16795380
Filing date: 2020-02-19
Applicant: NVIDIA Corporation
Inventor: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
-
Publication number: US20200334076A1
Publication date: 2020-10-22
Application number: US16389548
Filing date: 2019-04-19
Applicant: Nvidia Corporation
Inventor: Brian Fahs, Michael Lightstone, Mostafa Hagog
Abstract: An application binary interface (ABI) can be exposed in a processor to enable blocks of threads, which may correspond to separately compiled operators, to communicate without storing data to global memory external to the processor. The ABI can define how results of one computation, corresponding to a first thread block, will be organized in registers and shared memory of a processor at the end of one operator (i.e., kernel). The start of the next operator (i.e., kernel), corresponding to a second thread block, can consume the results from the registers and shared memory. Data can be stored to processor local storage for individual threads as they exit the block. Once published, libraries can be separately compiled, optimized, and tested as long as they adhere to the published ABI.