-
Publication No.: WO2022169497A1
Publication Date: 2022-08-11
Application No.: PCT/US2021/060710
Filing Date: 2021-11-24
Applicant: QUALCOMM INCORPORATED
Inventor: SRINIVAS, Suraj; BLANKEVOORT, Tijmen Pieter Frederik; KUZMIN, Andrey; NAGEL, Markus; VAN BAALEN, Marinus Willem; SKLIAR, Andrii
Abstract: Various embodiments include methods and devices for neural network pruning. Embodiments may include receiving as an input a weight tensor for a neural network, increasing a level of sparsity of the weight tensor to generate a sparse weight tensor, updating the neural network using the sparse weight tensor to generate an updated weight tensor, decreasing a level of sparsity of the updated weight tensor to generate a dense weight tensor, increasing the level of sparsity of the dense weight tensor to generate a final sparse weight tensor, and using the neural network with the final sparse weight tensor to generate inferences. Some embodiments may include increasing a level of sparsity of a first sparse weight tensor to generate a second sparse weight tensor, updating the neural network using the second sparse weight tensor to generate a second updated weight tensor, and decreasing the level of sparsity of the second updated weight tensor.
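The prune, update, densify, re-prune cycle described in the abstract can be sketched in NumPy. This is an illustrative sketch only: the magnitude-based pruning criterion, the 50% sparsity level, and the placeholder "update" step (a small random perturbation standing in for training updates) are assumptions, not details taken from the filing.

```python
import numpy as np

def prune(w, sparsity):
    """Increase sparsity: zero out the smallest-magnitude fraction of entries."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # Threshold at the k-th smallest absolute value.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > thresh, w, 0.0)

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 16))      # input weight tensor for the network

# Increase sparsity of the weight tensor -> sparse weight tensor.
sparse_w = prune(w, 0.5)

# Placeholder "update" step: training repopulates zeroed entries,
# which decreases the level of sparsity -> dense weight tensor.
dense_w = sparse_w + 0.01 * rng.standard_normal(w.shape)

# Increase sparsity of the dense tensor -> final sparse weight tensor
# used by the network to generate inferences.
final_sparse_w = prune(dense_w, 0.5)
```

In this sketch the densify step falls out of the update itself: adding a dense perturbation refills the zeroed entries, after which a final pruning pass restores the target sparsity.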
-
Publication No.: WO2023059723A1
Publication Date: 2023-04-13
Application No.: PCT/US2022/045785
Filing Date: 2022-10-05
Applicant: QUALCOMM INCORPORATED
Inventor: KUZMIN, Andrey; VAN BAALEN, Marinus Willem; NAGEL, Markus; BEHBOODI, Arash
Abstract: A processor-implemented method includes retrieving, for a layer of a set of layers of an artificial neural network (ANN), a dense quantized matrix representing a codebook and a sparse quantized matrix representing linear coefficients. The dense quantized matrix and the sparse quantized matrix may be associated with a weight tensor of the layer. The processor-implemented method also includes determining, for the layer of the set of layers, the weight tensor based on a product of the dense quantized matrix and the sparse quantized matrix. The processor-implemented method further includes processing, at the layer, an input based on the weight tensor.
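The factorization described above, reconstructing a layer's weight tensor as the product of a dense codebook matrix and a sparse matrix of linear coefficients, can be sketched as follows. The matrix shapes, the coarse rounding standing in for quantization, and the one-nonzero-per-column sparsity pattern are illustrative assumptions, not details from the filing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense quantized matrix representing a codebook
# (coarse rounding stands in for a real quantization scheme).
codebook = np.round(rng.standard_normal((64, 8)), 1)

# Sparse quantized matrix of linear coefficients:
# here, one nonzero coefficient per output column.
coeffs = np.zeros((8, 32))
coeffs[rng.integers(0, 8, size=32), np.arange(32)] = 1.0

# Determine the layer's weight tensor as the product of the two factors.
weight = codebook @ coeffs

# Process an input at the layer based on the reconstructed weight tensor.
x = rng.standard_normal(64)
out = x @ weight
```

Storing the small dense codebook plus the sparse coefficient matrix can be far cheaper than storing the full 64×32 weight matrix, which is the point of the decomposition.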
-