-
Publication No.: US20230306233A1
Publication Date: 2023-09-28
Application No.: US18103428
Filing Date: 2023-01-30
Applicant: QUALCOMM Incorporated
Inventor: Marinus Willem VAN BAALEN , Brian KAHNE , Eric Wayne MAHURIN , Tijmen Pieter Frederik BLANKEVOORT , Andrey KUZMIN , Andrii SKLIAR , Markus NAGEL
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: A processor-implemented method includes bit shifting a binary representation of a neural network parameter. The neural network parameter has fewer bits, b, than the number of hardware bits, B, supported by the hardware that processes the neural network parameter. The bit shifting effectively multiplies the neural network parameter by 2^(B-b). The method also includes dividing a quantization scale by 2^(B-b) to obtain an updated quantization scale. The method further includes quantizing the bit-shifted binary representation with the updated quantization scale to obtain a value for the neural network parameter.
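The scale compensation described in the abstract can be sketched as follows. This is a minimal illustration of the arithmetic, not the patented implementation; the function name and signature are hypothetical. Left-shifting the b-bit integer by (B - b) bits multiplies it by 2^(B-b), and dividing the quantization scale by the same factor leaves the represented real value unchanged.

```python
def shift_to_hw_width(q_param: int, scale: float, b: int, B: int):
    """Shift a b-bit quantized parameter into B hardware bits.

    Left-shifting by (B - b) multiplies the integer by 2**(B - b);
    dividing the scale by the same factor compensates, so the
    dequantized value q * scale is unchanged.
    """
    shift = B - b
    q_shifted = q_param << shift           # multiply by 2**(B - b)
    scale_updated = scale / (2 ** shift)   # compensate in the scale
    return q_shifted, scale_updated

# A 4-bit value moved into 8 hardware bits: the integer grows by 2**4
# while the scale shrinks by 2**4, so the product is identical.
q, s = 5, 0.25
q_hw, s_hw = shift_to_hw_width(q, s, b=4, B=8)
assert q * s == q_hw * s_hw
```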
-
Publication No.: US20230376272A1
Publication Date: 2023-11-23
Application No.: US18102582
Filing Date: 2023-01-27
Applicant: QUALCOMM Incorporated
Inventor: Marinus Willem VAN BAALEN , Jorn Wilhelmus Timotheus PETERS , Markus NAGEL , Tijmen Pieter Frederik BLANKEVOORT , Andrey KUZMIN
Abstract: A processor-implemented method for fast floating point simulations with learnable parameters includes receiving a single precision input. An integer quantization process is performed on the input. Each element of the input is scaled based on a scaling parameter to generate an m-bit floating point output, where m is an integer.
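One way to read the abstract is that an m-bit floating-point value can be simulated by integer-quantizing each scaled element on the power-of-two grid implied by its exponent. The sketch below illustrates that idea for a single element; the function name, signature, and grid construction are assumptions for illustration, and the scale parameter stands in for the learnable scaling parameter.

```python
import math

def simulate_fp(x: float, mantissa_bits: int, scale: float) -> float:
    """Simulate a low-bit floating-point format for one element.

    The element is divided by a (learnable) scale, its power-of-two
    exponent is found, the value is rounded onto the grid spanned by
    `mantissa_bits` mantissa bits, and the result is rescaled.
    """
    if x == 0.0:
        return 0.0
    v = x / scale
    e = math.floor(math.log2(abs(v)))   # exponent of the enclosing binade
    grid = 2.0 ** (e - mantissa_bits)   # spacing of the mantissa grid
    v_q = round(v / grid) * grid        # integer quantization on that grid
    return v_q * scale
```

For example, with 2 mantissa bits and unit scale, 1.3 lands on the grid point 1.25, because the grid spacing in the binade [1, 2) is 2^-2 = 0.25.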
-
Publication No.: US20220245457A1
Publication Date: 2022-08-04
Application No.: US17456318
Filing Date: 2021-11-23
Applicant: QUALCOMM Incorporated
Inventor: Suraj SRINIVAS , Tijmen Pieter Frederik BLANKEVOORT , Andrey KUZMIN , Markus NAGEL , Marinus Willem VAN BAALEN , Andrii SKLIAR
Abstract: Various embodiments include methods and devices for neural network pruning. Embodiments may include receiving as an input a weight tensor for a neural network, increasing a level of sparsity of the weight tensor to generate a sparse weight tensor, updating the neural network using the sparse weight tensor to generate an updated weight tensor, decreasing a level of sparsity of the updated weight tensor to generate a dense weight tensor, increasing the level of sparsity of the dense weight tensor to generate a final sparse weight tensor, and using the neural network with the final sparse weight tensor to generate inferences. Some embodiments may include increasing a level of sparsity of a first sparse weight tensor to generate a second sparse weight tensor, updating the neural network using the second sparse weight tensor to generate a second updated weight tensor, and decreasing the level of sparsity of the second updated weight tensor.
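The alternating increase and decrease of sparsity in the abstract is built around a pruning step; a common choice (assumed here, the patent does not specify the criterion) is magnitude pruning, which zeroes the smallest-magnitude fraction of the weights. A minimal sketch, with a flat list standing in for the weight tensor and an illustrative function name:

```python
def prune_by_magnitude(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Ties at the threshold may prune slightly more than the requested
    fraction; a production implementation would break ties explicitly.
    """
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

In the schedule the abstract describes, this step would generate the sparse weight tensor, the network would then be updated with it, sparsity would be relaxed (pruned weights allowed to regrow during training) to yield a dense weight tensor, and a final pruning pass would produce the final sparse weight tensor used for inference.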
-