-
Publication No.: US11861452B1
Publication Date: 2024-01-02
Application No.: US16443634
Filing Date: 2019-06-17
Applicant: Cadence Design Systems, Inc.
Inventor: Ming Kai Hsu
Abstract: Quantized softmax layers in neural networks are described. Some embodiments involve receiving, at an input to a softmax layer of a neural network from an intermediate layer of the neural network, a non-normalized output comprising a plurality of intermediate network decision values. For each intermediate network decision value of the plurality of intermediate network decision values, the embodiment involves: calculating a difference between the intermediate network decision value and a maximum network decision value; requesting, from a lookup table, a corresponding lookup table value using that difference; and selecting the corresponding lookup table value as a corresponding decision value. A normalized output is then generated comprising the corresponding lookup table value for each intermediate network decision value.
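The flow the abstract describes (subtract the max logit, index a lookup table by the quantized difference, then normalize) can be sketched as follows. This is a minimal illustration of the technique, not the patented implementation; the table size and quantization step are assumed values.

```python
import numpy as np

# Assumed parameters: a 256-entry table indexed by the quantized
# difference from the max logit, with SCALE steps per unit of logit.
LUT_SIZE = 256
SCALE = 16.0

# Precompute exp(-d) for quantized non-negative differences d
LUT = np.exp(-np.arange(LUT_SIZE) / SCALE)

def lut_softmax(logits):
    logits = np.asarray(logits, dtype=np.float64)
    max_val = logits.max()  # maximum network decision value
    # Quantize (max - logit) to a lookup table index
    idx = np.clip(np.round((max_val - logits) * SCALE),
                  0, LUT_SIZE - 1).astype(int)
    vals = LUT[idx]           # corresponding lookup table values
    return vals / vals.sum()  # normalized output

probs = lut_softmax([2.0, 1.0, 0.1])
```

Because every table index is a difference from the max, the table only needs to cover non-positive exponents, which keeps the exponential values bounded in (0, 1] and avoids overflow in fixed-point hardware.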
-
Publication No.: US11630982B1
Publication Date: 2023-04-18
Application No.: US16131402
Filing Date: 2018-09-14
Applicant: Cadence Design Systems, Inc.
Inventor: Ming Kai Hsu, Sandip Parikh
Abstract: Aspects of the present disclosure address systems and methods for fixed-point quantization using a dynamic quantization level adjustment scheme. Consistent with some embodiments, a method comprises accessing a neural network comprising floating-point representations of filter weights corresponding to one or more convolution layers. The method further includes determining a peak value of interest from the filter weights and determining a quantization level for the filter weights based on a number of bits in a quantization scheme. The method further includes dynamically adjusting the quantization level based on one or more constraints. The method further includes determining a quantization scale of the filter weights based on the peak value of interest and the adjusted quantization level. The method further includes quantizing the floating-point representations of the filter weights using the quantization scale to generate fixed-point representations of the filter weights.
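The core scale computation (peak value of interest, quantization level from the bit width, scale from their ratio) can be sketched as below. This is a hedged illustration assuming symmetric signed quantization; the abstract's constraint-based dynamic adjustment of the quantization level is not detailed there, so it is left as a placeholder.

```python
import numpy as np

def quantize_weights(weights, num_bits=8):
    """Sketch of symmetric fixed-point weight quantization."""
    w = np.asarray(weights, dtype=np.float64)
    peak = np.abs(w).max()            # peak value of interest
    qlevel = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8-bit signed
    # Placeholder: the patent dynamically adjusts qlevel based on
    # constraints the abstract does not specify; we keep it fixed.
    scale = peak / qlevel if peak > 0 else 1.0  # quantization scale
    # Quantize floating-point weights to fixed-point representations
    q = np.clip(np.round(w / scale), -qlevel - 1, qlevel).astype(np.int8)
    return q, scale

q, scale = quantize_weights([0.5, -1.27, 0.01])
```

Dequantizing with `q * scale` recovers each weight to within half a quantization step, which is the usual accuracy bound for round-to-nearest schemes of this kind.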
-
Publication No.: US11861492B1
Publication Date: 2024-01-02
Application No.: US16727629
Filing Date: 2019-12-26
Applicant: Cadence Design Systems, Inc.
Inventor: Ming Kai Hsu
Abstract: Various embodiments provide for quantizing a trained neural network with removal of normalization with respect to at least one layer of the quantized neural network, such as a quantized multiple fan-in layer (e.g., element-wise add or sum layer).
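One way a multiple fan-in layer such as an element-wise add can avoid a separate normalization step is to fold the scale mismatch between its quantized inputs into a single rescaling before the integer add, so the output directly inherits one input's scale. The sketch below illustrates that idea only; it is an assumption about the mechanism, as the abstract does not spell out the removal scheme, and all names and scale values are illustrative.

```python
import numpy as np

def quantized_add(qa, scale_a, qb, scale_b):
    """Quantized element-wise add (a multiple fan-in layer).

    Input b is rescaled into input a's quantization scale, so no
    separate normalization layer is needed after the add.
    """
    qb_rescaled = np.round(np.asarray(qb) * (scale_b / scale_a))
    out = np.asarray(qa) + qb_rescaled
    return out.astype(np.int32), scale_a  # output shares scale_a

# Real values: a = [1.0, -0.5], b = [0.2, 0.4]; true sum = [1.2, -0.1]
out, s = quantized_add([10, -5], 0.1, [4, 8], 0.05)
```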
-