Invention Application
- Patent Title: MIXING SPARSITY COMPRESSION
- Application No.: US17449576
- Application Date: 2021-09-30
- Publication No.: US20230100930A1
- Publication Date: 2023-03-30
- Inventor: Xiaodan Tan, Paul Gilbert Meyer, Gennady Pekhimenko, Randy Renfu Huang
- Applicant: Amazon Technologies, Inc.
- Applicant Address: Seattle, WA, US
- Assignee: Amazon Technologies, Inc.
- Current Assignee: Amazon Technologies, Inc.
- Current Assignee Address: Seattle, WA, US
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/04

Abstract:
Techniques for compressing a neural network model by mixing compression ratios (sparsity patterns) are described. The weight tensor of a neural network model is divided into weight groups. The pruning cost of compressing the weight values according to a compression ratio is determined for each weight group, and a pruning cost distribution for the compression ratio is generated from the pruning costs of the weight groups. A cost threshold can then be selected from the pruning cost distribution, and weight groups having a pruning cost below the selected cost threshold are compressed according to the compression ratio. The remaining weight groups can be compressed using one or more less aggressive compression ratios. The cost threshold can be adjusted to tune the overall sparsity and accuracy of the compressed neural network.
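The abstract's procedure (group the weights, score each group's pruning cost under an aggressive ratio, pick a cost threshold from the resulting distribution, then mix aggressive and milder sparsity patterns) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the magnitude-based cost function, the percentile-based threshold, and the specific keep counts (1-of-4 vs. 2-of-4) are all assumptions chosen for clarity.

```python
import numpy as np

def pruning_cost(group, keep):
    """Assumed cost model: sum of magnitudes of the weights that
    would be zeroed when keeping only `keep` values per group."""
    mags = np.sort(np.abs(group))
    return mags[:len(group) - keep].sum()

def prune_group(group, keep):
    """Zero all but the `keep` largest-magnitude weights in the group."""
    drop = np.argsort(np.abs(group))[:len(group) - keep]
    out = group.copy()
    out[drop] = 0.0
    return out

def mixed_sparsity_compress(weights, group_size=4,
                            aggressive_keep=1, fallback_keep=2,
                            cost_percentile=50.0):
    """Compress a flat weight tensor with two mixed sparsity patterns.

    Groups whose pruning cost under the aggressive ratio falls below a
    threshold taken from the cost distribution use that ratio; the rest
    fall back to the less aggressive ratio. Raising `cost_percentile`
    trades accuracy for higher overall sparsity."""
    groups = weights.reshape(-1, group_size)
    costs = np.array([pruning_cost(g, aggressive_keep) for g in groups])
    threshold = np.percentile(costs, cost_percentile)
    out = np.empty_like(groups)
    for i, g in enumerate(groups):
        keep = aggressive_keep if costs[i] <= threshold else fallback_keep
        out[i] = prune_group(g, keep)
    return out.reshape(weights.shape)
```

With the defaults above, roughly half of the 4-element groups end up 1:4 sparse and the rest 2:4 sparse, so the mixed result is denser than uniform 1:4 pruning but cheaper in accuracy loss, which mirrors the tunable trade-off the abstract describes.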