COMPRESSION OF MACHINE LEARNING MODELS VIA SPARSIFICATION AND QUANTIZATION

    Publication Number: US20250094864A1

    Publication Date: 2025-03-20

    Application Number: US18602951

    Application Date: 2024-03-12

    Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a machine learning model, a compression technique can be employed which includes model sparsification and quantization. To limit the extent to which the quality of the model is impacted when uniformly applying sparsification and quantization to all values of the model, the present disclosure provides for a hybrid sparsification and quantization of the model.
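The combination the abstract describes can be illustrated with a minimal sketch: prune the smallest-magnitude weights to zero, then quantize the survivors to a low-precision integer grid. All function names, the magnitude-based pruning criterion, and the symmetric linear quantizer here are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def hybrid_compress(weights, sparsity=0.5, num_bits=8):
    """Sparsify then quantize a weight tensor (illustrative sketch).

    Sparsification: zero out the `sparsity` fraction of weights with
    the smallest magnitudes. Quantization: map the surviving values
    onto a symmetric `num_bits` integer grid.
    """
    w = weights.astype(float).flatten()  # work on a copy
    k = int(len(w) * sparsity)
    # Zero out the k smallest-magnitude weights.
    smallest = np.argsort(np.abs(w))[:k]
    w[smallest] = 0.0
    # Symmetric linear quantization of the survivors.
    max_abs = np.abs(w).max()
    scale = max_abs / (2 ** (num_bits - 1) - 1) if max_abs > 0 else 1.0
    q = np.round(w / scale).astype(np.int8)
    return q.reshape(weights.shape), scale
```

Dequantizing with `q * scale` recovers an approximation of the pruned weights; the two steps together reduce both the number of stored values and the bits per value.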

    PRUNING AND ACCELERATING NEURAL NETWORKS WITH HIERARCHICAL FINE-GRAINED STRUCTURED SPARSITY

    Publication Number: US20230062503A1

    Publication Date: 2023-03-02

    Application Number: US17681967

    Application Date: 2022-02-28

    Abstract: Hierarchical structured sparse parameter pruning and processing improves runtime performance and energy efficiency of neural networks. In contrast with conventional (non-structured) pruning which allows for any distribution of the non-zero values within a matrix that achieves the desired sparsity degree (e.g., 50%) and is consequently difficult to accelerate, structured hierarchical sparsity requires each multi-element unit at the coarsest granularity of the hierarchy to be pruned to the desired sparsity degree. The global desired sparsity degree is a function of the per-level sparsity degrees. Distribution of non-zero values within each multi-element unit is constrained according to the per-level sparsity degree at the particular level of the hierarchy. Each level of the hierarchy may be associated with a hardware (e.g., logic or circuit) structure that can be enabled or disabled according to the per-level sparsity. Hierarchical sparsity provides performance improvements for a greater variety of sparsity patterns, granularity, and sparsity degrees.
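A two-level version of the scheme in the abstract can be sketched as follows: a coarse level keeps only the strongest groups within each tile, and a fine level keeps only the strongest elements within each surviving group, so the global density is the product of the per-level keep ratios. The group sizes, the L1-norm selection criterion, and the function name are all illustrative assumptions rather than details from the patent.

```python
import numpy as np

def hierarchical_prune(matrix, fine_keep=2, fine_group=4,
                       coarse_keep=2, coarse_group=4):
    """Two-level structured pruning (illustrative sketch).

    Coarse level: within each tile of `coarse_group` fine groups,
    keep the `coarse_keep` groups with the largest L1 norm.
    Fine level: within each surviving group of `fine_group` elements,
    keep the `fine_keep` largest-magnitude values.
    Global density = (coarse_keep/coarse_group) * (fine_keep/fine_group).
    """
    w = matrix.astype(float).reshape(-1, coarse_group, fine_group)
    mask = np.zeros_like(w)
    for tile in range(w.shape[0]):
        # Coarse level: rank fine groups in this tile by L1 norm.
        norms = np.abs(w[tile]).sum(axis=1)
        for g in np.argsort(norms)[-coarse_keep:]:
            # Fine level: keep the largest-magnitude elements in the group.
            top = np.argsort(np.abs(w[tile, g]))[-fine_keep:]
            mask[tile, g, top] = 1.0
    return (w * mask).reshape(matrix.shape)
```

With the defaults (keep 2 of 4 at each level), 25% of the values survive, and every nonzero obeys the per-level constraints, which is what makes the pattern amenable to hardware that can skip disabled groups.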

    GENERATING SPARSE NEURAL NETWORKS
    Invention Publication

    Publication Number: US20240152407A1

    Publication Date: 2024-05-09

    Application Number: US18222916

    Application Date: 2023-07-17

    CPC classification number: G06F9/5083 G06F7/5443

    Abstract: Apparatuses, systems, and techniques to determine a configuration based at least in part on data stored by at least one data structure of a workload at runtime, and transform the workload into a sparse workload based at least in part on the configuration. In at least one embodiment, one or more sparse workloads (e.g., one or more sparse neural networks) are generated based at least in part on, for example, one or more workloads (e.g., one or more neural networks).
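The runtime decision the abstract describes can be sketched as: inspect the data a workload actually stores, and if it is sparse enough, transform the workload into a sparse representation with a matching sparse kernel. The density threshold, the CSR-style layout, and both function names below are illustrative assumptions, not details from the patent.

```python
import numpy as np

DENSITY_THRESHOLD = 0.5  # illustrative cutoff, not from the patent

def maybe_sparsify(weight):
    """Pick a dense or sparse representation from runtime data.

    Returns ('dense', weight) unchanged, or ('sparse', payload) where
    payload is a CSR-like triple (values, col_idx, row_ptr).
    """
    density = np.count_nonzero(weight) / weight.size
    if density > DENSITY_THRESHOLD:
        return 'dense', weight
    values, cols, row_ptr = [], [], [0]
    for row in weight:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        cols.extend(nz)
        row_ptr.append(len(values))
    return 'sparse', (np.array(values), np.array(cols, dtype=int),
                      np.array(row_ptr))

def matvec(kind, payload, x):
    """Matrix-vector product dispatched on the chosen representation."""
    if kind == 'dense':
        return payload @ x
    values, cols, row_ptr = payload
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        s, e = row_ptr[i], row_ptr[i + 1]
        y[i] = values[s:e] @ x[cols[s:e]]  # only touch nonzeros
    return y
```

The sparse path stores and multiplies only the nonzero entries, so the transformation pays off exactly when the runtime inspection finds the data below the density cutoff.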
