-
Publication number: US10997492B2
Publication date: 2021-05-04
Application number: US15838273
Filing date: 2017-12-11
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz, Hao Wu, Dilip Sequeira, Ujval Kapasi, Maxim Milakov, Slawomir Kierat, Zacky Zhou, Yilin Zhang, Alex Fit-Florea
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., least divergence from the original higher precision values). The representation can then be used during inference to perform computations on the resulting output data.
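Illustrative sketch (not taken from the patent text): the abstract's "generate candidate conversions, then keep the one with least divergence" idea can be pictured roughly as below. Everything specific here is an assumption made for illustration: the candidates are saturation thresholds for symmetric linear INT8 quantization, the quality measure is a KL divergence between coarse histograms of the original and dequantized values, and the names `quantize_int8`, `kl_divergence`, and `pick_threshold` are hypothetical.

```python
import numpy as np


def quantize_int8(x, threshold):
    """Assumed scheme: saturate to [-threshold, threshold], map linearly to [-127, 127]."""
    scale = threshold / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two histograms, smoothed so empty bins do not divide by zero."""
    p = p.astype(np.float64) + eps
    q = q.astype(np.float64) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def pick_threshold(x, num_candidates=32, bins=128):
    """Try several candidate thresholds and keep the one whose dequantized
    values diverge least from the original higher precision values."""
    x = np.asarray(x, dtype=np.float64)
    max_abs = float(np.abs(x).max())
    edges = np.linspace(-max_abs, max_abs, bins + 1)
    ref_hist, _ = np.histogram(x, bins=edges)

    best = None
    for t in np.linspace(max_abs / num_candidates, max_abs, num_candidates):
        q, scale = quantize_int8(x, t)
        # Dequantize and keep values inside the histogram range.
        deq = np.clip(q.astype(np.float64) * scale, -max_abs, max_abs)
        cand_hist, _ = np.histogram(deq, bins=edges)
        d = kl_divergence(ref_hist, cand_hist)
        if best is None or d < best[1]:
            best = (float(t), d)
    return best  # (chosen threshold, its divergence)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    activations = rng.normal(size=10_000).astype(np.float32)
    threshold, divergence = pick_threshold(activations)
    print(f"threshold={threshold:.4f} divergence={divergence:.6f}")
```

In this sketch the chosen threshold would then fix the scale used to convert that tensor to INT8 at inference time; the bin count and the number of candidates are arbitrary choices, not values from the source.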
-
Publication number: US20180211152A1
Publication date: 2018-07-26
Application number: US15838273
Filing date: 2017-12-11
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz, Hao Wu, Dilip Sequeira, Ujval Kapasi, Maxim Milakov, Slawomir Kierat, Zacky Zhou, Yilin Zhang, Alex Fit-Florea
CPC classification number: G06N3/04, G06N3/0454, G06N3/08, G06N7/00
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., least divergence from the original higher precision values). The representation can then be used during inference to perform computations on the resulting output data.
-
Publication number: US20240119267A1
Publication date: 2024-04-11
Application number: US17950009
Filing date: 2022-09-21
Applicant: NVIDIA Corporation
Inventor: Slawomir Kierat, Piotr Karpinski, Mateusz Sieniawski, Pawel Morkisz, Szymon Migacz, Linnan Wang, Chen-Han Yu, Satish Salian, Ashwath Aithal, Alexandru Fit-Florea
CPC classification number: G06N3/0481, G06N3/08
Abstract: Apparatuses, systems, and techniques to selectively use one or more neural network layers. In at least one embodiment, one or more neural network layers are selectively used based on, for example, one or more iteratively increasing neural network performance metrics.
-
Publication number: US20210256348A1
Publication date: 2021-08-19
Application number: US17306171
Filing date: 2021-05-03
Applicant: NVIDIA Corporation
Inventor: Szymon Migacz, Hao Wu, Dilip Sequeira, Ujval Kapasi, Maxim Milakov, Slawomir Kierat, Zacky Zhou, Yilin Zhang, Alex Fit-Florea
Abstract: Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format. Embodiments of the present invention generate candidate conversions of data output, then employ a relative measure of quality to identify the candidate conversion with the greatest accuracy (i.e., least divergence from the original higher precision values). The representation can then be used during inference to perform computations on the resulting output data.
-