-
Publication No.: US11669933B2
Publication date: 2023-06-06
Application No.: US17730364
Application date: 2022-04-27
Applicant: Intel Corporation
Inventor: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
IPC: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/544, G06F17/15, G06F17/16, G06N3/063, G06N3/084, G06N3/044, G06N3/045
CPC classification number: G06T1/20, G06F5/01, G06F7/501, G06F7/523, G06F7/5443, G06F17/153, G06F17/16, G06N3/044, G06N3/045, G06N3/063, G06N3/084, G06F2207/382, G06F2207/4824
Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to quantize elements of a floating-point tensor to convert the floating-point tensor into a dynamic fixed-point tensor.
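As an illustration of the conversion this abstract describes, the following is a minimal software sketch of quantizing a floating-point tensor into integers that share one exponent. It is not the patented hardware: the function names `quantize_dfp` and `dequantize_dfp`, the `bits` parameter, and the choice of deriving the exponent from the largest magnitude are all assumptions made for illustration.

```python
import math


def quantize_dfp(values, bits=16):
    """Quantize floats to signed integers that share a single exponent.

    Hypothetical helper: each value is approximated as int * 2**shared_exp.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0] * len(values), 0
    qmax = (1 << (bits - 1)) - 1
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so scaling by
    # 2**(e - (bits - 1)) places the largest magnitude near the top of the range.
    _, e = math.frexp(max_abs)
    shared_exp = e - (bits - 1)
    scale = 2.0 ** shared_exp
    # Clamp in case rounding pushes the largest value just past qmax.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, shared_exp


def dequantize_dfp(ints, shared_exp):
    """Recover approximate floats from the integers and the shared exponent."""
    return [i * 2.0 ** shared_exp for i in ints]


ints, shared_exp = quantize_dfp([0.5, -1.25, 3.0], bits=8)
restored = dequantize_dfp(ints, shared_exp)
```

Storing one exponent per tensor rather than per element is what lets the arithmetic run on integer hardware; the cost is that small elements lose precision when one element is much larger than the rest.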
-
Publication No.: US11556772B2
Publication date: 2023-01-17
Application No.: US15869515
Application date: 2018-01-12
Applicant: Intel Corporation
Inventor: Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das
IPC: G06N3/08, G06N5/04, G06N3/04, G06T15/00, G06F9/46, G06N3/063, G06T17/20, G06T15/80, G06T17/10, G06T15/04, G06V10/94
Abstract: One embodiment provides for a computing device comprising a parallel processor compute unit to perform a set of parallel integer compute operations; a ternarization unit including a weight ternarization circuit and an activation quantization circuit; wherein the weight ternarization circuit is to convert a weight tensor from a floating-point representation to a ternary representation including a ternary weight and a scale factor; wherein the activation quantization circuit is to convert an activation tensor from a floating-point representation to an integer representation; and wherein the parallel processor compute unit includes one or more circuits to perform the set of parallel integer compute operations on the ternary representation of the weight tensor and the integer representation of the activation tensor.
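The data flow in this abstract (ternary weights plus a scale factor, integer activations, integer accumulation) can be sketched in software as follows. This is only an illustration under stated assumptions: the threshold rule (0.7 times the mean absolute weight), the symmetric 8-bit activation quantizer, and all function names are hypothetical choices, not details taken from the patent.

```python
def ternarize(weights, threshold_ratio=0.7):
    """Map weights to {-1, 0, +1} plus one floating-point scale factor.

    Assumption: weights at or below threshold_ratio * mean(|w|) become 0.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    ternary = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    # The scale is the mean magnitude of the weights that survived.
    scale = sum(kept) / len(kept) if kept else 0.0
    return ternary, scale


def quantize_activations(acts, bits=8):
    """Symmetric quantization of activations to signed integers."""
    qmax = (1 << (bits - 1)) - 1
    max_abs = max((abs(a) for a in acts), default=0.0)
    step = max_abs / qmax if max_abs else 1.0
    return [round(a / step) for a in acts], step


def ternary_dot(ternary, w_scale, q_acts, a_step):
    """Accumulate in integers only, then apply both scales once at the end."""
    acc = sum(t * a for t, a in zip(ternary, q_acts))
    return acc * w_scale * a_step


ternary, w_scale = ternarize([0.9, -0.1, -1.1, 0.05])
q_acts, a_step = quantize_activations([1.0, 2.0, -0.5, 4.0])
approx = ternary_dot(ternary, w_scale, q_acts, a_step)
```

Because each ternary weight is -1, 0, or +1, the inner loop reduces to additions and subtractions of integer activations, which is why the compute unit only needs integer operations.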
-
Publication No.: US20210342692A1
Publication date: 2021-11-04
Application No.: US17321044
Application date: 2021-05-14
Applicant: Intel Corporation
Inventor: Naveen K. Mellempudi, Srinivas Sridharan, Dheevatsa Mudigere, Dipankar Das
Abstract: Technologies for artificial neural network training include a computing node with a host fabric interface that sends a message that includes one or more artificial neural network training algorithm values to another computing node in response to receipt of a request to send the message. Prior to sending the message, the host fabric interface may receive a request to quantize the message and quantize the message based on a quantization level included in the request to generate a quantized message. The quantized message includes one or more quantized values such that each quantized value has a lower precision than a corresponding artificial neural network training algorithm value. The host fabric interface then transmits the quantized message, which includes metadata indicative of the quantization level, to another computing node in response to quantization of the message for artificial neural network training. Other embodiments are described and claimed.
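A rough software analogue of the quantize-then-send step might look like the sketch below. The abstract only says the message carries metadata indicative of the quantization level; the dictionary layout, the `scale` field, and the names `quantize_message` and `dequantize_message` are assumptions introduced here so the receiver can reconstruct the values.

```python
def quantize_message(values, level_bits):
    """Quantize training values (e.g. gradients) to level_bits-wide integers.

    Hypothetical message format: the payload travels with metadata describing
    how it was quantized.
    """
    qmax = (1 << (level_bits - 1)) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs else 1.0
    payload = [round(v / scale) for v in values]
    return {"level_bits": level_bits, "scale": scale, "payload": payload}


def dequantize_message(message):
    """Receiver side: rebuild approximate values from payload and metadata."""
    return [q * message["scale"] for q in message["payload"]]


gradients = [0.5, -0.25, 0.125]
message = quantize_message(gradients, level_bits=8)
received = dequantize_message(message)
```

With 8-bit values in place of 32-bit floats the payload is roughly a quarter of the original size, at the cost of a per-value error of at most half a quantization step.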
-
Publication No.: US10825127B2
Publication date: 2020-11-03
Application No.: US16853405
Application date: 2020-04-20
Applicant: Intel Corporation
Inventor: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
IPC: G06T1/20, G06N3/08, G06N3/04, G06F7/544, G06F17/15, G06F5/01, G06F7/523, G06F17/16, G06N3/063, G06F7/501
Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to receive a set of dynamic fixed-point tensors, compute, via the dynamic precision fixed-point logic, a right-shift value using an absolute maximum value within the set of dynamic fixed-point tensors and a dynamic range of the set of dynamic fixed-point tensors, right-shift data values within the set of dynamic fixed-point tensors based on the right-shift value, increment a shared exponent associated with the set of dynamic fixed-point tensors based on the right-shift value, perform a compute operation on the set of dynamic fixed-point tensors, and generate an output tensor via the compute operation on the set of dynamic fixed-point tensors.
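The shift-and-increment step in this abstract can be illustrated with a short sketch. It assumes the "dynamic range" is expressed as a target bit width and that the shift is derived from the bit length of the absolute maximum; the name `renormalize` and the `target_bits` parameter are hypothetical, and the patent's actual logic may differ.

```python
def renormalize(ints, shared_exp, target_bits=16):
    """Right-shift integers so the largest fits in target_bits (sign included),
    and raise the shared exponent by the same amount so values are preserved.
    """
    max_abs = max((abs(v) for v in ints), default=0)
    # Number of excess magnitude bits beyond what target_bits can hold.
    shift = max(0, max_abs.bit_length() - (target_bits - 1))
    # Python's >> is an arithmetic shift: negative values round toward -inf.
    shifted = [v >> shift for v in ints]
    return shifted, shared_exp + shift


shifted, new_exp = renormalize([1000, -300, 40], shared_exp=-10, target_bits=8)
```

Each right shift by one bit halves the stored integers, so incrementing the shared exponent by the shift amount keeps `int * 2**shared_exp` approximately unchanged while leaving headroom for the following compute operation.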