-
1.
Publication number: US12008069B1
Publication date: 2024-06-11
Application number: US18523627
Application date: 2023-11-29
Applicant: Recogni Inc.
Inventor: Jian hui Huang, Gary S. Goldman
CPC classification number: G06F17/16, G06F7/5443
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
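The first mode in the abstract above (matrix multiplication or 1×1 convolution built from per-cycle dot products that are accumulated over time) can be illustrated with a minimal Python sketch. It is an illustration only, not the patented hardware: the names `ProcessingElement`, `matmul_with_pes`, `conv1x1`, and the `chunk` parameter are hypothetical, one loop iteration stands in for one clock cycle, and the 3×3 convolution mode is not modeled.

```python
class ProcessingElement:
    """Hypothetical PE: one dot product per step, accumulated across steps."""

    def __init__(self):
        self.acc = 0

    def step(self, a, b):
        # One simulated clock cycle: dot product of two equal-length vectors,
        # added to the running total held by this PE.
        assert len(a) == len(b)
        self.acc += sum(x * y for x, y in zip(a, b))
        return self.acc


def matmul_with_pes(A, B, chunk):
    """Multiply A (m x k) by B (k x n).

    Assumption: one PE per output element, consuming `chunk` elements of the
    shared dimension per step.
    """
    m, k, n = len(A), len(B), len(B[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            pe = ProcessingElement()
            column = [B[r][j] for r in range(k)]
            for s in range(0, k, chunk):
                pe.step(A[i][s:s + chunk], column[s:s + chunk])
            out[i][j] = pe.acc
    return out


def conv1x1(pixels, weights, chunk):
    """1x1 convolution expressed as a matrix product.

    pixels:  num_pixels x c_in  (one row of input channels per pixel)
    weights: c_in x c_out
    """
    return matmul_with_pes(pixels, weights, chunk)
```

A 1×1 convolution mixes only the channels of each pixel, which is why it reduces to the same matrix product; for example, `conv1x1([[1, 2, 3], [4, 5, 6]], [[1, 0], [0, 1], [1, 1]], chunk=2)` treats two pixels with three input channels and produces two output channels per pixel.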
-
2.
Publication number: US12045309B1
Publication date: 2024-07-23
Application number: US18523615
Application date: 2023-11-29
Applicant: Recogni Inc.
Inventor: Jian hui Huang, Gary S. Goldman
CPC classification number: G06F17/16, G06F7/5443
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
-
3.
Publication number: US12039290B1
Publication date: 2024-07-16
Application number: US18408296
Application date: 2024-01-09
Applicant: Recogni Inc.
Inventor: Jian hui Huang, Gary S. Goldman
CPC classification number: G06F7/5443, G06F7/5095
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
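The two-stage accumulation described in the abstract above can be sketched in software as follows. This is a behavioral illustration under stated assumptions, not the patented circuit: the class `TwoStageAccumulator`, the `period` parameter (standing in for the user-adjustable number of values per accumulation period), and the helper `mac` are hypothetical names, Python integers are unbounded so the bit-width relationship between the stages is not modeled, and the shared-adder embodiment is omitted.

```python
class TwoStageAccumulator:
    """Hypothetical two-stage accumulator for a MAC unit."""

    def __init__(self, period):
        if period < 1:
            raise ValueError("period must be at least 1")
        self.period = period  # values per accumulation period (assumed knob)
        self.first = 0        # first stage: sums products within a period
        self.second = 0       # second stage: sums first-stage running totals
        self.count = 0        # products seen in the current period

    def add(self, product):
        self.first += product
        self.count += 1
        if self.count == self.period:
            self.flush()

    def flush(self):
        # The second stage absorbs the first stage's running total, and the
        # first stage is re-initialized for a new accumulation period.
        self.second += self.first
        self.first = 0
        self.count = 0


def mac(xs, ws, period):
    """Multiply-accumulate two equal-length sequences through both stages."""
    acc = TwoStageAccumulator(period)
    for x, w in zip(xs, ws):
        acc.add(x * w)
    acc.flush()  # fold in any partial final period
    return acc.second
```

With `period=2`, `mac([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], 2)` flushes the first stage after the second and fourth products and once more at the end, so the result does not depend on the period chosen.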
-
4.
Publication number: US12026478B1
Publication date: 2024-07-02
Application number: US18408309
Application date: 2024-01-09
Applicant: Recogni Inc.
Inventor: Jian hui Huang , Gary S. Goldman
IPC: G06F7/52
CPC classification number: G06F7/52
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
-
5.
Publication number: US12165041B2
Publication date: 2024-12-10
Application number: US17806143
Application date: 2022-06-09
Applicant: Recogni Inc.
Inventor: Shabarivas Abhiram, Gary S. Goldman, Jian hui Huang, Eugene M. Feinberg
Abstract: In a low power hardware architecture for handling accumulation overflows in a convolver unit, an accumulator of the convolver unit computes a running total by successively summing dot products from a dot product computation module during an accumulation cycle. In response to the running total overflowing the maximum or minimum value of a data storage element, the accumulator transmits an overflow indicator to a controller and sets its output equal to a positive or negative overflow value. In turn, the controller disables the dot product computation module by clock gating, clamping one of its inputs to zero and/or holding its inputs to constant values. At the end of the accumulation cycle, the output of the accumulator is sampled. In response to a clear signal being asserted, the dot product computation module is enabled, and the running total is set to zero for the start of the next accumulation cycle.
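The overflow handling described in the abstract above can be modeled behaviorally with a short Python sketch. Several things here are assumptions rather than details from the patent: the names `SaturatingAccumulator` and `run_cycle`, the 8-bit signed storage width, and the choice of the storage element's maximum and minimum as the positive and negative overflow values. Skipping the dot product in software stands in for disabling the hardware module (clock gating or input clamping), which this sketch does not model.

```python
class SaturatingAccumulator:
    """Hypothetical accumulator that saturates and flags overflow."""

    def __init__(self, bits=8):
        # Range of a signed storage element of the assumed width.
        self.max_val = (1 << (bits - 1)) - 1
        self.min_val = -(1 << (bits - 1))
        self.clear()

    def clear(self):
        # Clear signal: running total returns to zero for the next cycle.
        self.total = 0
        self.overflowed = False

    def add(self, value):
        if self.overflowed:
            return  # output stays pinned until cleared
        s = self.total + value
        if s > self.max_val:
            self.total, self.overflowed = self.max_val, True
        elif s < self.min_val:
            self.total, self.overflowed = self.min_val, True
        else:
            self.total = s


def run_cycle(acc, vector_pairs):
    """Run one accumulation cycle.

    Returns (sampled output, number of dot products actually computed).
    """
    computed = 0
    for a, b in vector_pairs:
        if acc.overflowed:
            continue  # controller has disabled the dot product module
        acc.add(sum(x * y for x, y in zip(a, b)))
        computed += 1
    return acc.total, computed
```

Once the overflow flag is set, the remaining dot products in the cycle are skipped, which is where the power saving would come from in hardware; calling `acc.clear()` re-enables accumulation for the next cycle.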
-
6.
Publication number: US20220076104A1
Publication date: 2022-03-10
Application number: US16948164
Application date: 2020-09-04
Applicant: Recogni Inc.
Inventor: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n
-
7.
Publication number: US12141685B2
Publication date: 2024-11-12
Application number: US18410736
Application date: 2024-01-11
Applicant: Recogni Inc.
Inventor: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n
-
8.
Publication number: US11915126B2
Publication date: 2024-02-27
Application number: US16948164
Application date: 2020-09-04
Applicant: Recogni Inc.
Inventor: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
CPC classification number: G06N3/063, G06F7/50, G06F7/5443, G06N3/0464
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n
-
9.
Publication number: US12007937B1
Publication date: 2024-06-11
Application number: US18523632
Application date: 2023-11-29
Applicant: Recogni Inc.
Inventor: Jian hui Huang, Gary S. Goldman
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
-
10.
Publication number: US20240143988A1
Publication date: 2024-05-02
Application number: US18410736
Application date: 2024-01-11
Applicant: Recogni Inc.
Inventor: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
IPC: G06N3/063, G06F7/50, G06F7/544, G06N3/0464
CPC classification number: G06N3/063, G06F7/50, G06F7/5443, G06N3/0464
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n