-
Publication No.: US20240362296A1
Publication date: 2024-10-31
Application No.: US18308175
Filing date: 2023-04-27
Inventors: Yaron Baruch SHAPIRO, Evgeny ROYZEN, Roi ELAD
CPC classification: G06F17/16, G06F12/0207, G11C11/54
Abstract: Techniques for reduced latency tensor transposition without using a redundant buffer are enabled. Reads from and writes to a buffer array may occur in different dimensions in a neural processing unit (NPU). For example, a set of tensor vectors may be written to a buffer in columnar format and read from the buffer in row format. As vectors of a first tensor are read from the buffer, incoming vectors from a second tensor may be transposed for storage in the dimension of already-read vectors without overwriting unread vectors. Write and read operations may alternately transpose vectors for continuous buffering in a single buffer with reduced latency.
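
The alternating write/read orientation described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class name `TransposeBuffer`, the methods `load_first` and `swap`, and the square N x N buffer of Python lists are all hypothetical choices made for illustration.

```python
class TransposeBuffer:
    """Hypothetical model of one N x N buffer whose access orientation
    flips with every tensor, so no second (redundant) buffer is needed."""

    def __init__(self, n):
        self.n = n
        self.cells = [[None] * n for _ in range(n)]
        # Orientation in which the currently stored tensor was written.
        self.written_by_column = True

    def _write(self, index, vector, by_column):
        for k, value in enumerate(vector):
            if by_column:
                self.cells[k][index] = value
            else:
                self.cells[index][k] = value

    def _read(self, index, by_column):
        if by_column:
            return [self.cells[k][index] for k in range(self.n)]
        return list(self.cells[index])

    def load_first(self, vectors):
        """Write the first tensor, one vector per column."""
        self.written_by_column = True
        for i, vector in enumerate(vectors):
            self._write(i, vector, by_column=True)

    def swap(self, incoming=None):
        """Read the stored tensor in the opposite dimension (transposed).

        Each line that has just been read is immediately refilled with one
        vector of the incoming tensor, so unread lines are never overwritten.
        """
        read_by_column = not self.written_by_column
        transposed = []
        for i in range(self.n):
            transposed.append(self._read(i, read_by_column))
            if incoming is not None:
                self._write(i, incoming[i], read_by_column)
        # The incoming tensor now sits in the dimension that was just read.
        self.written_by_column = read_by_column
        return transposed


if __name__ == "__main__":
    buf = TransposeBuffer(2)
    buf.load_first([[1, 2], [3, 4]])
    print(buf.swap([[5, 6], [7, 8]]))  # first tensor, transposed
    print(buf.swap())                  # second tensor, transposed
```

In this model the first tensor is written by column and read by row, the second is written by row and read by column, and so on, which mirrors the "alternately transpose" behavior the abstract describes.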
-
Publication No.: US20230244921A1
Publication date: 2023-08-03
Application No.: US17588657
Filing date: 2022-01-31
Inventors: Evgeny ROYZEN, Evgeny ROGACHOV
CPC classification: G06N3/0635, G06J1/00
Abstract: Power efficient performance may be implemented in a hardware accelerator (e.g., a neural processor) comprising hybrid or analog multiply and accumulate (MAC) processing elements (PEs). For example, power consumption may be reduced in neural networks with a rectified linear unit (ReLU) activation layer. A hybrid or analog MAC circuit may be configured with a look-ahead sign detector to dynamically stop computations prior to completion, for example, based on detection of a negative value, which a ReLU activation layer may (e.g., subsequently) convert to zero. The sign of a value may be indicated by a most significant bit (MSB). A controller may provide power and/or clock cycles to an analog to digital converter (ADC) to determine a sign of a value being computed. The sign may be used to selectively complete computations for positive values and selectively terminate computations for negative values, thereby reducing power consumption of the MAC circuit.
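
The sign look-ahead idea can be sketched in software as follows. This is a behavioral illustration only, under assumptions the abstract does not state: the ADC is modeled as a successive-approximation converter that resolves the sign (MSB) in its first cycle, and the function name `relu_mac_with_sign_lookahead`, the 8-bit width, and the `full_scale` value are hypothetical.

```python
def relu_mac_with_sign_lookahead(weights, activations, bits=8, full_scale=128.0):
    """Return (relu_output_code, adc_cycles_used) for one MAC result.

    Assumed model: cycle 1 resolves the sign bit; the remaining bits - 1
    cycles resolve the magnitude, MSB first, and run only for
    non-negative sums.
    """
    # Stand-in for the analog accumulation of products.
    analog_sum = sum(w * a for w, a in zip(weights, activations))

    # Cycle 1: a comparison against zero yields the sign (the MSB).
    cycles = 1
    if analog_sum < 0:
        # ReLU would map this value to zero, so conversion stops here.
        return 0, cycles

    # Remaining cycles: successive approximation of the magnitude.
    lsb = full_scale / (1 << (bits - 1))
    code = 0
    for bit in range(bits - 2, -1, -1):
        trial = code | (1 << bit)
        cycles += 1
        if trial * lsb <= analog_sum:
            code = trial
    return code, cycles


if __name__ == "__main__":
    print(relu_mac_with_sign_lookahead([1, 2], [2, 4]))   # positive sum
    print(relu_mac_with_sign_lookahead([1, -3], [2, 4]))  # negative sum
```

In this model a negative sum costs one conversion cycle instead of eight, which is the kind of saving the abstract attributes to terminating computations for negative values; actual savings depend on the circuit, which this sketch does not model.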
-