Dynamic tile parallel neural network accelerator

    Publication No.: US11494627B1

    Publication date: 2022-11-08

    Application No.: US17370171

    Filing date: 2021-07-08

    Abstract: A dynamic-tile neural network accelerator allows the number and size of computational tiles to be reconfigured. Each sub-array of computational cells has edge cells in the left-most column with an added vector mux that feeds the cell output back to an adder-comparator, allowing Rectified Linear Unit (ReLU) and pooling operations that combine outputs shifted in from other cells. The edge cells drive external output registers and receive external weights. Weights and outputs are shifted in opposite directions horizontally between cells, while control and input data are shifted in the same direction vertically between cells. A column of row data selectors is inserted between sub-arrays to bypass weights and output data around sub-arrays, while a row of column data selectors is inserted between sub-arrays to bypass control and input data. Larger tiles are configured by passing data directly through these selectors without bypassing.
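
The feedback path in the edge cells can be pictured as a running comparison: the cell holds one output, and each output shifted in from a neighbouring cell is compared against it. The Python sketch below is only a behavioural illustration of that idea, not the patented circuit. The function names `relu` and `max_pool_feedback`, the use of max pooling, and the integer data type are assumptions made for this example.

```python
from typing import Iterable


def relu(value: int) -> int:
    """Comparator against zero: negative values are clamped to 0."""
    return value if value > 0 else 0


def max_pool_feedback(shifted_outputs: Iterable[int]) -> int:
    """Fold outputs that arrive one at a time into a single running maximum.

    `held` stands in for the edge cell's own output, which the vector mux
    feeds back into the comparator on every step (an assumption of this
    sketch; the abstract does not specify the pooling type).
    """
    values = iter(shifted_outputs)
    held = next(values)          # first output loaded into the edge cell
    for incoming in values:      # remaining outputs shifted in from other cells
        held = incoming if incoming > held else held
    return held


# Hypothetical data: four outputs pooled into one value.
pooled = max_pool_feedback([-3, 7, 2, 5])

# Pooling followed by ReLU on an all-negative window.
activated = relu(max_pool_feedback([-4, -1, -9]))
```

Because the comparison reuses a single held value, the pooling window can be any length without extra storage, which is one plausible reason to place the feedback mux only in the edge column.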

Cross-product detection method for a narrowband signal under a wide range of carrier frequency offset (CFO) using multiple frequency bins

    Publication No.: US10785074B1

    Publication date: 2020-09-22

    Application No.: US16861370

    Filing date: 2020-04-29

    Inventors: Jianhui Wang, Tao Li

    Abstract: A synchronizer generates cross-products of In-phase (I) and Quadrature (Q) samples and stores the sign bits of the sine and cosine cross-products. The sign bits are compared to a local reference of a frame-start bit-sequence, and the compare results are accumulated as I and Q correlations for symbol and half-symbol sampling. Linear combinations of the accumulated I and Q correlations for the symbol and half-symbol sampling generate results for frequency bins, each of which peaks at a different implied Carrier Frequency Offset (CFO) setting. The maximum of the linear combination results is selected, and the implied CFO setting for that frequency bin is applied to a demodulator to adjust the receiver's CFO setting and bit synchronization. Computational complexity is reduced because only the sign bit of each cross-product is retained for correlation with the frame-start bit-sequence. The linear combinations can support a wide CFO range.
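
The detection chain described above (cross-products, sign bits, correlation, linear combination per frequency bin, maximum selection) can be illustrated with a small Python sketch. It is a simplified toy, not the patented method: it uses a single one-sample delay instead of separate symbol and half-symbol sampling, and the combining weights `-sin(p)` and `cos(p)`, the three bin rotations, the reference sequence, and all function names are assumptions chosen for this example.

```python
import math


def sign_bit(x: float) -> int:
    """Keep only the sign of a cross-product, as +1 or -1."""
    return 1 if x >= 0 else -1


def cross_product_signs(i, q, delay=1):
    """Sign bits of the cosine and sine cross-products of delayed I/Q samples."""
    cos_signs, sin_signs = [], []
    for n in range(delay, len(i)):
        cos_xp = i[n] * i[n - delay] + q[n] * q[n - delay]  # cos of the phase step
        sin_xp = q[n] * i[n - delay] - i[n] * q[n - delay]  # sin of the phase step
        cos_signs.append(sign_bit(cos_xp))
        sin_signs.append(sign_bit(sin_xp))
    return cos_signs, sin_signs


def correlate(signs, reference):
    """Accumulate agreements (+1) and disagreements (-1) with the reference."""
    return sum(s * r for s, r in zip(signs, reference))


def best_bin(corr_cos, corr_sin, bin_rotations):
    """Score each bin with an assumed linear combination and pick the maximum."""
    scores = [-math.sin(p) * corr_cos + math.cos(p) * corr_sin
              for p in bin_rotations]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy signal: each reference bit shifts the phase by +/- DEVIATION, and the
# carrier offset adds TRUE_ROTATION per sample (all values hypothetical).
REFERENCE = [1, -1, 1, 1, -1, -1, 1, -1]
DEVIATION = math.pi / 4
TRUE_ROTATION = math.pi / 2

phase = 0.0
i_samples, q_samples = [1.0], [0.0]
for bit in REFERENCE:
    phase += bit * DEVIATION + TRUE_ROTATION
    i_samples.append(math.cos(phase))
    q_samples.append(math.sin(phase))

cos_signs, sin_signs = cross_product_signs(i_samples, q_samples)
corr_cos = correlate(cos_signs, REFERENCE)
corr_sin = correlate(sin_signs, REFERENCE)

BIN_ROTATIONS = [-math.pi / 2, 0.0, math.pi / 2]
best_index, scores = best_bin(corr_cos, corr_sin, BIN_ROTATIONS)
```

In this toy case the offset rotates the bit information from the sine cross-product onto the cosine cross-product, so the bin whose assumed rotation matches the offset collects the full correlation. Only additions and sign comparisons are needed before the final combination, which reflects the complexity saving the abstract mentions.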