-
Publication No.: US11138292B1
Publication Date: 2021-10-05
Application No.: US16414703
Filing Date: 2019-05-16
Applicant: FACEBOOK, INC.
Inventors: Krishnakumar Nair, Abdulkadir Utku Diril, Dheevatsa Mudigere, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao
Abstract: An electronic circuit performs depthwise convolution of an input matrix with a kernel matrix to generate an output matrix. In each of a plurality of rounds of operations, a row of kernel matrix elements is selected for the round of operations, and applied to the input matrix to obtain an intermediate data array corresponding to the selected row of kernel elements. The electronic circuit includes a plurality of subcircuits operable in parallel to generate, in each operation, a set of intermediate data elements in the intermediate data array. Each subcircuit generates a respective intermediate data element that is the sum of a respective row of the input matrix elements weighted by a set of weight elements including the selected row of kernel elements and at least one zero element. The selected row of kernel elements is successively shifted among the set of weight elements in the round of operations.
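Illustrative sketch (not part of the patent record): the round-per-kernel-row scheme in the abstract above can be modeled in software roughly as follows. This is a minimal sketch for a single channel, under assumptions the abstract does not state: the function name `conv_by_shifted_kernel_rows` is hypothetical, the convolution is taken without padding or kernel flipping, and the per-row intermediate arrays are assumed to be combined by summing them with a row offset equal to the kernel row index. The inner loop over input rows stands in for the parallel subcircuits.

```python
def conv_by_shifted_kernel_rows(inp, kernel):
    """Single-channel 2D convolution (no padding, no kernel flip),
    built one kernel row at a time."""
    height, width = len(inp), len(inp[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = height - kh + 1, width - kw + 1
    out = [[0] * out_w for _ in range(out_h)]

    for r, kernel_row in enumerate(kernel):      # one round per kernel row
        # intermediate[i][s]: input row i weighted by kernel_row at offset s
        intermediate = [[0] * out_w for _ in range(height)]
        for s in range(out_w):                   # one operation per shift
            # Weight set: the kernel row surrounded by zero elements.
            weights = [0] * s + list(kernel_row) + [0] * (width - kw - s)
            for i in range(height):              # models the parallel subcircuits
                intermediate[i][s] = sum(x * w for x, w in zip(inp[i], weights))
        # Assumption: intermediate arrays are accumulated with a row offset r.
        for i in range(out_h):
            for j in range(out_w):
                out[i][j] += intermediate[i + r][j]
    return out


example_input = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
example_kernel = [[1, 0], [0, 1]]
print(conv_by_shifted_kernel_rows(example_input, example_kernel))
```

A depthwise convolution would apply the same procedure independently to each channel with that channel's own kernel.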
-
Publication No.: US20210049426A1
Publication Date: 2021-02-18
Application No.: US16543239
Filing Date: 2019-08-16
Applicant: Facebook, Inc.
Abstract: A processor system comprises a memory organizer unit and a matrix computing unit. The memory organizer unit is configured to receive a request for a three-dimensional data of a convolutional neural network layer. The requested three-dimensional data is obtained from a memory. The obtained three-dimensional data is rearranged in an optimized linear order and the rearranged data in the optimized linear order is provided to the matrix computing unit. The matrix computing unit is configured to perform at least a portion of a three-dimensional convolution using at least a portion of the provided rearranged data in the optimized linear order.
-
Publication No.: US20210049229A1
Publication Date: 2021-02-18
Application No.: US16543241
Filing Date: 2019-08-16
Applicant: Facebook, Inc.
Inventors: Krishnakumar Nair, Abdulkadir Utku Diril, Dheevatsa Mudigere, Olivia Wu, Ehsan Khish Ardestani Zadeh, Yuchen Hao
Abstract: A system comprises a matrix processor unit that includes a first type of register, a group of a second type of registers, and a plurality of calculation units. The first type of register is configured to concurrently store values from different rows of a first matrix. At least a portion of the first type of register is logically divided into groups of elements, and each of the groups corresponds to a different row of the first matrix. Each of the second type of registers is configured to concurrently store values from a plurality of different rows of a second matrix. Each of the calculation units corresponds to one of the second type of registers and is configured to at least in part determine a corresponding element in a result matrix of convoluting the second matrix with the first matrix.
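Illustrative sketch (not part of the patent record): one way to picture the register layout in the abstract above is to flatten the first (kernel) matrix row by row into a single weight vector, and to give each calculation unit its own data vector holding the matching window of the second matrix. The sketch below is a software analogy only, with assumptions labeled: the names `flatten_rows` and `convolve_with_registers` are hypothetical, each calculation unit is assumed to compute a dot product, and no padding or kernel flipping is applied.

```python
def flatten_rows(matrix):
    """Concatenate matrix rows into one flat list (one group per row)."""
    return [value for row in matrix for value in row]


def convolve_with_registers(data, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(data) - kh + 1
    out_w = len(data[0]) - kw + 1

    # "First type of register": kernel values in groups of kw, one per kernel row.
    weight_register = flatten_rows(kernel)

    result = []
    for i in range(out_h):
        result_row = []
        for j in range(out_w):
            # "Second type of register": values from kh different data rows.
            data_register = flatten_rows(
                [data[i + r][j:j + kw] for r in range(kh)]
            )
            # Assumed calculation unit: dot product giving one result element.
            result_row.append(
                sum(w * d for w, d in zip(weight_register, data_register))
            )
        result.append(result_row)
    return result


print(convolve_with_registers([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]]))
```

In hardware the calculation units would run concurrently, one per data register; the nested loops here only enumerate them sequentially.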
-
-