-
Publication No.: US20210256363A1
Publication date: 2021-08-19
Application No.: US16793961
Filing date: 2020-02-18
Applicant: Facebook, Inc.
Inventor: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
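The per-channel multiply-and-sum in this abstract can be illustrated with a minimal Python sketch. It assumes each "register" is modeled as a list holding one element per channel, so that register k carries element k of the convolution window for every channel. The function name `channel_convolution` and the sample values are hypothetical and do not come from the patent.

```python
def channel_convolution(data_registers, weight_registers):
    """Return one convolution result per channel.

    data_registers[k][c]   -- element k of the data window for channel c
    weight_registers[k][c] -- element k of channel c's own weight matrix
    (This register layout is an assumption made for illustration.)
    """
    num_channels = len(data_registers[0])
    results = [0] * num_channels
    for data_reg, weight_reg in zip(data_registers, weight_registers):
        for c in range(num_channels):
            # Multiply corresponding elements and accumulate per channel.
            results[c] += data_reg[c] * weight_reg[c]
    return results


# Two channels, a 2x2 window flattened into four registers.
data_registers = [[1, 10], [2, 20], [3, 30], [4, 40]]
weight_registers = [[1, 0], [1, 0], [1, 1], [1, 1]]
print(channel_convolution(data_registers, weight_registers))  # [10, 70]
```

Channels never mix in this sketch: each output element depends only on the data and weights of its own channel, which is what distinguishes a channel (depthwise) convolution from a standard one.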
-
Publication No.: US11010202B2
Publication date: 2021-05-18
Application No.: US16533588
Filing date: 2019-08-06
Applicant: Facebook, Inc.
Inventor: Martin Schatz, Amin Firoozshahian
IPC: G06F9/50
Abstract: A specification of an operation to perform one or more element-wise sums of specified portions of a matrix is received. The specification of the operation is analyzed to select a type of processing load partitioning to be applied. Based on the selected type of processing load partitioning to be applied, processing required to perform the operation is partitioned across a plurality of physical processing elements in parallel. The partitioned processing is distributed to the physical hardware processing elements to perform in parallel the element-wise sums of the specified portions of the matrix.
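The analyze, partition, and distribute flow described above might look roughly like the following Python sketch. Everything specific here is an assumption for illustration: the heuristic in `choose_partitioning`, the two partitioning modes, and the use of threads as stand-ins for physical processing elements are not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def choose_partitioning(num_sums, num_workers):
    """Hypothetical heuristic: give each worker whole sums when there are
    enough of them, otherwise split every sum along its columns."""
    return "by_sum" if num_sums >= num_workers else "by_column"


def column_chunks(width, parts):
    """Split range(width) into at most `parts` contiguous (start, end) spans."""
    step = -(-width // parts)  # ceiling division
    return [(start, min(start + step, width)) for start in range(0, width, step)]


def partial_sum(matrix, rows, col_start, col_end):
    """Element-wise sum of the given rows, restricted to a column span."""
    return [sum(matrix[r][c] for r in rows) for c in range(col_start, col_end)]


def elementwise_sums(matrix, row_groups, num_workers=4):
    """Compute one element-wise sum per group of row indices."""
    width = len(matrix[0])
    if choose_partitioning(len(row_groups), num_workers) == "by_sum":
        spans = [(0, width)]
    else:
        spans = column_chunks(width, num_workers)
    results = [[0] * width for _ in row_groups]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        pending = [
            (g, start, pool.submit(partial_sum, matrix, rows, start, end))
            for g, rows in enumerate(row_groups)
            for start, end in spans
        ]
        for g, start, future in pending:
            part = future.result()
            results[g][start:start + len(part)] = part
    return results


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(elementwise_sums(matrix, [[0, 1], [1, 2]]))  # [[5, 7, 9], [11, 13, 15]]
```

Both partitioning modes produce the same sums; only the way the work is divided among workers differs.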
-
Publication No.: US20190019105A1
Publication date: 2019-01-17
Application No.: US15649492
Filing date: 2017-07-13
Applicant: Facebook, Inc.
Inventor: Martin Schatz, Bradley Ray Green
Abstract: Systems, methods, and non-transitory computer readable media are configured to train a machine learning model. The training can be based on a training set of embeddings of a first type and a training set of embeddings of a second type. The machine learning model can be trained to receive an embedding of a second type and to output a corresponding embedding of the first type. A given embedding of the second type can be provided as input to the machine learning model. An embedding of the first type can be obtained from the machine learning model. The embedding of the first type can correspond to the given embedding of the second type.
-
Publication No.: US20210271451A1
Publication date: 2021-09-02
Application No.: US16805339
Filing date: 2020-02-28
Applicant: Facebook, Inc.
Inventor: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
-
Publication No.: US20210192359A1
Publication date: 2021-06-24
Application No.: US16722636
Filing date: 2019-12-20
Applicant: Facebook, Inc.
Inventor: Ehsan Khish Ardestani Zadeh, Martin Schatz, Krishnakumar Narayanan Nair, Yuchen Hao, Abdulkadir Utku Diril, Rakesh Komuravelli
Abstract: The disclosed computer-implemented method may include (1) receiving, at a hardware accelerator that supports an ANN, an activation data set that is to undergo a convolution operation via a filter kernel of the ANN, (2) receiving, at the hardware accelerator, an argument indicating that the filter kernel exceeds at least one boundary of the activation data set when slid across a certain position during the convolution operation, (3) determining, based at least in part on the argument, that the hardware accelerator is to generate padding data at the boundary of the activation data set in connection with the certain position of the filter kernel, and then (4) performing, at the hardware accelerator, the convolution operation by processing a portion of the activation data set and the padding data when the filter kernel slides across the certain position. Various other systems and methods are also disclosed.
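The idea of generating padding at the boundary during the convolution, rather than storing a pre-padded activation set, can be sketched in Python as follows. This is a software analogy only: the sketch detects out-of-bounds reads itself instead of receiving an argument from a caller, and the zero default for `pad_value`, the square odd-sized kernel, and the function name are assumptions not stated in the abstract.

```python
def convolve_with_generated_padding(activations, kernel, pad_value=0):
    """'Same'-size 2D convolution that synthesizes padding on the fly.

    activations -- 2D list of numbers (unpadded)
    kernel      -- square 2D list with odd side length (assumption)
    pad_value   -- value supplied wherever the kernel overhangs the data
    """
    height, width = len(activations), len(activations[0])
    k = len(kernel)
    half = k // 2

    def read(r, c):
        # Inside the data: return the stored activation.
        if 0 <= r < height and 0 <= c < width:
            return activations[r][c]
        # Outside the data: generate padding instead of reading memory.
        return pad_value

    output = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += read(r + i - half, c + j - half) * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(convolve_with_generated_padding([[1, 2], [3, 4]], ones))  # [[10, 10], [10, 10]]
```

Because the padding is produced inside `read`, the stored activation data never grows, which is the memory-saving property the abstract points at.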
-
Publication No.: US20210042116A1
Publication date: 2021-02-11
Application No.: US16533588
Filing date: 2019-08-06
Applicant: Facebook, Inc.
Inventor: Martin Schatz, Amin Firoozshahian
Abstract: A specification of an operation to perform one or more element-wise sums of specified portions of a matrix is received. The specification of the operation is analyzed to select a type of processing load partitioning to be applied. Based on the selected type of processing load partitioning to be applied, processing required to perform the operation is partitioned across a plurality of physical processing elements in parallel. The partitioned processing is distributed to the physical hardware processing elements to perform in parallel the element-wise sums of the specified portions of the matrix.
-
Publication No.: US20210334072A1
Publication date: 2021-10-28
Application No.: US16855927
Filing date: 2020-04-22
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a plurality of dot product processor units and element-wise multiplication units. The dot product processor units perform a depthwise convolution of a data matrix with a separate depthwise convolution weight matrix for each data matrix channel. Each dot product processor unit performs at least a portion of the depthwise convolution for one or more data matrix channels. The element-wise multiplication units perform multiplication operations of a pointwise convolution. Each element-wise multiplication unit applies to each depthwise convolution partial result element received from one or more of the dot product processor units a corresponding data element from each of a plurality of pointwise convolution weight filters to determine element-wise multiplication unit results. The processor system sums together different groups of data elements from the element-wise multiplication unit results to at least in part calculate different data elements of a result of the pointwise convolution.
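As a rough illustration of the two stages in this abstract, the Python sketch below computes a depthwise result per channel for one output position and then applies pointwise filters through element-wise products followed by grouped sums. The function names, the flattened-window layout, and the sample values are hypothetical, and the hardware's distribution of work across units is not modeled.

```python
def depthwise_at_position(patches, depthwise_weights):
    """One dot product per channel: patches[c] is channel c's flattened
    window, depthwise_weights[c] is that channel's own weight matrix."""
    return [
        sum(d * w for d, w in zip(patch, weights))
        for patch, weights in zip(patches, depthwise_weights)
    ]


def pointwise_from_depthwise(depthwise_results, pointwise_filters):
    """Apply each pointwise (1x1) filter across the channels.

    Step 1: element-wise products of every depthwise result with the
            matching element of every pointwise filter.
    Step 2: sum each filter's group of products into one output element.
    """
    products = [
        [d * w for d, w in zip(depthwise_results, filt)]
        for filt in pointwise_filters
    ]
    return [sum(group) for group in products]


patches = [[1, 2, 3, 4], [10, 20, 30, 40]]        # two channels, 2x2 windows
depthwise_weights = [[1, 1, 1, 1], [0, 0, 1, 1]]  # one weight matrix per channel
pointwise_filters = [[1, 1], [2, -1]]             # two output channels

depthwise = depthwise_at_position(patches, depthwise_weights)
print(depthwise)                                               # [10, 70]
print(pointwise_from_depthwise(depthwise, pointwise_filters))  # [80, -50]
```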
-
Publication No.: US20210319076A1
Publication date: 2021-10-14
Application No.: US16843645
Filing date: 2020-04-08
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element. The processing elements sum together a portion of the channel convolution result data elements from a group of different convolution processor units to determine a groupwise convolution result data element.
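The two-level summation described here (per-channel results first, then sums across the channels of a group) can be sketched in Python under simplifying assumptions: one weight filter per group, equally sized contiguous groups, and a single output position. The name `groupwise_result` and the sample data are hypothetical, and the split of work across processing elements is not modeled.

```python
def groupwise_result(patches, weights, group_size):
    """Return one groupwise convolution result element per group.

    patches[c] -- flattened data window for channel c
    weights[c] -- the slice of the group's filter that applies to channel c
    group_size -- number of consecutive channels in each group (assumed equal)
    """
    # Level 1: per-channel multiply-and-sum.
    channel_results = [
        sum(d * w for d, w in zip(patch, weight))
        for patch, weight in zip(patches, weights)
    ]
    # Level 2: sum the channel results that belong to the same group.
    return [
        sum(channel_results[start:start + group_size])
        for start in range(0, len(channel_results), group_size)
    ]


patches = [[1, 2], [3, 4], [5, 6], [7, 8]]  # four channels
weights = [[1, 1], [1, 1], [1, 1], [1, 1]]
print(groupwise_result(patches, weights, group_size=2))  # [10, 26]
```

With `group_size=1` the sketch reduces to a depthwise convolution, and with a single group spanning all channels it reduces to a standard convolution with one filter.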
-
Publication No.: US20210294875A1
Publication date: 2021-09-23
Application No.: US16826697
Filing date: 2020-03-23
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results. The dot product processor unit is configured to perform pointwise convolution, including applying pointwise weight matrices to the portion of depthwise convolution results to determine a portion of separable convolution results while at least another portion of the depthwise convolution results is being calculated by the processor system.
-
Publication No.: US20190034973A1
Publication date: 2019-01-31
Application No.: US15660786
Filing date: 2017-07-26
Applicant: Facebook, Inc.
Inventor: Jinyi Yao, Martin Schatz, Arash Ashari, Vijay Rangarajan, Liushan Yang, Iris Yui Chang
Abstract: Systems, methods, and non-transitory computer-readable media can identify a target page and an advertising campaign comprising one or more advertisements associated with the target page. One or more users are identified for inclusion in a base audience based on page information associated with the target page. One or more users are identified for inclusion in an expanded audience based on expanded audience criteria. The advertising campaign is presented to a smart audience comprising the base audience and the expanded audience.