-
Publication No.: US20240220779A1
Publication Date: 2024-07-04
Application No.: US18527063
Filing Date: 2023-12-01
Applicant: Meta Platforms Technologies, LLC
Inventor: Vignesh Vivekraja, Tomonari Tohara, Reza Tusi, Abuduwaili Tuoheti, Javid Jaffari, Vlad Fruchter, David Vakrat, Ohad Meitav
IPC: G06N3/0464, G06F7/544, G06F17/15, H03H17/02
CPC classification number: G06N3/0464, G06F7/5443, G06F17/153, H03H17/02
Abstract: In one embodiment, a system comprises a processor and a non-transitory memory coupled to the processor, the memory comprising instructions executable by the processor. The processor, which comprises an internal memory, a Multiply-Accumulate (MAC) array, a first vector register array, a second vector register array, and a third vector register array, is operable when executing the instructions to transfer weights for M filters and an input activation tensor from an external memory to the internal memory, insert padding into the input activation tensor in the internal memory based on first configuration parameters, configure the MAC array to a required shape based on second configuration parameters for convolution operations between the input activation tensor and the M filters, and calculate a row of the output activation tensor by performing the convolution operations on the corresponding R rows of the input activation tensor with the M filters, wherein R is the filter height.
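The row-by-row scheme in this abstract (pad the input, then produce each output row from R input rows and all M filters) can be illustrated with a minimal software sketch. This is not the patented hardware flow: the nested-list tensor layout (H x W x C), stride 1, zero padding, and the function names pad_input, conv_output_row, and conv2d are all assumptions made for illustration.

```python
def pad_input(x, pad_top, pad_bottom, pad_left, pad_right):
    """Zero-pad an H x W x C activation tensor stored as nested lists."""
    w, c = len(x[0]), len(x[0][0])
    new_w = w + pad_left + pad_right

    def zero_row():
        return [[0] * c for _ in range(new_w)]

    out = [zero_row() for _ in range(pad_top)]
    for row in x:
        out.append([[0] * c for _ in range(pad_left)]
                   + [list(px) for px in row]
                   + [[0] * c for _ in range(pad_right)])
    out.extend(zero_row() for _ in range(pad_bottom))
    return out


def conv_output_row(rows, filters):
    """Compute one output row from R input rows and M filters.

    rows:    R x W x C slice of the (padded) input
    filters: M x R x S x C weights
    returns: W_out x M values, with W_out = W - S + 1 (stride 1 assumed)
    """
    r, w, c = len(rows), len(rows[0]), len(rows[0][0])
    s = len(filters[0][0])
    out = []
    for x0 in range(w - s + 1):
        pixel = []
        for f in filters:
            acc = 0
            for dy in range(r):
                for dx in range(s):
                    for ch in range(c):
                        acc += rows[dy][x0 + dx][ch] * f[dy][dx][ch]
            pixel.append(acc)
        out.append(pixel)
    return out


def conv2d(x, filters, pad):
    """Full convolution built one output row at a time."""
    padded = pad_input(x, pad, pad, pad, pad)
    r = len(filters[0])  # filter height R
    return [conv_output_row(padded[y:y + r], filters)
            for y in range(len(padded) - r + 1)]
```

Each call to conv_output_row touches only R consecutive input rows, which is the property that lets a row of the output be produced without holding the whole input in the compute array at once.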
-
Publication No.: US20240220273A1
Publication Date: 2024-07-04
Application No.: US18527004
Filing Date: 2023-12-01
Applicant: Meta Platforms Technologies, LLC
Inventor: Vignesh Vivekraja, Tomonari Tohara, Reza Tusi, Abuduwaili Tuoheti, Javid Jaffari, Vlad Fruchter, David Vakrat, Ohad Meitav
CPC classification number: G06F9/3893, G06F9/3001, G06F9/3012
Abstract: In one embodiment, a system comprises a processor and a non-transitory memory coupled to the processor, the memory comprising instructions executable by the processor. The processor, which comprises an internal memory, a Multiply-Accumulate (MAC) array, a first vector register array, a second vector register array, and a third vector register array, is operable when executing a first instruction among the instructions to feed a weight vector array from the second vector register array to the MAC array, broadcast an input activation vector to the MAC array, multiply, at each MAC unit in the MAC array, an input activation value broadcast to that MAC unit from the input activation vector by a weight value fed to that MAC unit from the weight vector array, and store a partial output activation vector to the third vector register array, wherein the partial output activation vector is the output of the MAC array.
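The broadcast-and-multiply step described here can be sketched in software as follows. The abstract does not state how the MAC units are wired, so this sketch assumes a rows x cols grid in which activation i is broadcast to every unit in row i and the products are summed down each column; the function name mac_array_step and the optional partial argument are hypothetical.

```python
def mac_array_step(weights, activations, partial=None):
    """Model one MAC-array instruction.

    weights:     rows x cols weight vector array (one weight per MAC unit)
    activations: input activation vector of length rows
    partial:     optional earlier partial output of length cols
    returns:     partial output activation vector of length cols
    """
    rows, cols = len(weights), len(weights[0])
    if len(activations) != rows:
        raise ValueError("one activation value is needed per MAC row")
    out = list(partial) if partial is not None else [0] * cols
    for i in range(rows):
        a = activations[i]  # assumed: broadcast to every unit in row i
        for j in range(cols):
            out[j] += a * weights[i][j]  # multiply-accumulate at unit (i, j)
    return out
```

For example, mac_array_step([[1, 2], [3, 4]], [10, 1]) yields [13, 24]; passing that result back in as partial models accumulating across successive instructions.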
-
Publication No.: US20240220259A1
Publication Date: 2024-07-04
Application No.: US18525083
Filing Date: 2023-11-30
Applicant: Meta Platforms Technologies, LLC
Inventor: Tomonari Tohara, Vignesh Vivekraja, Alagappan Valliappan, Andrey Bushev, Javid Jaffari
IPC: G06F9/30
CPC classification number: G06F9/30178, G06F9/30038, G06F9/30134
Abstract: In one embodiment, a computing system may set data to a first group of registers. The first group of registers may be configured to be accessed during a single operation cycle. The system may set a number of patterns to a second group of registers. Each pattern of the number of patterns may include an array of indices for the data stored in the first group of registers. The system may select, for a first vector register associated with a vector engine, a first pattern from the patterns stored in the second group of registers. The system may load a first portion of the data from the first group of registers to the first vector register based on the first pattern selected for the first vector register from the patterns stored in the second group of registers.
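The pattern-driven load can be pictured as a gather: a selected pattern is a list of indices into the data registers, and the vector register receives the values at those indices. The sketch below is a software analogy only; the class name PatternLoader and its methods are hypothetical and do not come from the patent.

```python
class PatternLoader:
    """Toy model of data registers plus pattern registers."""

    def __init__(self, data, patterns):
        self.data = list(data)                       # first group of registers
        self.patterns = [list(p) for p in patterns]  # second group of registers

    def load(self, pattern_id):
        """Fill a vector register using the selected pattern's indices."""
        return [self.data[i] for i in self.patterns[pattern_id]]


loader = PatternLoader(data=[10, 20, 30, 40],
                       patterns=[[0, 2], [3, 3, 1]])
even_lanes = loader.load(0)   # gathers data[0] and data[2]
shuffled = loader.load(1)     # an index may repeat within a pattern
```

Storing the patterns separately from the data means the same data registers can feed different vector registers with different layouts by switching only the pattern selection.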
-
Publication No.: US20240220281A1
Publication Date: 2024-07-04
Application No.: US18525443
Filing Date: 2023-11-30
Applicant: Meta Platforms Technologies, LLC
Inventor: Vignesh Vivekraja, Tomonari Tohara, Reza Tusi, Abuduwaili Tuoheti, Weiping Liu, Javid Jaffari
IPC: G06F9/445, G06N3/0464
CPC classification number: G06F9/44505, G06N3/0464
Abstract: In one embodiment, a method includes accessing a computational graph representing computations to be executed on a computing system comprising a plurality of Execution Units (EUs), identifying a set of candidate mapped-graphs for the computational graph, where each node in a candidate mapped-graph is mapped to an EU capable of calculating the node, ensuring that each edge from a first node to a second node in each candidate mapped-graph satisfies memory constraints, determining an expected cost for executing each candidate mapped-graph using mapped-EUs in the candidate mapped-graph for calculating respective nodes, and selecting a candidate mapped-graph with a least expected cost from the set of candidate mapped-graphs.
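The selection procedure in this abstract (enumerate candidate mappings, discard those that violate memory constraints, cost the rest, keep the cheapest) can be sketched as a brute-force search. The cost model and the memory rule below are assumptions chosen for illustration: each EU lists a per-operation cost and a memory capacity, an edge is valid only if its tensor fits in the destination EU's memory, and crossing between EUs adds a per-byte transfer cost. The function name select_mapped_graph and the dictionary keys are hypothetical.

```python
from itertools import product


def select_mapped_graph(nodes, edges, eus, transfer_cost_per_byte=1):
    """Return (cost, mapping) for the cheapest valid mapping, or None.

    nodes: {node_name: op_name}
    edges: [(src_node, dst_node, tensor_bytes)]
    eus:   {eu_name: {"cost": {op_name: cost}, "mem_bytes": capacity}}
    """
    names = list(nodes)
    # For each node, the EUs capable of calculating it.
    capable = [[eu for eu, spec in eus.items() if nodes[n] in spec["cost"]]
               for n in names]
    best = None
    for choice in product(*capable):
        mapping = dict(zip(names, choice))
        # Assumed memory constraint: tensor must fit at the destination EU.
        if any(size > eus[mapping[dst]]["mem_bytes"]
               for _, dst, size in edges):
            continue
        cost = sum(eus[mapping[n]]["cost"][nodes[n]] for n in names)
        cost += sum(size * transfer_cost_per_byte
                    for src, dst, size in edges
                    if mapping[src] != mapping[dst])
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best
```

Exhaustive enumeration grows exponentially with the number of nodes, so it is only practical for small graphs; it is used here because it makes the filter-then-minimize structure explicit.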
-
Publication No.: US20240220256A1
Publication Date: 2024-07-04
Application No.: US18525217
Filing Date: 2023-11-30
Applicant: Meta Platforms Technologies, LLC
Inventor: Reza Tusi, Tomonari Tohara, Vignesh Vivekraja, Javid Jaffari
CPC classification number: G06F9/3013, G06F9/3887
Abstract: In one embodiment, a computing system may load data from a memory unit into a number of registers according to a first order by which the data is arranged. The registers may be configured to be accessed during a single operation cycle. The system may determine a second order for the data based on one or more subsequent operations to process the data. The system may read the data from the registers according to the second order during one or more operation cycles. The data read from the registers may be arranged in the second order. The system may transmit the data arranged in the second order to an execution unit configured to execute the one or more subsequent operations to process the data arranged in the second order.
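The load-then-reorder idea can be illustrated with a small sketch: data enters the registers in memory order and is read out in whatever order the next operation prefers. Here the second order is a transpose of a row-major matrix, which is only one possible example; the function names and the lanes parameter (values read per cycle) are assumptions for illustration.

```python
def load_registers(memory):
    """Load data into registers in the order it is arranged in memory."""
    return list(memory)


def transpose_order(rows, cols):
    """Index order that reads a row-major rows x cols matrix by columns."""
    return [r * cols + c for c in range(cols) for r in range(rows)]


def read_in_order(registers, order, lanes):
    """Read registers following `order`, `lanes` values per cycle."""
    return [[registers[i] for i in order[start:start + lanes]]
            for start in range(0, len(order), lanes)]


regs = load_registers([1, 2, 3, 4, 5, 6])   # 2 x 3 matrix, row-major
cycles = read_in_order(regs, transpose_order(2, 3), lanes=2)
```

In this example each inner list in cycles is one column of the original matrix, so a consumer that works column by column receives its operands already arranged.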
-