Patent search ap:("Meta Platforms Technologies Page LLC") AND inv:"Pierce I-Jen Chuang"

1.

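Granted invention patent
Systems and methods for reading and writing sparse data in a neural network accelerator (In force)

Publication (announcement) number: US11954025B2

Publication (announcement) date: 2024-04-09

Application number: US18126228

Application date: 2023-03-24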

Applicant: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

IPC: G06F12/04, G06N3/08

CPC classification number: G06F12/04, G06F2212/1028, G06N3/08

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A mask identifying byte positions within a data word having non-zero values in memory can be accessed. Each bit of the mask can have a first value or a second value, the first value indicating that a byte of the data word corresponds to a non-zero byte value, the second value indicating that the byte of the data word corresponds to a zero byte value. The data word can be modified to have non-zero byte values stored at an end of a first side of the data word in the memory, and any zero byte values stored in a remainder of the data word. The modified data word can be written to the memory via at least a first slice of a plurality of slices that is configured to access the first side of the data word in the memory.
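
The mask-and-pack step described in this abstract can be illustrated with a minimal sketch. It is not the patented circuit: the function names, the choice of bit i for byte i, and the use of the start of the word as the "first side" are all assumptions made for illustration.

```python
def pack_nonzero_bytes(word: bytes) -> tuple[int, bytes]:
    """Return (mask, packed_word) for a fixed-width data word.

    Assumption: bit i of the mask is 1 when byte i of the input is non-zero.
    The packed word keeps the non-zero bytes, in order, at the start of the
    word (standing in for the "first side") and zero-fills the remainder.
    """
    mask = 0
    nonzero = bytearray()
    for i, b in enumerate(word):
        if b != 0:
            mask |= 1 << i
            nonzero.append(b)
    packed = bytes(nonzero) + bytes(len(word) - len(nonzero))
    return mask, packed


def unpack_nonzero_bytes(mask: int, packed: bytes) -> bytes:
    """Restore the original byte positions from the mask (inverse of packing)."""
    out = bytearray(len(packed))
    src = 0
    for i in range(len(packed)):
        if (mask >> i) & 1:
            out[i] = packed[src]
            src += 1
    return bytes(out)
```

With the non-zero bytes grouped at one end, a reader or writer that only touches that end of the word can skip the zero-filled remainder, which is the motivation the abstract gives for the per-slice access.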

2.

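Granted invention patent
Systems and methods for speech or text processing using matrix operations (In force)

Publication (announcement) number: US11899745B1

Publication (announcement) date: 2024-02-13

Application number: US16997401

Application date: 2020-08-19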

Applicant: Meta Platforms Technologies, LLC

Inventor: Alagappan Valliappan, Ganesh Venkatesh, Pierce I-Jen Chuang

IPC: G06F17/16, G06F7/544

CPC classification number: G06F17/16, G06F7/5443

Abstract: Disclosed herein includes a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables each residing in a respective memory. The circuitry can access weight information from a load store memory, and the partitioned input on a per column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to a total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.

3.

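Granted invention patent
Systems and methods for distributing a neural network across multiple computing devices (In force)

Publication (announcement) number: US11698529B2

Publication (announcement) date: 2023-07-11

Application number: US16506479

Application date: 2019-07-09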

Applicant: Meta Platforms Technologies, LLC

Inventor: Liangzhen Lai, Pierce I-Jen Chuang, Vikas Chandra, Ganesh Venkatesh

IPC: G02B27/01, H04N13/106, G06N3/04, G06N3/045

CPC classification number: G02B27/017, G06N3/04, G06N3/045, H04N13/106, G02B2027/014, G02B2027/0138

Abstract: Disclosed herein is a method for using a neural network across multiple devices. The method can include receiving, by a first device configured with a first one or more layers of a neural network, input data for processing via the neural network implemented across the first device and a second device. The method can include outputting, by the first one or more layers of the neural network implemented on the first device, a data set that is reduced in size relative to the input data while identifying one or more features of the input data for processing by a second one or more layers of the neural network. The method can include communicating, by the first device, the data set to the second device for processing via the second one or more layers of the neural network implemented on the second device.
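
As a rough illustration of the split described in this abstract, the sketch below runs a size-reducing first stage on one "device" and a second stage on another. Everything here is hypothetical: the function names, the use of average pooling as the first layers, and the weighted sum as the second layers are stand-ins, not the layers claimed in the patent.

```python
def device_one_layers(samples: list[float], pool: int = 4) -> list[float]:
    """Hypothetical first layers: average-pool the input into fewer features."""
    features = []
    for i in range(0, len(samples), pool):
        chunk = samples[i:i + pool]
        features.append(sum(chunk) / len(chunk))
    return features


def device_two_layers(features: list[float], weights: list[float]) -> float:
    """Hypothetical second layers: a single weighted sum over the features."""
    return sum(f * w for f, w in zip(features, weights))


def run_split_network(samples: list[float], weights: list[float]) -> float:
    # Only the reduced feature list crosses the device boundary.
    features = device_one_layers(samples)
    return device_two_layers(features, weights)
```

The point of the arrangement is that the data sent between devices (`features`) is smaller than the raw input (`samples`), so the link between the two devices carries less traffic.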

4.

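Granted invention patent
System and method for performing small channel count convolutions in energy-efficient input operand stationary accelerator (In force)

Publication (announcement) number: US11675998B2

Publication (announcement) date: 2023-06-13

Application number: US16511544

Application date: 2019-07-15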

Applicant: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

IPC: G06N3/04, G06N3/063

CPC classification number: G06N3/04, G06N3/063

Abstract: Disclosed herein includes a system, a method, and a device for receiving input data to generate a plurality of outputs for a layer of a neural network. The plurality of outputs are arranged in a first array. Dimensions of the first array may be compared with dimensions of a processing unit (PE) array including a plurality of PEs. According to a result of the comparing, the first array is partitioned into subarrays by the processor. Each of the subarrays has dimensions less than or equal to the dimensions of the PE array. A first group of PEs in the PE array is assigned to a first one of the subarrays. A corresponding output of the plurality of outputs is generated using a portion of the input data by each PE of the first group of PEs assigned to the first one of the subarrays.
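
The partitioning step in this abstract, splitting an output array into subarrays no larger than the PE array, can be sketched as simple tiling. The function name and the row-major tile order are assumptions; the abstract does not say how the patent orders or assigns the subarrays.

```python
def partition_outputs(out_rows: int, out_cols: int,
                      pe_rows: int, pe_cols: int) -> list[tuple[int, int, int, int]]:
    """Split an out_rows x out_cols output array into tiles that fit the PE array.

    Each tile is (row_start, col_start, rows, cols) with rows <= pe_rows
    and cols <= pe_cols. Tiles are listed in row-major order (an assumption).
    """
    tiles = []
    for r in range(0, out_rows, pe_rows):
        for c in range(0, out_cols, pe_cols):
            rows = min(pe_rows, out_rows - r)
            cols = min(pe_cols, out_cols - c)
            tiles.append((r, c, rows, cols))
    return tiles
```

For a 5 x 7 output array and a 4 x 4 PE array this yields four tiles, and a group of PEs matching each tile's dimensions could then be assigned to compute the outputs in that tile.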

5.

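Published invention application
SYSTEMS AND METHODS FOR SPEECH OR TEXT PROCESSING USING MATRIX OPERATIONS (Under examination, published)

Publication (announcement) number: US20240152575A1

Publication (announcement) date: 2024-05-09

Application number: US18414901

Application date: 2024-01-17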

Applicant: Meta Platforms Technologies, LLC

Inventor: Alagappan Valliappan, Pierce I-Jen Chuang, Ganesh Venkatesh

IPC: G06F17/16

CPC classification number: G06F17/16, G06F7/5443

Abstract: Disclosed herein includes a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables each residing in a respective memory. The circuitry can access weight information from a load store memory, and the partitioned input on a per column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to a total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.

6.

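Published invention application
SYSTEMS AND METHODS FOR READING AND WRITING SPARSE DATA IN A NEURAL NETWORK ACCELERATOR (Under examination, published)

Publication (announcement) number: US20230229591A1

Publication (announcement) date: 2023-07-20

Application number: US18126228

Application date: 2023-03-24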

Applicant: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

IPC: G06F12/04

CPC classification number: G06F12/04, G06F2212/1028, G06N3/08

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A mask identifying byte positions within a data word having non-zero values in memory can be accessed. Each bit of the mask can have a first value or a second value, the first value indicating that a byte of the data word corresponds to a non-zero byte value, the second value indicating that the byte of the data word corresponds to a zero byte value. The data word can be modified to have non-zero byte values stored at an end of a first side of the data word in the memory, and any zero byte values stored in a remainder of the data word. The modified data word can be written to the memory via at least a first slice of a plurality of slices that is configured to access the first side of the data word in the memory.

7.

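Granted invention patent
Systems and methods for reading and writing sparse data in a neural network accelerator (In force)

Publication (announcement) number: US11630770B2

Publication (announcement) date: 2023-04-18

Application number: US16509138

Application date: 2019-07-11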

Applicant: Meta Platforms Technologies, LLC

Inventor: Ganesh Venkatesh, Liangzhen Lai, Pierce I-Jen Chuang, Meng Li

IPC: G06F12/04, G06N3/08

Abstract: Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A plurality of slices can be established to access a memory having an access size of a data word. A first slice can be configured to access a first side of the data word in memory. Circuitry can access a mask identifying byte positions within the data word having non-zero values. The circuitry can modify the data word to have non-zero byte values stored starting at an end of the first side, and any zero byte values stored in a remainder of the data word. A determination can be made whether a number of non-zero byte values is less than or equal to a first access size of the first slice. The circuitry can write the modified data word to the memory via at least the first slice.
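
The determination step in this abstract, comparing the number of non-zero bytes against the first slice's access size, can be sketched as follows. The function name, the representation of slices as a list of byte widths, and the rule of enabling further slices in order are assumptions for illustration only.

```python
def slices_needed(mask: int, slice_sizes: list[int]) -> int:
    """Return how many leading slices must be written for a packed data word.

    mask: one bit per byte of the word, 1 meaning a non-zero byte.
    slice_sizes: hypothetical access size in bytes of each slice, first slice first.
    At least the first slice is always used, matching the abstract's wording.
    """
    nonzero = bin(mask).count("1")
    covered = 0
    used = 0
    for size in slice_sizes:
        used += 1
        covered += size
        if nonzero <= covered:
            break
    return used
```

Under these assumptions, a word with three non-zero bytes and two 4-byte slices needs only the first slice, while a word with six non-zero bytes needs both.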

8.

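Granted invention patent
Efficient multiply-accumulation based on sparse matrix (In force)

Publication (announcement) number: US11429394B2

Publication (announcement) date: 2022-08-30

Application number: US16997460

Application date: 2020-08-19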

Applicant: Meta Platforms Technologies, LLC

Inventor: Alagappan Valliappan, Ganesh Venkatesh, Pierce I-Jen Chuang

IPC: G06F9/30, G06F17/16, G06F7/544, G06F9/38, G10L15/22

Abstract: Disclosed herein includes improving computational efficiency of multiply-accumulate (MAC) operation. In one aspect, a computing device identifies, a first vector including non-zero elements of a base matrix, and a second vector indicating a location of each of the non-zero elements of the base matrix. In one aspect, the device determines a first element and a second element of the first vector. In one aspect, the device determines a third element and a fourth element of the second vector. In one aspect, the device determines i) a fifth element of an input vector according to the third element of the second vector, and ii) a sixth element of the input vector according to the fourth element of the second vector. In one aspect, the device causes a MAC circuitry to perform a dot product according to the first element, the second element, the fifth element, and the sixth element.
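
The gather-then-multiply flow in this abstract can be sketched for a single sparse row. This is a simplified software analogy, not the claimed MAC circuitry: the function name, the restriction to one row, and the pairwise loop are assumptions.

```python
def sparse_row_dot(values: list[float], indices: list[int], x: list[float]) -> float:
    """Dot product of a sparse row with a dense input vector.

    values:  non-zero elements of the row (the "first vector").
    indices: column position of each non-zero element (the "second vector").
    x:       dense input vector.
    Elements are consumed two at a time, mirroring the pair of elements
    handled per step in the abstract.
    """
    acc = 0.0
    for k in range(0, len(values), 2):
        pair_values = values[k:k + 2]
        pair_indices = indices[k:k + 2]
        # Gather only the input elements that line up with non-zero weights.
        gathered = [x[j] for j in pair_indices]
        acc += sum(v * g for v, g in zip(pair_values, gathered))
    return acc
```

Because only the input elements selected by `indices` are read, multiplications by zero are never issued, which is where the efficiency gain described in the abstract comes from.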
