Patent search ap:("Arm Limited") AND inv:"Zhi-Gang Liu" Page 1

1.

Granted invention patent
Multi-dimensional data path architecture (Status: Active)

Publication number: US11693796B2

Publication date: 2023-07-04

Application number: US17334960

Application date: 2021-05-31

Applicant: Arm Limited

Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Supreet Jeloka, Saurabh Pijuskumar Sinha, Matthew Mattina

IPC: G06F13/16, G06F13/40, G06N3/063, G06F7/544, G06F15/80

CPC classification number: G06F13/1668, G06F13/4004, G06F7/5443, G06F15/8046, G06N3/063

Abstract: Various implementations described herein are directed to a device having a multi-layered logic structure with a first logic layer and a second logic layer arranged vertically in a stacked configuration. The device may have a memory array that provides data, and also, the device may have an inter-layer data bus that vertically couples the memory array to the multi-layered logic structure. The inter-layer data bus may provide multiple data paths to the first logic layer and the second logic layer for reuse of the data provided by the memory array.

2.

Invention patent application
Nibble Block Format (Status: Active)

Publication number: US20230076138A1

Publication date: 2023-03-09

Application number: US17470470

Application date: 2021-09-09

Applicant: Arm Limited

Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina

IPC: G06F17/16, G06F7/544, G06F7/523, G06F7/50, G06F9/50

Abstract: A matrix multiplication system and method are provided. The system includes a memory that stores one or more weight tensors, a processor and a matrix multiply accelerator (MMA). The processor converts each weight tensor into an encoded block set that is stored in the memory. Each encoded block set includes a number of encoded blocks, and each encoded block includes a data field and an index field. The MMA converts each encoded block set into a reconstructed weight tensor, and convolves each reconstructed weight tensor and an input data tensor to generate an output data matrix.
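
The data-field/index-field split described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual field layout, so everything here is an assumption made for illustration: the block size of 8, the use of a bitmask as the index field, and the names `encode_block` and `decode_block` are all hypothetical.

```python
from typing import List, Tuple

BLOCK_SIZE = 8  # hypothetical block length; the abstract does not state one


def encode_block(weights: List[int]) -> Tuple[List[int], int]:
    """Split one block into a data field (nonzero values) and an index field.

    Assumption: the index field is a bitmask with bit `pos` set when
    weights[pos] is nonzero. The real format may differ.
    """
    if len(weights) != BLOCK_SIZE:
        raise ValueError("block must contain exactly BLOCK_SIZE weights")
    data: List[int] = []
    index = 0
    for pos, w in enumerate(weights):
        if w != 0:
            index |= 1 << pos
            data.append(w)
    return data, index


def decode_block(data: List[int], index: int) -> List[int]:
    """Rebuild the dense block from its data and index fields."""
    values = iter(data)
    return [next(values) if index & (1 << pos) else 0 for pos in range(BLOCK_SIZE)]
```

Under these assumptions, decoding an encoded block returns the original weights, which corresponds to the reconstruction step the abstract assigns to the MMA.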

3.

Invention patent application
Artificial Neural Network Optical Hardware Accelerator (Status: Active)

Publication number: US20210287078A1

Publication date: 2021-09-16

Application number: US16818302

Application date: 2020-03-13

Applicant: Arm Limited

Inventor: Zhi-Gang Liu, Matthew Mattina, John Fremont Brown, III

IPC: G06N3/067, G06N3/04, G06N3/08

Abstract: The present disclosure advantageously provides an Optical Hardware Accelerator (OHA) for an Artificial Neural Network (ANN) that includes a communication bus interface, a memory, a controller, and an optical computing engine (OCE). The OCE is configured to execute an ANN model with ANN weights. Each ANN weight includes a quantized phase shift value θi and a phase shift value ϕi. The OCE includes a digital-to-optical (D/O) converter configured to generate input optical signals based on the input data, an optical neural network (ONN) configured to generate output optical signals based on the input optical signals, and an optical-to-digital (O/D) converter configured to generate the output data based on the output optical signals. The ONN includes a plurality of optical units (OUs), and each OU includes an optical multiply and accumulate (OMAC) module.

4.

Published invention application
Bit Sparse Neural Network Optimization (Status: Under examination, published)

Publication number: US20240013052A1

Publication date: 2024-01-11

Application number: US17861824

Application date: 2022-07-11

Applicant: Arm Limited

Inventor: Zhi-Gang Liu, Paul Nicholas Whatmough, John Fremont Brown, III

IPC: G06N3/08

CPC classification number: G06N3/082

Abstract: A method, system and apparatus provide bit-sparse neural network optimization. Rather than quantizing and pruning weight and activation elements at the word level, weight and activation elements are pruned at the bit level, which reduces the density of effective “set” bits in weight and activation data, which, advantageously, reduces the power consumption of the neural network inference process by reducing the degree of bit-level switching during inference.
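
To make the idea of bit-level pruning concrete, here is a minimal Python sketch that caps the number of set bits in an integer weight. The abstract does not say how bits are chosen for pruning, so the rule used here (keep the most significant set bits of the magnitude) and the name `prune_bits` are assumptions for illustration only.

```python
def prune_bits(value: int, max_set_bits: int) -> int:
    """Keep at most `max_set_bits` set bits of |value|, preserving the sign.

    Assumption: the most significant set bits are retained. This is a
    hypothetical selection rule, not one stated in the abstract.
    """
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    kept = 0
    remaining = max_set_bits
    # Walk from the most significant bit down to bit 0.
    for bit in range(magnitude.bit_length() - 1, -1, -1):
        if remaining == 0:
            break
        if magnitude & (1 << bit):
            kept |= 1 << bit
            remaining -= 1
    return sign * kept
```

For example, with a budget of two set bits the 7-bit pattern `1011011` becomes `1010000`: the value changes only modestly while the number of set bits drops from five to two.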

5.

Granted invention patent
Artificial neural network optical hardware accelerator (Status: Active)

Publication number: US11526743B2

Publication date: 2022-12-13

Application number: US16818302

Application date: 2020-03-13

Applicant: Arm Limited

Inventor: Zhi-Gang Liu, Matthew Mattina, John Fremont Brown, III

IPC: G06N3/067, G06N3/04, G06N3/08

Abstract: The present disclosure advantageously provides an Optical Hardware Accelerator (OHA) for an Artificial Neural Network (ANN) that includes a communication bus interface, a memory, a controller, and an optical computing engine (OCE). The OCE is configured to execute an ANN model with ANN weights. Each ANN weight includes a quantized phase shift value θi and a phase shift value ϕi. The OCE includes a digital-to-optical (D/O) converter configured to generate input optical signals based on the input data, an optical neural network (ONN) configured to generate output optical signals based on the input optical signals, and an optical-to-digital (O/D) converter configured to generate the output data based on the output optical signals. The ONN includes a plurality of optical units (OUs), and each OU includes an optical multiply and accumulate (OMAC) module.

6.

Granted invention patent
Memory for an artificial neural network accelerator (Status: Active)

Publication number: US11526305B2

Publication date: 2022-12-13

Application number: US17103629

Application date: 2020-11-24

Applicant: Arm Limited

Inventor: Mudit Bhargava, Paul Nicholas Whatmough, Supreet Jeloka, Zhi-Gang Liu

IPC: G06F3/06, G06N3/063

Abstract: A memory for an artificial neural network (ANN) accelerator is provided. The memory includes a first bank, a second bank and a bank selector. Each bank includes at least two word lines and a plurality of read word selectors. Each word line stores a plurality of words, and each word has a plurality of bytes. Each read word selector has a plurality of input ports and an output port, is coupled to a corresponding word in each word line, and is configured to select a byte of the corresponding word of a selected word line based on a byte select signal. The bank selector is coupled to the read word selectors of the first bank and the second bank, and configured to select a combination of read word selectors from at least one of the first bank and the second bank based on a bank select signal.

7.

Granted invention patent
Pipelined accumulator (Status: Active)

Publication number: US11501151B2

Publication date: 2022-11-15

Application number: US16885704

Application date: 2020-05-28

Applicant: Arm Limited

Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina

IPC: G06N7/00, G06N3/063

Abstract: The present disclosure advantageously provides a pipelined accumulator that includes a data selector configured to receive a sequence of operands to be summed, an input register coupled to the data selector, an output register, coupled to the data selector, configured to store a sequence of partial sums and output a final sum, and a multi-stage add module coupled to the input register and the output register. The multi-stage add module is configured to store a sequence of partial sums and a final sum in a redundant format, and perform back-to-back accumulation into the output register.
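
Accumulating in a redundant format can be sketched in Python as follows. The abstract does not name the redundant format, so this sketch assumes carry-save form (a sum word plus a carry word), which is one common choice; the names `carry_save_add` and `accumulate` are hypothetical, and the sketch is limited to non-negative integers.

```python
from typing import Iterable, Tuple


def carry_save_add(s: int, c: int, x: int) -> Tuple[int, int]:
    """Fold operand x into a (sum, carry) pair without carry propagation.

    Invariant: new_s + new_c == s + c + x.
    """
    new_s = s ^ c ^ x                             # per-bit sum
    new_c = ((s & c) | (s & x) | (c & x)) << 1    # per-bit majority, shifted
    return new_s, new_c


def accumulate(operands: Iterable[int]) -> int:
    """Sum non-negative operands, resolving carries only once at the end."""
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_add(s, c, x)            # partial sum stays redundant
    return s + c                                  # single carry-propagating add
```

The point of the design is that each accumulation step uses only bitwise logic, so a new operand can be accepted every cycle; the slower carry-propagating addition is deferred until the final sum is needed.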

8.

Granted invention patent
Matrix multiplication system, apparatus and method (Status: Active)

Publication number: US11194549B2

Publication date: 2021-12-07

Application number: US16663887

Application date: 2019-10-25

Applicant: Arm Limited

Inventor: Zhi-Gang Liu, Paul Nicholas Whatmough

IPC: G06F7/544, G06F17/16, G06F7/527

Abstract: The present disclosure advantageously provides a system, matrix multiply accelerator (MMA) and method for efficiently multiplying matrices. The MMA includes a vector register to store the row vectors of one input matrix, a vector register to store the column vectors of another input matrix, a vector register to store an output matrix, and an array of vector multiply and accumulate (VMAC) units coupled to the vector registers. Each VMAC unit is coupled to at least two row vector signal lines and at least two column vector signal lines, and is configured to calculate the dot product for one element i,j of the output matrix by multiplying each row vector formed from the ith row of the first matrix with a corresponding column vector formed from the jth column of the second matrix to generate intermediate products, and accumulate the intermediate products into a scalar value.
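
The per-element computation described here (split row i and column j into sub-vectors, multiply them, accumulate into a scalar) can be modeled in software with a short Python sketch. The lane width, the function names `vmac_dot` and `matmul_vmac`, and the sequential loop are assumptions for illustration; the hardware described would operate on the sub-vectors in parallel.

```python
from typing import List, Sequence


def vmac_dot(row: Sequence[int], col: Sequence[int], lanes: int = 4) -> int:
    """Dot product computed one `lanes`-wide sub-vector at a time.

    `lanes` is a hypothetical vector width, not a value from the abstract.
    """
    acc = 0
    for start in range(0, len(row), lanes):
        r = row[start:start + lanes]
        c = col[start:start + lanes]
        acc += sum(x * y for x, y in zip(r, c))   # intermediate products
    return acc


def matmul_vmac(a: List[List[int]], b: List[List[int]], lanes: int = 4) -> List[List[int]]:
    """Output element (i, j) is the dot product of row i of a and column j of b."""
    cols = list(zip(*b))
    return [[vmac_dot(row, col, lanes) for col in cols] for row in a]
```

Each call to `vmac_dot` plays the role of one VMAC unit: it owns a single output element and accumulates that element's intermediate products into one scalar.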

9.

Invention patent application
Matrix Multiplication System and Method (Status: Active)

Publication number: US20210097130A1

Publication date: 2021-04-01

Application number: US16585265

Application date: 2019-09-27

Applicant: Arm Limited

Inventor: Zhi-Gang Liu, Matthew Mattina, Paul Nicholas Whatmough

IPC: G06F17/16, H03M7/30

Abstract: The present disclosure advantageously provides a system and method for efficiently multiplying matrices with elements that have a value of 0. A bitmap is generated for each matrix. Each bitmap includes a bit position for each matrix element. The value of each bit is set to 0 when the value of the corresponding matrix element is 0, and to 1 when the value of the corresponding matrix element is not 0. Each matrix is compressed into a compressed matrix, which will have fewer elements with a value of 0 than the original matrix. Each bitmap is then adjusted based on the corresponding compressed matrix. The compressed matrices are then multiplied to generate an output matrix. For each element i,j in the output matrix, a dot product of the ith row of the first compressed matrix and the jth column of the second compressed matrix is calculated based on the bitmaps.
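
A bitmap-guided sparse dot product of this kind can be sketched in Python. This is a simplification under stated assumptions: each row and column is compressed independently and every zero is dropped, so the bitmap-adjustment step mentioned in the abstract is not modeled. The names `compress`, `sparse_dot`, and `sparse_matmul` are hypothetical.

```python
from typing import List, Sequence, Tuple

Compressed = Tuple[List[int], List[int]]  # (bitmap, nonzero values)


def compress(vec: Sequence[int]) -> Compressed:
    """Bitmap has 1 where the element is nonzero; values hold only nonzeros."""
    bitmap = [1 if v != 0 else 0 for v in vec]
    values = [v for v in vec if v != 0]
    return bitmap, values


def sparse_dot(a: Compressed, b: Compressed) -> int:
    """Multiply only where both bitmaps have a 1 at the same position."""
    bitmap_a, vals_a = a
    bitmap_b, vals_b = b
    ia = ib = 0          # cursors into the compressed value lists
    total = 0
    for bit_a, bit_b in zip(bitmap_a, bitmap_b):
        if bit_a and bit_b:
            total += vals_a[ia] * vals_b[ib]
        ia += bit_a      # advance past a stored value when the bit is set
        ib += bit_b
    return total


def sparse_matmul(a: List[List[int]], b: List[List[int]]) -> List[List[int]]:
    """Element (i, j) is the sparse dot of row i of a and column j of b."""
    rows = [compress(r) for r in a]
    cols = [compress(c) for c in zip(*b)]
    return [[sparse_dot(r, c) for c in cols] for r in rows]
```

The bitmaps let the inner loop skip every position where either operand is zero, so the number of multiplications tracks the overlap of nonzero elements rather than the full vector length.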

10.

Granted invention patent
Memory for an artificial neural network accelerator (Status: Active)

Publication number: US12086453B2

Publication date: 2024-09-10

Application number: US17103632

Application date: 2020-11-24

Applicant: Arm Limited

Inventor: Mudit Bhargava, Paul Nicholas Whatmough, Supreet Jeloka, Zhi-Gang Liu

IPC: G11C16/04, G06F3/06, G06N3/063, G11C11/54

CPC classification number: G06F3/0655, G06F3/0604, G06F3/0679, G06N3/063, G11C11/54

Abstract: A memory for an artificial neural network (ANN) accelerator is provided. The memory includes a first bank, a second bank and a bank selector. Each bank includes at least two word lines and a plurality of write word selectors. Each word line stores a plurality of words, and each word has a plurality of bytes. Each write word selector has an input port and a plurality of output ports, is coupled to a corresponding word in each word line, and is configured to select a byte of the corresponding word of a selected word line based on a byte select signal. The bank selector is coupled to the write word selectors of the first bank and the second bank, and configured to select a combination of write word selectors from at least one of the first bank and the second bank based on a bank select signal.
