-
Publication No.: US20240046065A1
Publication date: 2024-02-08
Application No.: US17817142
Filing date: 2022-08-03
Applicant: Arm Limited
Inventor: Hokchhay Tann, Ramon Matas Navarro, Igor Fedorov, Chuteng Zhou, Paul Nicholas Whatmough, Matthew Mattina
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to determine options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be identified based, at least in part, on a combination of a definition of available computing resources and one or more predefined performance constraints.
-
Publication No.: US11886987B2
Publication date: 2024-01-30
Application No.: US16451205
Filing date: 2019-06-25
Applicant: Arm Limited
Inventor: Shidhartha Das, Matthew Mattina, Glen Arnold Rosendale, Fernando Garcia Redondo
Abstract: A multiply-accumulate method and architecture are disclosed. The architecture includes a plurality of networks of non-volatile memory elements arranged in tiled columns. Logic digitally modulates the equivalent conductance of individual networks among the plurality of networks to map the equivalent conductance of each individual network to a single weight within the neural network. A first partial selection of weights within the neural network is mapped into the equivalent conductances of the networks in the columns to enable the computation of multiply-and-accumulate operations by mixed-signal computation. The logic updates the mappings to select a second partial selection of weights to compute additional multiply-and-accumulate operations and repeats the mapping and computation operations until all computations for the neural network are completed.
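The map, compute, and remap loop described in this abstract can be illustrated in software. The sketch below is only a numerical analogy, not the patented circuit: it models a tile as a slice of plain numbers instead of conductances, and the names `tiled_mac` and `tile_rows` are hypothetical.

```python
from typing import List


def tiled_mac(weights: List[List[float]], inputs: List[float],
              tile_rows: int = 2) -> List[float]:
    """Compute outputs[j] = sum_k inputs[k] * weights[k][j] one tile at a time.

    Assumption: the hardware holds only `tile_rows` weight rows at once, so
    each loop iteration stands in for "map a partial selection of weights,
    then run the multiply-and-accumulate operations for that selection".
    """
    outputs = [0.0] * len(weights[0])
    for start in range(0, len(weights), tile_rows):
        # "Map" the next partial selection of weights into the tile.
        tile = weights[start:start + tile_rows]
        # Accumulate this tile's contribution into the running column sums.
        for offset, row in enumerate(tile):
            x = inputs[start + offset]
            for j, w in enumerate(row):
                outputs[j] += x * w
    return outputs


result = tiled_mac([[1, 2], [3, 4], [5, 6]], [1, 1, 2], tile_rows=2)
```

The result does not depend on `tile_rows`; only the number of remapping passes changes.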
-
Publication No.: US11693796B2
Publication date: 2023-07-04
Application No.: US17334960
Filing date: 2021-05-31
Applicant: Arm Limited
Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Supreet Jeloka, Saurabh Pijuskumar Sinha, Matthew Mattina
CPC classification number: G06F13/1668, G06F13/4004, G06F7/5443, G06F15/8046, G06N3/063
Abstract: Various implementations described herein are directed to a device having a multi-layered logic structure with a first logic layer and a second logic layer arranged vertically in a stacked configuration. The device may have a memory array that provides data, and also, the device may have an inter-layer data bus that vertically couples the memory array to the multi-layered logic structure. The inter-layer data bus may provide multiple data paths to the first logic layer and the second logic layer for reuse of the data provided by the memory array.
-
Publication No.: US20230076138A1
Publication date: 2023-03-09
Application No.: US17470470
Filing date: 2021-09-09
Applicant: Arm Limited
Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina
Abstract: A matrix multiplication system and method are provided. The system includes a memory that stores one or more weight tensors, a processor and a matrix multiply accelerator (MMA). The processor converts each weight tensor into an encoded block set that is stored in the memory. Each encoded block set includes a number of encoded blocks, and each encoded block includes a data field and an index field. The MMA converts each encoded block set into a reconstructed weight tensor, and convolves each reconstructed weight tensor and an input data tensor to generate an output data matrix.
-
Publication No.: US20230042271A1
Publication date: 2023-02-09
Application No.: US17394048
Filing date: 2021-08-04
Applicant: Arm Limited
Inventor: Igor Fedorov, Ramon Matas Navarro, Chuteng Zhou, Hokchhay Tann, Paul Nicholas Whatmough, Matthew Mattina
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to select options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of a neural network processing device may be selected based, at least in part, on a combination of function values that are computed based, at least in part, on a tensor expressing sample neural network weights.
-
Publication No.: US20210287078A1
Publication date: 2021-09-16
Application No.: US16818302
Filing date: 2020-03-13
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Matthew Mattina, John Fremont Brown, III
Abstract: The present disclosure advantageously provides an Optical Hardware Accelerator (OHA) for an Artificial Neural Network (ANN) that includes a communication bus interface, a memory, a controller, and an optical computing engine (OCE). The OCE is configured to execute an ANN model with ANN weights. Each ANN weight includes a quantized phase shift value θi and a phase shift value ϕi. The OCE includes a digital-to-optical (D/O) converter configured to generate input optical signals based on the input data, an optical neural network (ONN) configured to generate output optical signals based on the input optical signals, and an optical-to-digital (O/D) converter configured to generate the output data based on the output optical signals. The ONN includes a plurality of optical units (OUs), and each OU includes an optical multiply and accumulate (OMAC) module.
-
Publication No.: US11783163B2
Publication date: 2023-10-10
Application No.: US16901542
Filing date: 2020-06-15
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina
CPC classification number: G06N3/04, G06F9/30105, G06F17/16, G06N3/08
Abstract: The present disclosure advantageously provides a matrix expansion unit that includes an input data selector, a first register set, a second register set, and an output data selector. The input data selector is configured to receive first matrix data in a columnwise format. The first register set is coupled to the input data selector, and includes a plurality of data selectors and a plurality of registers arranged in a first shift loop. The second register set is coupled to the data selector, and includes a plurality of data selectors and a plurality of registers arranged in a second shift loop. The output data selector is coupled to the first register set and the second register set, and is configured to output second matrix data in a rowwise format.
-
Publication No.: US20230026113A1
Publication date: 2023-01-26
Application No.: US17382108
Filing date: 2021-07-21
Applicant: Arm Limited
Inventor: Paul Nicholas Whatmough, Zhi-Gang Liu, Matthew Mattina
Abstract: Example methods, devices and/or circuits to be implemented in a processing device to perform neural network-based computing operations. According to an embodiment, an accumulation of weighted activation input values may be computed on accumulation cycles at least in part by multiplying and/or scaling accumulated activation input values by an associated neural network weight.
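One possible reading of "multiplying and/or scaling accumulated activation input values by an associated neural network weight" is that activations sharing a weight are summed first and multiplied once. The sketch below illustrates that reading only. It assumes weights take a small set of distinct (for example, quantized) values, and the name `dot_accumulate_then_scale` is hypothetical, not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, Sequence


def dot_accumulate_then_scale(weights: Sequence[float],
                              activations: Sequence[float]) -> float:
    """Dot product computed as sum over distinct w of w * (sum of its activations)."""
    sums: Dict[float, float] = defaultdict(float)
    # Accumulation phase: additions only, one running sum per distinct weight.
    for w, a in zip(weights, activations):
        sums[w] += a
    # Scaling phase: one multiplication per distinct weight value.
    return sum(w * s for w, s in sums.items())


value = dot_accumulate_then_scale([2, -1, 2, -1, 3], [1.0, 2.0, 3.0, 4.0, 5.0])
```

With five inputs but three distinct weights, this example performs three multiplications instead of five; the saving grows as more inputs share each weight value.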
-
Publication No.: US20220035890A1
Publication date: 2022-02-03
Application No.: US17103676
Filing date: 2020-11-24
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Paul Nicholas Whatmough, Matthew Mattina
Abstract: A system and method for multiplying matrices are provided. The system includes a processor coupled to a memory and a matrix multiply accelerator (MMA) coupled to the processor. The MMA is configured to multiply, based on a bitmap, a compressed first matrix and a second matrix to generate an output matrix including, for each element i,j of the output matrix, calculate a dot product of an ith row of the compressed first matrix and a jth column of the second matrix based on the bitmap. Or, the MMA is configured to multiply, based on the bitmap, the second matrix and the compressed first matrix and to generate the output matrix including, for each element i,j of the output matrix, calculate a dot product of an ith row of the second matrix and a jth column of the compressed first matrix based on the bitmap.
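The bitmap-guided dot product in this abstract can be sketched in a few lines. This is a minimal illustration that assumes the compressed first matrix stores only its nonzero elements row by row and that the bitmap marks their original positions. The abstract does not specify the storage layout, and the names `compress` and `bitmap_matmul` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def compress(dense: Matrix) -> Tuple[Matrix, List[List[int]]]:
    """Pack each row's nonzero values and record their positions in a 0/1 bitmap."""
    bitmap = [[1 if x != 0 else 0 for x in row] for row in dense]
    values = [[x for x in row if x != 0] for row in dense]
    return values, bitmap


def bitmap_matmul(values: Matrix, bitmap: List[List[int]], b: Matrix) -> Matrix:
    """Multiply the compressed first matrix by dense matrix b using the bitmap."""
    rows, inner, cols = len(bitmap), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            packed = 0  # index into the packed nonzeros of row i
            for k in range(inner):
                if bitmap[i][k]:
                    # Only positions flagged in the bitmap contribute a product.
                    acc += values[i][packed] * b[k][j]
                    packed += 1
            out[i][j] = acc
    return out


a = [[1, 0, 2], [0, 0, 3]]
b = [[1, 2], [3, 4], [5, 6]]
product = bitmap_matmul(*compress(a), b)
```

The second case in the abstract, where the compressed matrix is the right-hand operand, would walk the bitmap by column instead of by row.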
-
Publication No.: US20210374509A1
Publication date: 2021-12-02
Application No.: US16889031
Filing date: 2020-06-01
Applicant: Arm Limited
Inventor: Zhi-Gang Liu, Matthew Mattina
Abstract: The present disclosure advantageously provides a modulo operation unit that includes a first input configured to receive operand data, a second input configured to receive modulus data, an initial modulo stage, a sequence of intermediate modulo stages, and a final modulo stage.
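The abstract names the stages but not what each one computes. As an illustration only, the sketch below uses a generic shift-and-conditional-subtract reduction in which every loop iteration plays the role of one stage. It assumes unsigned operands of a fixed bit width, and the names `staged_mod` and `width` are hypothetical, not drawn from the patent.

```python
def staged_mod(operand: int, modulus: int, width: int = 16) -> int:
    """Compute operand % modulus with `width` conditional-subtract stages.

    Assumption: 0 <= operand < 2**width and modulus >= 1. The stage with
    shift width-1 acts as the initial stage, the stage with shift 0 as the
    final stage, and the ones in between as intermediate stages.
    """
    if modulus <= 0 or not 0 <= operand < (1 << width):
        raise ValueError("operand or modulus out of range")
    remainder = operand
    for shift in range(width - 1, -1, -1):
        candidate = modulus << shift
        # Each stage removes modulus * 2**shift if it still fits.
        if remainder >= candidate:
            remainder -= candidate
    return remainder


r = staged_mod(1000, 7)
```

Before the stage with shift `s`, the remainder is below `modulus << (s + 1)`, so after the final stage it is below `modulus`, as required.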