INSTRUCTION SET FOR HYBRID CPU AND ANALOG IN-MEMORY ARTIFICIAL INTELLIGENCE PROCESSOR

    公开(公告)号:US20200242459A1

    公开(公告)日:2020-07-30

    申请号:US16262583

    申请日:2019-01-30

    Abstract: Techniques are provided for implementing a hybrid processing architecture comprising a general-purpose processor (CPU) and a neural processing unit (NPU), coupled to an analog in-memory artificial intelligence (AI) processor. According to an embodiment, the hybrid processor implements an AI instruction set including instructions to perform analog in-memory computations. The AI processor comprises one or more layers, the NN layers including memory circuitry and analog processing circuitry. The memory circuitry is configured to store the weighting factors and the input data. The analog processing circuitry is configured to perform analog calculations on the stored weighting factors and the stored input data in accordance with the execution, by the NPU, of instruction from the AI instruction set. The AI instruction set includes instructions to perform dot products, multiplication, differencing, normalization, pooling, thresholding, transposition, and backpropagation training. The NN layers are configured as convolutional NN layers and/or fully connected NN layers.

    Programmable interface to in-memory cache processor

    公开(公告)号:US10705967B2

    公开(公告)日:2020-07-07

    申请号:US16160270

    申请日:2018-10-15

    Abstract: The present disclosure is directed to systems and methods of implementing a neural network using in-memory mathematical operations performed by pipelined SRAM architecture (PISA) circuitry disposed in on-chip processor memory circuitry. A high-level compiler may be provided to compile data representative of a multi-layer neural network model and one or more neural network data inputs from a first high-level programming language to an intermediate domain-specific language (DSL). A low-level compiler may be provided to compile the representative data from the intermediate DSL to multiple instruction sets in accordance with an instruction set architecture (ISA), such that each of the multiple instruction sets corresponds to a single respective layer of the multi-layer neural network model. Each of the multiple instruction sets may be assigned to a respective SRAM array of the PISA circuitry for in-memory execution. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of in-memory vector/tensor calculations in furtherance of neural network processing without burdening the processor circuitry.

    Weight prefetch for in-memory neural network execution

    公开(公告)号:US11347994B2

    公开(公告)日:2022-05-31

    申请号:US16160466

    申请日:2018-10-15

    Abstract: The present disclosure is directed to systems and methods of bit-serial, in-memory, execution of at least an nth layer of a multi-layer neural network in a first on-chip processor memory circuitry portion contemporaneous with prefetching and storing layer weights associated with the (n+1)st layer of the multi-layer neural network in a second on-chip processor memory circuitry portion. The storage of layer weights in on-chip processor memory circuitry beneficially decreases the time required to transfer the layer weights upon execution of the (n+1)st layer of the multi-layer neural network by the first on-chip processor memory circuitry portion. In addition, the on-chip processor memory circuitry may include a third on-chip processor memory circuitry portion used to store intermediate and/or final input/output values associated with one or more layers included in the multi-layer neural network.

    Efficient analog in-memory matrix multiplication processor

    公开(公告)号:US11294985B2

    公开(公告)日:2022-04-05

    申请号:US16175229

    申请日:2018-10-30

    Abstract: Techniques are provided for efficient matrix multiplication using in-memory analog parallel processing, with applications for neural networks and artificial intelligence processors. A methodology implementing the techniques according to an embodiment includes storing two matrices in-memory. The first matrix is stored in transposed form such that the transposed first matrix has the same number of rows as the second matrix. The method further includes reading columns of the matrices from the memory in parallel, using disclosed bit line functional read operations and cross bit line functional read operations, which are employed to generate analog dot products between the columns. Each of the dot products corresponds to an element of the matrix multiplication product of the two matrices. In some embodiments, one of the matrices may be used to store neural network weighting factors, and the other matrix may be used to store input data to be processed by the neural network.

Patent Agency Ranking