Neural network compute tile
    Invention Grant

    Publication No.: US11816480B2

    Publication Date: 2023-11-14

    Application No.: US17892807

    Filing Date: 2022-08-22

    Applicant: Google LLC

    Abstract: A computing unit is disclosed, comprising a first memory bank for storing input activations and a second memory bank for storing parameters used in performing computations. The computing unit includes at least one cell comprising at least one multiply accumulate (“MAC”) operator that receives parameters from the second memory bank and performs computations. The computing unit further includes a first traversal unit that provides a control signal to the first memory bank to cause an input activation to be provided to a data bus accessible by the MAC operator. The computing unit performs one or more computations associated with at least one element of a data array, the one or more computations being performed by the MAC operator and comprising, in part, a multiply operation of the input activation received from the data bus and a parameter received from the second memory bank.
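
    As a rough illustration of the data flow this abstract describes, the minimal Python sketch below models a traversal unit stepping through the first memory bank, placing each activation on a data bus, and a MAC operator multiplying it by a parameter from the second memory bank. All names are illustrative; this is a behavioral sketch, not the patented circuit.

    # Behavioral sketch: a traversal unit drives activations onto a bus,
    # and a MAC cell multiplies each one by a parameter and accumulates.
    def mac_tile(activation_bank, parameter_bank):
        accumulator = 0
        for addr in range(len(activation_bank)):   # traversal unit: address sequence
            data_bus = activation_bank[addr]       # activation placed on the data bus
            param = parameter_bank[addr]           # parameter from the second bank
            accumulator += data_bus * param        # multiply-accumulate
        return accumulator

    print(mac_tile([1.0, 2.0, 3.0], [0.5, -1.0, 2.0]))   # 0.5 - 2.0 + 6.0 = 4.5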

    Low-power adder circuit
    Invention Grant

    Publication No.: US11586701B2

    Publication Date: 2023-02-21

    Application No.: US17063813

    Filing Date: 2020-10-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a circuit configured to add multiple inputs. The circuit includes a first adder section that receives a first input and a second input and adds the inputs to generate a first sum. The circuit also includes a second adder section that receives the first and second inputs and adds the inputs to generate a second sum. An input processor of the circuit receives the first and second inputs, determines whether a relationship between the first and second inputs satisfies a set of conditions, and selects a high-power mode of the adder circuit or a low-power mode of the adder circuit using the determined relationship between the first and second inputs. The high-power mode is selected and the first and second inputs are routed to the second adder section when the relationship satisfies the set of conditions.
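
    The abstract does not spell out the set of conditions, so the Python sketch below assumes a simple width-based rule purely for illustration: operands that fit in a narrow field go to the low-power (first) adder section, and wider operands are routed to the high-power (second) adder section.

    NARROW_BITS = 8

    def narrow_adder(a, b):
        # First adder section: narrow, low-power datapath.
        return (a + b) & ((1 << (NARROW_BITS + 1)) - 1)

    def wide_adder(a, b):
        # Second adder section: full-width, high-power datapath.
        return a + b

    def needs_high_power(a, b):
        # Hypothetical condition; the patent's actual conditions may differ.
        return a >= (1 << NARROW_BITS) or b >= (1 << NARROW_BITS)

    def add(a, b):
        return wide_adder(a, b) if needs_high_power(a, b) else narrow_adder(a, b)

    print(add(3, 5), add(300, 5))   # low-power path, then high-power path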

    EXPLOITING INPUT DATA SPARSITY IN NEURAL NETWORK COMPUTE UNITS

    Publication No.: US20220083480A1

    Publication Date: 2022-03-17

    Application No.: US17410071

    Filing Date: 2021-08-24

    Applicant: Google LLC

    Abstract: A computer-implemented method includes receiving, by a computing device, input activations and determining, by a controller of the computing device, whether each of the input activations has either a zero value or a non-zero value. The method further includes storing, in a memory bank of the computing device, at least one of the input activations. Storing the at least one input activation includes generating an index comprising one or more memory address locations that have input activation values that are non-zero values. The method still further includes providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array. The activations are provided, at least in part, from a memory address location associated with the index.
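
    A minimal Python sketch of the indexing step follows, with a plain list standing in for the memory bank; the index records only the addresses whose stored activation is non-zero, and the controller later provides just those values.

    def store_with_index(input_activations):
        memory_bank = list(input_activations)
        # Index: memory address locations holding non-zero values.
        index = [addr for addr, value in enumerate(memory_bank) if value != 0]
        return memory_bank, index

    def provide_activations(memory_bank, index):
        # Controller: place only indexed activations on the data bus.
        return [(addr, memory_bank[addr]) for addr in index]

    bank, idx = store_with_index([0, 3, 0, 0, 7, 1])
    print(idx)                             # [1, 4, 5]
    print(provide_activations(bank, idx))  # [(1, 3), (4, 7), (5, 1)]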

    Hardware double buffering using a special purpose computational unit

    Publication No.: US11099772B2

    Publication Date: 2021-08-24

    Application No.: US16700385

    Filing Date: 2019-12-02

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus for transferring data using multiple buffers, including multiple memories and one or more processing units configured to determine buffer memory addresses for a sequence of data elements being transferred from a first data storage location to a second data storage location. For each group of one or more data elements in the sequence, a value of a buffer assignment element that can be switched between multiple values, each corresponding to a different one of the memories, is identified. A buffer memory address for the group is determined based on the value of the buffer assignment element, and the value of the buffer assignment element is switched before the buffer memory address for the next group in the sequence is determined.
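
    The toggling behavior can be sketched in a few lines of Python; the two-buffer case, the group size, and the address layout are illustrative assumptions, not details from the patent.

    def assign_buffers(num_elements, group_size=4):
        buffer_assignment = 0        # switches between 0 and 1 (two memories)
        addresses = []
        for start in range(0, num_elements, group_size):
            group_end = min(start + group_size, num_elements)
            for offset in range(group_end - start):
                # Buffer memory address derived from the assignment element.
                addresses.append((buffer_assignment, offset))
            buffer_assignment ^= 1   # switch before the next group
        return addresses

    print(assign_buffers(10))   # groups alternate between memory 0 and memory 1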

    VIRTUALIZING EXTERNAL MEMORY AS LOCAL TO A MACHINE LEARNING ACCELERATOR

    Publication No.: US20200342350A1

    Publication Date: 2020-10-29

    Application No.: US16397481

    Filing Date: 2019-04-29

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for virtualizing external memory as local to a machine learning accelerator. One ambient computing system comprises: an ambient machine learning engine; a low-power CPU; and an SRAM that is shared among at least the ambient machine learning engine and the low-power CPU; wherein the ambient machine learning engine comprises virtual address logic to translate from virtual addresses generated by the ambient machine learning engine to physical addresses within the SRAM.
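
    A small Python sketch of the virtual address logic follows; the page-table layout is an illustrative assumption rather than the patented design.

    PAGE_SIZE = 4096

    def translate(virtual_addr, page_table):
        # Map a virtual address from the ML engine to a physical SRAM address.
        page, offset = divmod(virtual_addr, PAGE_SIZE)
        return page_table[page] * PAGE_SIZE + offset

    page_table = {0: 7, 1: 2}                  # virtual page -> physical SRAM page
    print(hex(translate(0x1234, page_table)))  # 0x2234: virtual page 1 maps to page 2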

    MATRIX PROCESSING APPARATUS
    Invention Application

    Publication No.: US20200012705A1

    Publication Date: 2020-01-09

    Application No.: US16571749

    Filing Date: 2019-09-16

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements into a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix, fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix, fetched by a second group of sparse element access units; and transform the sparse elements from both matrices to generate the output dense matrix that includes them.
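
    The transformation can be sketched as a scatter of fetched sparse elements into one dense output; the (row, column, value) triple format below is an illustrative assumption.

    def to_dense(sparse_a, sparse_b, rows, cols):
        # Scatter elements fetched by both groups of access units into
        # a single output dense matrix.
        output = [[0] * cols for _ in range(rows)]
        for row, col, value in sparse_a + sparse_b:
            output[row][col] = value
        return output

    sparse_a = [(0, 0, 5), (1, 2, 3)]   # fetched by the first access-unit group
    sparse_b = [(0, 1, 8)]              # fetched by the second access-unit group
    print(to_dense(sparse_a, sparse_b, rows=2, cols=3))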

    Exploiting input data sparsity in neural network compute units

    Publication No.: US10360163B2

    Publication Date: 2019-07-23

    Application No.: US15336066

    Filing Date: 2016-10-27

    Applicant: Google LLC

    Abstract: A computer-implemented method includes receiving, by a computing device, input activations and determining, by a controller of the computing device, whether each of the input activations has either a zero value or a non-zero value. The method further includes storing, in a memory bank of the computing device, at least one of the input activations. Storing the at least one input activation includes generating an index comprising one or more memory address locations that have input activation values that are non-zero values. The method still further includes providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array. The activations are provided, at least in part, from a memory address location associated with the index.
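
    This abstract matches that of the related application US20220083480A1 above. Complementing the indexing sketch there, the Python sketch below shows the compute-side effect: only indexed (non-zero) activations are multiplied, so zero-valued work is skipped entirely. Names are illustrative.

    def sparse_mac(memory_bank, index, weights):
        # Only addresses in the index are visited; zero activations are skipped.
        return sum(memory_bank[addr] * weights[addr] for addr in index)

    bank = [0, 3, 0, 0, 7, 1]
    idx = [1, 4, 5]                        # non-zero address index from storage
    weights = [2, 2, 2, 2, 2, 2]
    print(sparse_mac(bank, idx, weights))  # (3 + 7 + 1) * 2 = 22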

    APPARATUS AND MECHANISM FOR PROCESSING NEURAL NETWORK TASKS USING A SINGLE CHIP PACKAGE WITH MULTIPLE IDENTICAL DIES

    Publication No.: US20190156187A1

    Publication Date: 2019-05-23

    Application No.: US15819753

    Filing Date: 2017-11-21

    Applicant: Google LLC

    Abstract: Apparatus and methods for processing neural network models are provided. The apparatus can comprise a plurality of identical artificial intelligence processing dies, each including at least one inter-die input block and at least one inter-die output block. Each die in the plurality is communicatively coupled to another die by one or more communication paths running from its inter-die output block to the other die's inter-die input block. Each die corresponds to at least one layer of a neural network.
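
    The die-to-die topology can be sketched with a small Python class; the plain callable standing in for a layer and the chain wiring are illustrative assumptions.

    class Die:
        # One identical AI processing die with inter-die I/O blocks.
        def __init__(self, layer_fn):
            self.layer_fn = layer_fn   # the network layer this die computes
            self.next_die = None       # die whose input block we drive

        def connect(self, other):
            self.next_die = other      # output block -> next die's input block

        def process(self, data):
            result = self.layer_fn(data)
            return result if self.next_die is None else self.next_die.process(result)

    # Three identical dies, one per layer of a toy three-layer network.
    dies = [Die(lambda x, k=k: [v * k for v in x]) for k in (2, 3, 5)]
    dies[0].connect(dies[1])
    dies[1].connect(dies[2])
    print(dies[0].process([1, 1]))   # [30, 30]: scaled by 2, then 3, then 5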
