Training network to minimize worst-case error

    Publication No.: US12051000B2

    Publication Date: 2024-07-30

    Application No.: US17962789

    Filing Date: 2022-10-10

    IPC Classification: G06N3/08 G06N3/084 G06N7/01

    CPC Classification: G06N3/084 G06N7/01

    Abstract: Some embodiments provide a method for configuring a machine-trained (MT) network that includes multiple configurable weights to train. The method propagates a set of inputs through the MT network to generate a set of output probability distributions. Each input has a corresponding expected output probability distribution. The method calculates a value of a continuously-differentiable loss function that includes a term approximating an extremum function of the difference between the expected output probability distributions and the generated set of output probability distributions. The method trains the weights by back-propagating the calculated value of the continuously-differentiable loss function.
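
    The abstract does not name its approximating term; one standard continuously-differentiable stand-in for the extremum over per-input errors is the log-sum-exp "soft maximum". The PyTorch sketch below is only an illustration of that idea, with cross-entropy as the per-input divergence and the temperature `tau` both being assumptions.

```python
import torch

def worst_case_loss(pred_logits, target_dist, tau=10.0):
    # Per-input divergence between expected and generated distributions
    # (cross-entropy here; the patent's exact measure may differ).
    log_probs = torch.log_softmax(pred_logits, dim=-1)
    per_input_err = -(target_dist * log_probs).sum(dim=-1)   # shape [batch]
    # Smooth approximation of max_i err_i: the log-sum-exp "soft maximum"
    # (1/tau) * log(sum_i exp(tau * err_i)) is continuously differentiable
    # and approaches the true extremum as tau grows.
    return torch.logsumexp(tau * per_input_err, dim=0) / tau

logits = torch.randn(8, 5, requires_grad=True)        # generated outputs
targets = torch.softmax(torch.randn(8, 5), dim=-1)    # expected outputs
loss = worst_case_loss(logits, targets)
loss.backward()   # train the weights by back-propagating the loss value
```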

    Batch normalization for replicated layers of neural network

    Publication No.: US12045725B1

    Publication Date: 2024-07-23

    Application No.: US16923006

    Filing Date: 2020-07-07

    Abstract: Some embodiments provide a method for training a network including layers that each include multiple nodes. The method identifies a set of related layers of the network. Each node in one of the related layers has corresponding nodes in each of the other related layers. Each set of corresponding nodes receives the same set of inputs and applies different sets of weights to the inputs to generate an output. The method identifies an element-wise addition layer including nodes that each add the outputs of a different set of corresponding nodes from the related layers to generate a sum. The method uses a set of outputs generated by the nodes of each related layer to determine batch normalization parameters specific to each layer of the set of related layers. The method uses data generated by the element-wise addition layer to determine batch normalization parameters for the set of related layers.
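
    A minimal sketch of the statistics flow described above, using plain tensors; the replica count, layer sizes, and use of straight matrix multiplies are assumptions made for illustration.

```python
import torch

# Same set of inputs feeds every replica; each replica has its own weights.
x = torch.randn(32, 64)                        # batch of 32 inputs
w1, w2 = torch.randn(64, 64), torch.randn(64, 64)
out1, out2 = x @ w1, x @ w2                    # corresponding node outputs
summed = out1 + out2                           # element-wise addition layer

# Batch-norm statistics specific to each related layer...
mean1, var1 = out1.mean(dim=0), out1.var(dim=0, unbiased=False)
mean2, var2 = out2.mean(dim=0), out2.var(dim=0, unbiased=False)
# ...and statistics for the set of related layers, from the addition layer.
mean_sum, var_sum = summed.mean(dim=0), summed.var(dim=0, unbiased=False)
```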

    Storing of intermediate computed values for subsequent use in a machine trained network

    Publication No.: US11948067B1

    Publication Date: 2024-04-02

    Application No.: US17093278

    Filing Date: 2020-11-09

    IPC Classification: G06N3/04 G06N3/049 G06N3/063

    CPC Classification: G06N3/049 G06N3/063

    Abstract: Some embodiments of the invention provide a method for implementing a temporal convolution network (TCN) that includes several layers of machine-trained processing nodes. While processing one set of inputs provided to the TCN at a particular time, some of the processing nodes of the TCN use intermediate values computed by the processing nodes for other sets of inputs that were provided to the TCN at earlier times. To speed up the operation of the TCN and improve its efficiency, the method of some embodiments stores intermediate values computed by the TCN processing nodes for earlier sets of TCN inputs, so that these values can be reused when processing later sets of TCN inputs.
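
    As a hedged illustration of the caching idea, the sketch below models one TCN node as a dilated causal convolution whose past values sit in a ring buffer, so each new time step reuses stored values instead of recomputing them; the class name, kernel size, and dilation are hypothetical.

```python
import numpy as np
from collections import deque

class StreamingTCNLayer:
    """Sketch of one TCN layer that stores values computed for earlier
    inputs and reuses them when processing later inputs."""
    def __init__(self, kernel_weights, dilation=1):
        self.w = kernel_weights                    # shape: [kernel_size]
        self.dilation = dilation
        # Ring buffer of past values, sized to cover the receptive field.
        size = (len(kernel_weights) - 1) * dilation + 1
        self.cache = deque([0.0] * size, maxlen=size)

    def step(self, x_t):
        self.cache.append(x_t)                     # store for subsequent use
        hist = list(self.cache)
        # Dilated causal convolution over cached values: no recomputation
        # of earlier time steps is needed.
        return sum(w * hist[-1 - k * self.dilation]
                   for k, w in enumerate(self.w))

layer = StreamingTCNLayer(np.array([0.5, 0.3, 0.2]), dilation=2)
outputs = [layer.step(x) for x in [1.0, 2.0, 3.0, 4.0]]
```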

Storage of input values across multiple cores of neural network inference circuit

    Publication No.: US20240062054A1

    Publication Date: 2024-02-22

    Application No.: US18384576

    Filing Date: 2023-10-27

    IPC Classification: G06N3/063 G06F9/30 G06F17/16

    Abstract: Some embodiments provide a method for a neural network inference circuit that executes a neural network. The method loads a first set of inputs into an input buffer and computes a first dot product between the first set of inputs and a set of weights. The method shifts the first set of inputs in the buffer while loading a second set of inputs into the buffer, such that a first subset of the first set of inputs is removed from the buffer, a second subset of the first set of inputs is moved to new locations in the buffer, and the second set of inputs is loaded into locations in the buffer vacated by the shifting. The method computes a second dot product between (i) the second set of inputs together with the second subset of the first set of inputs and (ii) the set of weights.
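
    A small software model of the described buffer behavior; the buffer width and shift amount are hypothetical values chosen for illustration.

```python
import numpy as np

BUF = 8                                    # hypothetical buffer width
SHIFT = 3                                  # inputs consumed per step
weights = np.random.randn(BUF)

buf = np.arange(BUF, dtype=float)          # first set of inputs
dot1 = buf @ weights                       # first dot product

# Shift: the first SHIFT inputs are removed, the remaining subset moves to
# new locations, and a second set of inputs fills the vacated slots.
new_inputs = np.arange(BUF, BUF + SHIFT, dtype=float)
buf = np.concatenate([buf[SHIFT:], new_inputs])

dot2 = buf @ weights                       # second dot product
```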

    Preventing overfitting of hyperparameters during training of network

    Publication No.: US11610154B1

    Publication Date: 2023-03-21

    Application No.: US16780841

    Filing Date: 2020-02-03

    IPC Classification: G06N20/00 G06N5/02

    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method uses a first set of inputs to train parameters of the MT network according to a set of hyperparameters that define aspects of the training. The method uses a second set of inputs to validate the MT network as trained by the first set of inputs. Based on the validation, the method modifies the hyperparameters for subsequent training of the MT network, wherein the hyperparameter modification is constrained to prevent overfitting of the modified hyperparameters to the second set of inputs.
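
    The abstract leaves the constraint unspecified; one hedged reading is to bound how far validation feedback may move a hyperparameter in each round. The sketch below illustrates that reading only, with the clipping bound, learning rate, and validation signal all hypothetical.

```python
import numpy as np

def modify_hyperparam(h, val_loss_grad, lr=0.1, max_step=0.05):
    # Adjust the hyperparameter based on validation results, but clip the
    # update so it cannot chase (overfit) the second set of inputs.
    step = np.clip(-lr * val_loss_grad, -max_step, max_step)
    return h + step

weight_decay = 1e-4
for round_ in range(3):
    # ... train on the first set of inputs with current hyperparameters ...
    val_loss_grad = np.random.randn()   # placeholder validation signal
    weight_decay = modify_hyperparam(weight_decay, val_loss_grad)
```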

    Replication of neural network layers

    Publication No.: US11604973B1

    Publication Date: 2023-03-14

    Application No.: US16698942

    Filing Date: 2019-11-27

    Abstract: Some embodiments provide a method for training parameters of a machine-trained (MT) network. The method receives an MT network with multiple layers of nodes, each of which computes an output value based on a set of input values and a set of trained weight values. Each layer has a set of allowed weight values. For a first layer with a first set of allowed weight values, the method defines a second layer with nodes corresponding to each of the nodes of the first layer, each second-layer node receiving the same input values as the corresponding first-layer node. The second layer has a second, different set of allowed weight values, with the output values of the nodes of the first layer added to the output values of the corresponding nodes of the second layer to compute output values that are passed to a subsequent layer. The method trains the weight values.
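
    A hedged sketch of the replication step: the replica layer receives the same inputs but draws its weights from a different allowed set, and the two outputs are summed before the next layer. The value sets and nearest-value projection rule are assumptions, not the patent's training procedure.

```python
import torch

def project(w, allowed):
    # Project each weight onto the nearest value in the layer's allowed set.
    allowed = torch.tensor(allowed)
    idx = (w.unsqueeze(-1) - allowed).abs().argmin(dim=-1)
    return allowed[idx]

x = torch.randn(16, 32)
w_full = torch.randn(32, 32)                 # unconstrained target weights
w1 = project(w_full, [-1.0, 0.0, 1.0])       # first layer's allowed set
w2 = project(w_full - w1, [-0.5, 0.0, 0.5])  # replica's different set
out = x @ w1 + x @ w2      # summed outputs passed to the subsequent layer
```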

    Training sparse networks with discrete weight values

    Publication No.: US11537870B1

    Publication Date: 2022-12-27

    Application No.: US15921622

    Filing Date: 2018-03-14

    IPC Classification: G06N3/08 H04L1/24 G06N7/00

    Abstract: Some embodiments provide a method for training a machine-trained (MT) network. The method propagates multiple inputs through the MT network to generate an output for each of the inputs. Each of the inputs is associated with an expected output; the MT network uses multiple network parameters to process the inputs, and each network parameter of a set of the network parameters is defined during training as a probability distribution across a discrete set of possible values for the network parameter. The method calculates a value of a loss function for the MT network that includes (i) a first term that measures network error based on the expected outputs compared to the generated outputs and (ii) a second term that penalizes divergence of the probability distribution for each network parameter in the set of network parameters from a predefined probability distribution for the network parameter.
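
    A minimal PyTorch sketch of the two-term loss, assuming softmax-parameterized distributions over a ternary value set, KL divergence as the divergence measure, and a zero-favoring predefined distribution to encourage sparsity; all of these specifics are illustrative choices.

```python
import torch
import torch.nn.functional as F

# Each network parameter is a distribution over a discrete value set.
values = torch.tensor([-1.0, 0.0, 1.0])
logits = torch.zeros(64, 3, requires_grad=True)        # 64 weights
prior = torch.tensor([0.1, 0.8, 0.1])   # predefined, zero-favoring (sparse)

x, target = torch.randn(8, 64), torch.randn(8)
probs = F.softmax(logits, dim=-1)
w = probs @ values                       # expected value of each weight
error = F.mse_loss(x @ w, target)        # first term: network error
kl = (probs * (probs / prior).log()).sum(-1).mean()   # second term: KL
loss = error + 0.01 * kl
loss.backward()
```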

    Machine-trained network for misalignment-insensitive depth perception

    Publication No.: US11373325B1

    Publication Date: 2022-06-28

    Application No.: US16591591

    Filing Date: 2019-10-02

    Abstract: Some embodiments of the invention provide a novel method for training a multi-layer node network to reliably determine depth based on a plurality of input sources (e.g., cameras, microphones, etc.) that may be arranged with deviations from an ideal alignment or placement. Some embodiments train the multi-layer network using a set of inputs generated with random misalignments incorporated into the training set. In some embodiments, the training set includes (i) a synthetically generated training set based on a three-dimensional ground-truth model as it would be sensed by a sensor array from different positions and with different deviations from ideal alignment and placement, and/or (ii) a training set generated by a set of actual sensor arrays augmented with an additional sensor (e.g., an additional camera or a time-of-flight measurement device such as lidar) to collect ground-truth data.
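
    A hedged sketch of the misalignment augmentation: each synthetic training view would be rendered from a sensor pose perturbed away from the ideal. The noise magnitudes and the Rodrigues-formula rotation are assumptions for illustration; rendering itself is omitted.

```python
import numpy as np

def perturb_pose(R, t, rot_sigma_deg=0.5, trans_sigma=0.005):
    """Apply a small random deviation from the ideal sensor alignment:
    a rotation about a random axis plus a translation offset."""
    axis = np.random.randn(3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(np.random.normal(0.0, rot_sigma_deg))
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    # Rodrigues formula: rotation matrix from axis-angle.
    dR = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return dR @ R, t + np.random.normal(0.0, trans_sigma, 3)

# Each training example uses a different random deviation from the ideal
# pose, so the trained network tolerates misaligned sensor arrays.
R_ideal, t_ideal = np.eye(3), np.zeros(3)
R_train, t_train = perturb_pose(R_ideal, t_ideal)
```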

    Computation of neural network node

    Publication No.: US11341397B1

    Publication Date: 2022-05-24

    Application No.: US16212618

    Filing Date: 2018-12-06

    Abstract: Some embodiments provide a method for a neural network inference circuit (NNIC) that implements a neural network including multiple computation nodes at multiple layers. Each computation node includes a dot product of input values and weight values and a set of post-processing operations. The method retrieves a set of weight values and a set of input values for a computation node from a set of memories of the NNIC. The method computes a dot product of the retrieved sets of weight values and input values. The method performs the post-processing operations for the computation node on the result of the dot product computation to compute an output value for the computation node. The method stores the output value in the set of memories. No intermediate results of the dot product or the set of post-processing operations are stored in any RAM of the NNIC during the computation.
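
    A hedged software analogy of the fused pipeline: only the final node output touches memory, while the dot product and post-processing stay in local accumulators. The affine parameters and ReLU are assumed post-processing operations, not details from the patent.

```python
import numpy as np

def compute_node(weights, inputs, bias, scale):
    """One computation node as a fused pipeline: the dot product result
    and post-processing intermediates live only in local variables
    (registers, in hardware terms); nothing intermediate is written back."""
    acc = np.dot(weights, inputs)       # dot product
    acc = scale * acc + bias            # post-processing: affine transform
    return max(acc, 0.0)                # post-processing: ReLU activation

memory = {"w": np.random.randn(16), "x": np.random.randn(16)}
out = compute_node(memory["w"], memory["x"], bias=0.1, scale=0.5)
memory["out"] = out                     # only the output value is stored
```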

    Neural network inference circuit

    Publication No.: US11205115B1

    Publication Date: 2021-12-21

    Application No.: US16212622

    Filing Date: 2018-12-06

    Abstract: Some embodiments provide a neural network inference circuit (NNIC) for implementing a neural network that includes multiple computation nodes at multiple layers. Each of a set of the computation nodes includes a dot product of input values and weight values. The NNIC includes multiple dot product core circuits for computing multiple partial dot products and a set of channel circuits connecting the core circuits. The set of channel circuits includes (i) a dot product bus for aggregating the partial dot products to compute dot products for computation nodes of the neural network, (ii) one or more post-processing circuits for performing additional computation operations on the dot products to compute outputs for the computation nodes, and (iii) an output bus for providing the computed outputs of the computation nodes to the core circuits for use as inputs for subsequent computation nodes.
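
    A hedged dataflow model of the described circuit, with four cores, an aggregating bus, a single post-processing stage, and an output bus feeding the result back as inputs; all sizes and operations here are assumptions chosen to show the structure, not the actual hardware.

```python
import numpy as np

# Dot product core circuits: each holds a slice of the weights and inputs
# and computes a partial dot product.
cores = [(np.random.randn(8), np.random.randn(8)) for _ in range(4)]
partials = [np.dot(w, x) for w, x in cores]

total = sum(partials)                  # dot product bus: aggregate partials
output = max(0.5 * total + 0.1, 0.0)   # post-processing circuit: affine+ReLU

# Output bus: return the computed output to the cores, where it becomes an
# input value for subsequent computation nodes.
next_inputs = [output for _ in cores]
```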