Abstract:
A concept for training a neural network model. The concept comprises receiving training data and test data, each comprising a set of annotated images. A neural network model is trained using the training data with an initial regularization parameter. The loss functions of the neural network on both the training data and the test data are used to modify the regularization parameter, and the neural network model is retrained using the modified regularization parameter. This process is repeated iteratively until both loss functions converge. A system, a method, and a computer program product embodying this concept are disclosed.
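For illustration, the following sketch mirrors the loop described in this abstract, substituting ridge regression for the neural network so the example stays self-contained. The multiplicative update rule for the regularization parameter and the convergence threshold are assumptions for illustration, not details from the abstract.

```python
# Sketch of the iterative regularization-tuning loop, with ridge regression
# standing in for the neural network. The lambda update rule and the
# convergence test are illustrative assumptions.
import numpy as np

def train(X, y, lam):
    """Closed-form ridge fit: argmin_w ||Xw - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def loss(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(0)
X_tr, X_te = rng.normal(size=(80, 10)), rng.normal(size=(40, 10))
w_true = rng.normal(size=10)
y_tr = X_tr @ w_true + 0.1 * rng.normal(size=80)
y_te = X_te @ w_true + 0.1 * rng.normal(size=40)

lam, prev = 1.0, (np.inf, np.inf)
for step in range(100):
    w = train(X_tr, y_tr, lam)
    cur = (loss(X_tr, y_tr, w), loss(X_te, y_te, w))
    if max(abs(cur[0] - prev[0]), abs(cur[1] - prev[1])) < 1e-6:
        break                          # both losses have converged
    # Hypothetical update: raise lam when the model overfits (test loss
    # well above training loss), lower it otherwise.
    lam *= 1.5 if cur[1] > 1.1 * cur[0] else 0.75
    prev = cur
print(f"converged after {step} steps, lambda = {lam:.4g}")
```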
Abstract:
A computer-based system trains a neural network by solving a double-layer optimization problem. The system includes an input interface to receive an input to the neural network and labels of the input; a processor to solve the double-layer optimization to produce parameters of the neural network; and an output interface to output the parameters of the neural network. The double-layer optimization includes an optimization of a first layer subject to an optimization of a second layer. The optimization of the first layer minimizes a difference between the output of the neural network processing the input and the labels of the input, while the optimization of the second layer minimizes a distance between the non-negative output vector of each layer and the corresponding input vector to that layer. The input vector of a current layer is a linear transformation of the non-negative output vector of the previous layer.
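A minimal sketch of the two-level formulation for a single hidden layer, solved by alternation: the activations are treated as free non-negative variables pulled toward a linear transform of the layer input (the second-layer objective), while the weights are fit to the labels (the first-layer objective). The penalty weight `rho`, the closed-form activation update, and the least-squares weight updates are illustrative assumptions; the abstract does not specify a solution method.

```python
# Sketch of a "lifted" two-level training scheme: alternate between solving
# for the non-negative activations Z and for the weights W1, W2.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))              # inputs
Y = rng.normal(size=(64, 3))              # labels
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 3))
Z = np.maximum(X @ W1, 0.0)               # lifted non-negative activations
rho = 1.0                                 # coupling weight (assumed)

for _ in range(50):
    # Second-layer step: minimize ||Z W2 - Y||^2 + rho ||Z - X W1||^2 in Z,
    # then project onto the non-negative orthant.
    A = W2 @ W2.T + rho * np.eye(W2.shape[0])
    Z = np.maximum((Y @ W2.T + rho * X @ W1) @ np.linalg.inv(A), 0.0)
    # First-layer step: least-squares weight updates given the activations.
    W1 = np.linalg.lstsq(X, Z, rcond=None)[0]
    W2 = np.linalg.lstsq(Z, Y, rcond=None)[0]

print("output loss:", float(np.mean((Z @ W2 - Y) ** 2)))
```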
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training and deploying machine-learned communication over radio frequency (RF) channels. One of the methods includes: determining first information; using an encoder machine-learning network to process the first information and generate a first RF signal for transmission through a communication channel; determining a second RF signal that represents the first RF signal having been altered by transmission through the communication channel; using a decoder machine-learning network to process the second RF signal and generate second information as a reconstruction of the first information; calculating a measure of distance between the second information and the first information; and updating at least one of the encoder machine-learning network or the decoder machine-learning network based on the measure of distance between the second information and the first information.
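The training loop in this abstract maps naturally onto a channel autoencoder. Below is a minimal PyTorch sketch assuming an additive-white-Gaussian-noise channel as the stand-in for "altered by transmission through the communication channel"; the layer sizes, noise level, and mean-squared-error distance measure are illustrative choices.

```python
# Sketch of the encoder/channel/decoder loop: encode, corrupt, decode,
# measure reconstruction distance, and update both networks.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
decoder = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 4))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for step in range(1000):
    first_info = torch.randn(32, 4)             # first information
    tx = encoder(first_info)                    # first RF signal
    rx = tx + 0.1 * torch.randn_like(tx)        # channel model: AWGN (assumed)
    second_info = decoder(rx)                   # reconstruction
    loss = torch.mean((second_info - first_info) ** 2)  # distance measure
    opt.zero_grad()
    loss.backward()
    opt.step()                                  # updates encoder and decoder
```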
Abstract:
A computer-implemented information processing method for the inference phase of a convolutional neural network, the method including steps of: generating a list of non-zero elements from a learned sparse kernel to be used for a convolution layer of the convolutional neural network; when performing convolution on an input feature map, loading only the elements of the input feature map which correspond to the non-zero elements of the generated list; and performing the convolution arithmetic operations using the loaded elements of the input feature map and the non-zero elements of the list, thereby reducing the number of operations necessary to generate an output feature map of the convolution layer.
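A minimal sketch of the inference-time saving: the list of non-zero kernel elements is built once, and the convolution loop then iterates over that list rather than over the full kernel. The example kernel and the "valid" convolution layout are assumptions for illustration.

```python
# Sketch of sparse-kernel convolution: only input elements addressed by the
# kernel's non-zero entries are loaded and multiplied.
import numpy as np

kernel = np.array([[0.0, 1.0,  0.0],
                   [0.0, 0.0, -1.0],
                   [0.5, 0.0,  0.0]])         # learned sparse kernel (example)
nonzero = [(i, j, kernel[i, j])
           for i in range(kernel.shape[0])
           for j in range(kernel.shape[1])
           if kernel[i, j] != 0.0]            # list of non-zero elements

x = np.random.default_rng(2).normal(size=(8, 8))   # input feature map
H = x.shape[0] - kernel.shape[0] + 1
W = x.shape[1] - kernel.shape[1] + 1
out = np.zeros((H, W))                        # output feature map
for di, dj, w in nonzero:                     # 3 taps instead of 9
    out += w * x[di:di + H, dj:dj + W]        # load only matching elements
```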
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a larger neural network from a smaller neural network. In one aspect, a method includes obtaining data specifying an original neural network having a plurality of original neural network units; generating a larger neural network from the original neural network, wherein the larger neural network has a larger neural network structure including the plurality of original neural network units and a plurality of additional neural network units not in the original neural network structure; initializing values of the parameters of the original neural network units and the additional neural network units so that the larger neural network generates the same outputs from the same inputs as the original neural network; and training the larger neural network to determine trained values of the parameters of the original neural network units and the additional neural network units from the initialized values.
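The key step is the function-preserving initialization. The sketch below widens the hidden layer of a two-layer network by replicating one unit and splitting its outgoing weights, in the style of Net2Net widening; the choice of which unit to replicate and the equal split are illustrative assumptions. The assertion checks that the larger network reproduces the original outputs exactly before training.

```python
# Sketch of function-preserving widening: duplicate a hidden unit and halve
# its outgoing weights so the larger network computes the same function.
import numpy as np

rng = np.random.default_rng(3)
W1, W2 = rng.normal(size=(5, 4)), rng.normal(size=(4, 2))

def forward(x, W1, W2):
    return np.maximum(x @ W1, 0.0) @ W2

src = 1                                       # unit to replicate (assumed)
W1_big = np.hstack([W1, W1[:, [src]]])        # new unit copies unit `src`
W2_big = np.vstack([W2, W2[[src]] / 2.0])     # new unit gets half the weight
W2_big[src] /= 2.0                            # original unit keeps the other half

x = rng.normal(size=(10, 5))
assert np.allclose(forward(x, W1, W2), forward(x, W1_big, W2_big))
```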
Abstract:
A method of reducing image resolution in a deep convolutional network (DCN) includes dynamically selecting a reduction factor to be applied to an input image. The reduction factor can be selected at each layer of the DCN. The method also includes adjusting the DCN based on the reduction factor selected for each layer.
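A minimal sketch of per-layer selection, assuming a hypothetical spatial-budget rule (`select_factor`) for choosing the factor and strided slicing as the reduction; the abstract specifies neither.

```python
# Sketch of dynamic, per-layer resolution reduction: each layer picks a
# reduction factor and downsamples its input feature map before computing.
import numpy as np

def select_factor(fmap, budget=32):
    """Hypothetical rule: smallest integer factor fitting a spatial budget."""
    f = 1
    while max(fmap.shape) // f > budget:
        f += 1
    return f

fmap = np.random.default_rng(4).normal(size=(128, 128))
for layer in range(3):
    f = select_factor(fmap)            # dynamically selected per layer
    fmap = fmap[::f, ::f]              # apply the reduction factor
    fmap = np.maximum(fmap, 0.0)       # stand-in for the layer's compute
    print(f"layer {layer}: factor {f}, shape {fmap.shape}")
```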
Abstract:
The adaptation and personalization of a deep neural network (DNN) model for automatic speech recognition (ASR) is provided. An utterance which includes speech features for one or more speakers may be received in ASR tasks such as voice search or short message dictation. A decomposition approach may then be applied to an original matrix in the DNN model. In response to applying the decomposition approach, the original matrix may be converted into multiple new matrices which are smaller than the original matrix. A square matrix may then be added to the new matrices. Speaker-specific parameters may then be stored in the square matrix. The DNN model may then be adapted by updating the square matrix. This process may be applied to all of a number of original matrices in the DNN model. The adapted DNN model may include fewer parameters than the original DNN model.
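The decomposition step can be read as a truncated SVD with a small square matrix inserted between the two factors. The sketch below follows that reading; the rank `k`, the identity initialization of the speaker-specific matrix, and the matrix orientation are illustrative assumptions.

```python
# Sketch of decomposition-based adaptation: factor the original weight
# matrix, insert a small square matrix, and adapt only that matrix.
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(size=(512, 256))               # original DNN weight matrix
k = 32                                        # kept rank (assumed)

U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]                          # 512 x k  (new, smaller matrix)
B = Vt[:k]                                    # k x 256  (new, smaller matrix)
S = np.eye(k)                                 # speaker-specific square matrix

def adapted_layer(x, A, S, B):
    """Applies the adapted weight A @ S @ B (approximately W) to x."""
    return A @ S @ B @ x

x = rng.normal(size=256)
# With S = I the adapted layer reproduces the rank-k approximation of W.
# Per speaker, only S (k*k = 1024 values) would be updated, versus
# 512*256 values in the original matrix.
assert np.allclose(adapted_layer(x, A, S, B), (A @ B) @ x)
```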
Abstract:
Methods and apparatus are provided for determining synapses in an artificial nervous system based on connectivity patterns. One example method generally includes determining, for an artificial neuron, that an event has occurred; based on the event, determining one or more synapses with other artificial neurons based on a connectivity pattern associated with the artificial neuron; and applying a spike from the artificial neuron to the other artificial neurons based on the determined synapses. In this manner, the connectivity patterns (or the parameters for determining such patterns) for particular neuron types, rather than the connectivity itself, may be stored. Using the stored information, synapses may be computed on the fly, thereby reducing memory consumption and increasing memory bandwidth. This also saves time during artificial nervous system updates.
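A minimal sketch of computing synapses on the fly: a per-type connectivity rule is stored instead of an explicit synapse table, and the fan-out is recomputed whenever a neuron spikes. The stride-based rule, population size, fan-out of four, and fixed weight are illustrative assumptions.

```python
# Sketch of on-the-fly synapse computation from a stored per-type rule.
N = 100                                        # neurons in the population

def connectivity_pattern(neuron, neuron_type):
    """Derive target neurons from a stored per-type rule, not a table."""
    stride = {"excitatory": 7, "inhibitory": 13}[neuron_type]
    return [(neuron + k * stride) % N for k in range(1, 5)]

def on_spike(neuron, neuron_type, potentials, weight=0.1):
    targets = connectivity_pattern(neuron, neuron_type)  # computed on the fly
    for t in targets:                          # apply the spike
        potentials[t] += weight

potentials = [0.0] * N
on_spike(42, "excitatory", potentials)
print([i for i, v in enumerate(potentials) if v > 0])   # [49, 56, 63, 70]
```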