Neural architecture search
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of inputs by the large neural network; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the subset of components specified by the output sequences active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters of the controller neural network.
Public/Granted literature
Information query
Patent Agency Ranking
0/0