DETERMINING HYPERPARAMETERS USING SEQUENCE GENERATION NEURAL NETWORKS

    Publication number: US20230401451A1

    Publication date: 2023-12-14

    Application number: US18199886

    Application date: 2023-05-19

    Applicant: Google LLC

    CPC classification number: G06N3/0985 G06N3/0455

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving metadata for the training; generating a metadata sequence that represents the metadata; and, at each of a plurality of iterations, generating one or more trials that each specify a respective value for each of a set of hyperparameters. Generating the trials comprises, for each trial: generating an input sequence for the iteration that comprises (i) the metadata sequence and (ii) for any earlier trials, a respective sequence that represents the respective values for the hyperparameters specified by the earlier trial and a measure of performance for the trial; and processing an input sequence for the trial that comprises the input sequence for the iteration using a sequence generation neural network to generate an output sequence that represents respective values for the hyperparameters.
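The iterative procedure in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the token format, the helper functions (`encode_metadata`, `encode_trial`, `build_input_sequence`, `decode_output`, `tune`), and above all `stub_sequence_model`, which stands in for the sequence generation neural network by sampling uniformly from a search space instead of conditioning on its input. The sketch also proposes only one trial per iteration, whereas the abstract allows one or more.

```python
import random
from typing import Callable, Dict, List, Tuple

# One finished trial: (hyperparameter values, measured performance).
Trial = Tuple[Dict[str, float], float]


def encode_metadata(metadata: Dict[str, str]) -> List[str]:
    """Serialize the metadata into a flat token sequence (hypothetical format)."""
    tokens = ["<meta>"]
    for key in sorted(metadata):
        tokens += [key, "=", str(metadata[key])]
    tokens.append("</meta>")
    return tokens


def encode_trial(values: Dict[str, float], score: float) -> List[str]:
    """Serialize one earlier trial: its hyperparameter values and its score."""
    tokens = ["<trial>"]
    for name in sorted(values):
        tokens += [name, "=", f"{values[name]:.4f}"]
    tokens += ["score", "=", f"{score:.4f}", "</trial>"]
    return tokens


def build_input_sequence(meta_seq: List[str], history: List[Trial]) -> List[str]:
    """Concatenate (i) the metadata sequence and (ii) all earlier trials."""
    seq = list(meta_seq)
    for values, score in history:
        seq += encode_trial(values, score)
    return seq


def stub_sequence_model(
    input_seq: List[str],
    search_space: Dict[str, Tuple[float, float]],
    rng: random.Random,
) -> List[str]:
    """Placeholder for the neural network (assumption): ignores input_seq and
    samples each hyperparameter uniformly. A trained model would condition on it."""
    out: List[str] = []
    for name in sorted(search_space):
        low, high = search_space[name]
        out += [name, "=", f"{rng.uniform(low, high):.4f}"]
    return out


def decode_output(output_seq: List[str]) -> Dict[str, float]:
    """Parse 'name = value' token triples back into hyperparameter values."""
    return {
        output_seq[i]: float(output_seq[i + 2])
        for i in range(0, len(output_seq), 3)
    }


def tune(
    metadata: Dict[str, str],
    search_space: Dict[str, Tuple[float, float]],
    objective: Callable[[Dict[str, float]], float],
    num_iterations: int,
    seed: int = 0,
) -> List[Trial]:
    """Run the propose-evaluate loop, one trial per iteration."""
    rng = random.Random(seed)
    meta_seq = encode_metadata(metadata)
    history: List[Trial] = []
    for _ in range(num_iterations):
        input_seq = build_input_sequence(meta_seq, history)
        output_seq = stub_sequence_model(input_seq, search_space, rng)
        values = decode_output(output_seq)
        history.append((values, objective(values)))
    return history
```

As a usage example, `tune({"task": "demo"}, {"lr": (0.001, 0.1)}, lambda v: -abs(v["lr"] - 0.01), 3)` returns three `(values, score)` pairs. The input sequence grows by one encoded trial per iteration, which is how each new proposal would come to depend on all earlier results once the stub is replaced by a real model.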
