DETERMINISTIC TRAINING OF MACHINE LEARNING MODELS

    Publication No.: US20230351190A1

    Publication date: 2023-11-02

    Application No.: US18219555

    Application date: 2023-07-07

    Applicant: Google LLC

    CPC classification number: G06N3/084

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model using a deterministic data pipeline. One of the methods may include receiving a first request to generate a deterministic training dataset: transforming raw training examples obtained from the raw data source into pre-processed training examples; assigning a unique index to each pre-processed training example; and caching the pre-processed training examples into the cache directory specified in the received first request; receiving a second request to use the deterministic training dataset to train a machine learning model, the second request specifying a start index; and in response to receiving the second request: reading, from the cache directory, the pre-processed training examples that have indices beginning from the start index; and providing the read training examples in an order of the assigned indices for use in training the machine learning model.

    Deterministic training of machine learning models

    Publication No.: US12014276B2

    Publication date: 2024-06-18

    Application No.: US18219555

    Application date: 2023-07-07

    Applicant: Google LLC

    CPC classification number: G06N3/084 G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model using a deterministic data pipeline. One of the methods may include receiving a first request to generate a deterministic training dataset: transforming raw training examples obtained from the raw data source into pre-processed training examples; assigning a unique index to each pre-processed training example; and caching the pre-processed training examples into the cache directory specified in the received first request; receiving a second request to use the deterministic training dataset to train a machine learning model, the second request specifying a start index; and in response to receiving the second request: reading, from the cache directory, the pre-processed training examples that have indices beginning from the start index; and providing the read training examples in an order of the assigned indices for use in training the machine learning model.

    DETERMINISTIC TRAINING OF MACHINE LEARNING MODELS

    Publication No.: US20230316082A1

    Publication date: 2023-10-05

    Application No.: US18130339

    Application date: 2023-04-03

    Applicant: Google LLC

    CPC classification number: G06N3/084

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model using a deterministic data pipeline. One of the methods may include receiving a first request to generate a deterministic training dataset: transforming raw training examples obtained from the raw data source into pre-processed training examples; assigning a unique index to each pre-processed training example; and caching the pre-processed training examples into the cache directory specified in the received first request; receiving a second request to use the deterministic training dataset to train a machine learning model, the second request specifying a start index; and in response to receiving the second request: reading, from the cache directory, the pre-processed training examples that have indices beginning from the start index; and providing the read training examples in an order of the assigned indices for use in training the machine learning model.
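The two-request flow described in the abstracts can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `build_deterministic_dataset` and `read_from_index`, the one-JSON-file-per-example cache layout, and the zero-padded file names are all assumptions made for illustration only.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, Iterator, Tuple


def build_deterministic_dataset(
    raw_examples: Iterable[Any],
    transform: Callable[[Any], Any],
    cache_dir: str,
) -> int:
    """First request (hypothetical helper): pre-process, index, and cache.

    Each pre-processed example gets a unique index (its position in the
    raw stream) and is written to its own file in cache_dir.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    count = 0
    for index, raw in enumerate(raw_examples):
        example = transform(raw)
        # Zero-padded names make lexicographic order equal index order.
        (cache / f"{index:012d}.json").write_text(json.dumps(example))
        count += 1
    return count


def read_from_index(cache_dir: str, start_index: int) -> Iterator[Tuple[int, Any]]:
    """Second request (hypothetical helper): replay cached examples.

    Yields (index, example) pairs in ascending index order, beginning
    at start_index, so a restarted run sees the same sequence.
    """
    for path in sorted(Path(cache_dir).glob("*.json")):
        index = int(path.stem)
        if index >= start_index:
            yield index, json.loads(path.read_text())


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        build_deterministic_dataset(["a", "b", "c", "d"], str.upper, tmp)
        # Resume from index 2, as a training job restarted mid-epoch might.
        print(list(read_from_index(tmp, 2)))  # [(2, 'C'), (3, 'D')]
```

Because the order is fixed when the cache is written, two reads with the same start index return identical sequences. That is the property that lets an interrupted training job resume at a known position.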
