Consistent filtering of machine learning data

    Publication No.: US10540606B2

    Publication date: 2020-01-21

    Application No.: US14460314

    Application date: 2014-08-14

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.
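
As a rough illustration of the idea in this abstract, the sketch below reuses a stored seed so that the same chunks land in the training and test sets every time the split is recomputed. The function name `split_chunks`, the `metadata` dictionary, and the 80/20 ratio are assumptions made for the example and do not come from the patent.

```python
import random

def split_chunks(num_chunks, seed, train_fraction=0.8):
    """Deterministically assign chunk indices to a training and a test set."""
    # Assumption: the "parameter for a pseudo-random number source" is a seed.
    rng = random.Random(seed)
    order = list(range(num_chunks))
    rng.shuffle(order)  # same seed -> same permutation on every call
    cut = int(num_chunks * train_fraction)
    return sorted(order[:cut]), sorted(order[cut:])

# Hypothetical consistency metadata saved alongside the iteration.
metadata = {"seed": 42, "train_fraction": 0.8}

train_a, test_a = split_chunks(10, **metadata)
train_b, test_b = split_chunks(10, **metadata)  # recomputed later, same result
```

Because the split depends only on the stored metadata, no record can drift from the test set into the training set between training and evaluation.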

    Pattern-based detection using data injection

    Publication No.: US11327953B2

    Publication date: 2022-05-10

    Application No.: US16692100

    Application date: 2019-11-22

    Abstract: Pattern based detection of data usage is facilitated using data injection. Data values are injected in one or more storage locations accessible to a plurality of services or included in service requests. Service interactions among the services are compared to a set of patterns. The set of patterns are configured to match the data values. By comparing the service interactions to the patterns, one or more of the service interactions are determined to include individual ones of the data values. Data are generated indicating a presence of the data values in the services.
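
One way to picture the approach in this abstract is a marker ("canary") value that is injected into a request and later searched for in recorded service interactions. The sketch below is an assumption-laden illustration: `make_marker`, `detect`, the `CANARY-` prefix, and the service names are all hypothetical and are not taken from the patent.

```python
import re
import uuid

def make_marker():
    """Create a unique value to inject (hypothetical marker format)."""
    return f"CANARY-{uuid.uuid4().hex}"

def detect(interactions, markers):
    """Return (service, marker) pairs for interactions containing a marker."""
    if not markers:
        return []
    # One pattern configured to match every injected value.
    pattern = re.compile("|".join(re.escape(m) for m in markers))
    findings = []
    for service, payload in interactions:
        for match in pattern.findall(payload):
            findings.append((service, match))
    return findings

marker = make_marker()
# Hypothetical recorded interactions: (service name, payload text).
interactions = [
    ("frontend", "GET /cart user=alice"),
    ("billing", f"charge card note={marker}"),
]
findings = detect(interactions, [marker])
```

Here `findings` records that the injected value reached the `billing` service, which corresponds to the abstract's generated data indicating the presence of the values.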

    PATTERN-BASED DETECTION USING DATA INJECTION

    Publication No.: US20200089669A1

    Publication date: 2020-03-19

    Application No.: US16692100

    Application date: 2019-11-22

    Abstract: Pattern based detection of data usage is facilitated using data injection. Data values are injected in one or more storage locations accessible to a plurality of services or included in service requests. Service interactions among the services are compared to a set of patterns. The set of patterns are configured to match the data values. By comparing the service interactions to the patterns, one or more of the service interactions are determined to include individual ones of the data values. Data are generated indicating a presence of the data values in the services.

    Consistent filtering of machine learning data

    Publication No.: US11544623B2

    Publication date: 2023-01-03

    Application No.: US16591521

    Application date: 2019-10-02

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.

    Input processing for machine learning

    Publication No.: US11100420B2

    Publication date: 2021-08-24

    Application No.: US14460312

    Application date: 2014-08-14

    Abstract: A record extraction request for a data set is received at a machine learning service. A plan to perform one or more chunk-level operations (such as sampling, shuffling, splitting or partitioning for parallel computation) on chunks of the data set is generated. A set of data transfers that results in a particular chunk being stored in a particular server's memory is initiated to implement the first chunk-level operation of the sequence. A second operation such as another filtering operation or a feature processing operation is performed on a result set of the first chunk-level operation.
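
The sequence described in this abstract, chunk-level operations followed by a second operation on their result, might look roughly like the sketch below. It runs in a single process and leaves out the data transfers to server memory. The name `run_plan` and its parameters are assumptions for illustration only.

```python
import random

def run_plan(chunks, seed, sample_size, record_filter):
    """Shuffle and sample whole chunks, then filter individual records."""
    rng = random.Random(seed)
    order = list(range(len(chunks)))
    rng.shuffle(order)                # chunk-level shuffle
    selected = order[:sample_size]    # chunk-level sampling
    records = [r for i in selected for r in chunks[i]]
    # Second operation: a record-level filter on the chunk-level result.
    return [r for r in records if record_filter(r)]

# Hypothetical data set already divided into four chunks of two records.
chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
result = run_plan(chunks, seed=7, sample_size=2,
                  record_filter=lambda r: r % 2 == 0)
```

Working on whole chunks first keeps the expensive steps coarse-grained, and the per-record work then touches only the chunks that were selected.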

    CONSISTENT FILTERING OF MACHINE LEARNING DATA

    Publication No.: US20230126005A1

    Publication date: 2023-04-27

    Application No.: US18146075

    Application date: 2022-12-23

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.

    CONSISTENT FILTERING OF MACHINE LEARNING DATA

    Publication No.: US20200034742A1

    Publication date: 2020-01-30

    Application No.: US16591521

    Application date: 2019-10-02

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.
