ADAPTIVE SAMPLING FOR IMBALANCE MITIGATION AND DATASET SIZE REDUCTION IN MACHINE LEARNING

    公开(公告)号:US20200342265A1

    公开(公告)日:2020-10-29

    申请号:US16718164

    申请日:2019-12-17

    Abstract: According to an embodiment, a method includes generating a first dataset sample from a dataset, calculating a first validation score for the first dataset sample and a machine learning model, and determining whether a difference in validation score between the first validation score and a second validation score satisfies a first criteria. If the difference in validation score does not satisfy the first criteria, the method includes generating a second dataset sample from the dataset. If the difference in validation score does satisfy the first criteria, the method includes updating a convergence value and determining whether the updated convergence value satisfies a second criteria. If the updated convergence value satisfies the second criteria, the method includes returning the first dataset sample. If the updated convergence value does not satisfy the second criteria, the method includes generating the second dataset sample from the dataset.

    USING HYPERPARAMETER PREDICTORS TO IMPROVE ACCURACY OF AUTOMATIC MACHINE LEARNING MODEL SELECTION

    公开(公告)号:US20200334569A1

    公开(公告)日:2020-10-22

    申请号:US16388830

    申请日:2019-04-18

    Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by using hyperparameter predictors. In an embodiment, for each mini-machine learning model (MML model) of a plurality of MML models, a respective hyperparameter predictor set that predicts a respective set of hyperparameter settings for a first data set is trained. Each MML model represents a respective reference machine learning model (RML model) of a plurality of RML models. A first plurality of data set samples is generated from the first data set. A first plurality of first meta-feature sets is generated, each first meta-feature set describing a respective first data set sample of said first plurality. A respective target set of hyperparameter settings are generated for said each MML model using a hypertuning algorithm. The first plurality of first meta-feature sets and the respective target set of hyperparameter settings are used to train the respective hyperparameter predictor set. Each hyperparameter predictor set is used during training and inference to improve the accuracy of automatically selecting a RML model per data set.

    Method for training multichannel data receiver timing

    公开(公告)号:US10983944B2

    公开(公告)日:2021-04-20

    申请号:US16251066

    申请日:2019-01-17

    Abstract: An apparatus includes a first device having a clock signal and configured to communicate, via a data bus, with a second device configured to assert a data strobe signal and a plurality of data bit signals on the data bus. The first device may include a control circuit configured, during a training phase, to determine relative timing between the clock signal, the plurality of data bit signals, and the data strobe signal. The first device may determine, using a first set of sampling operations, a first timing relationship of the plurality of data bit signals relative to the data strobe signal, and determine, using a second set of sampling operations, a second timing relationship of the plurality of data bit signals and the data strobe signal relative to the clock signal. During an operational phase, the control circuit may be configured to use delays based on the first and second timing relationships to sample data from the second device on the data bus.

    ADAPTIVE SAMPLING TO COMPUTE GLOBAL FEATURE EXPLANATIONS WITH SHAPLEY VALUES

    公开(公告)号:US20240086763A1

    公开(公告)日:2024-03-14

    申请号:US17944949

    申请日:2022-09-14

    CPC classification number: G06N20/00 G06N5/042

    Abstract: Techniques for computing global feature explanations using adaptive sampling are provided. In one technique, first and second samples from an dataset are identified. A first set of feature importance values (FIVs) is generated based on the first sample and a machine-learned model. A second set of FIVs is generated based on the second sample and the model. If a result of a comparison between the first and second FIV sets does not satisfy criteria, then: (i) an aggregated set is generated based on the last two FIV sets; (ii) a new sample that is double the size of a previous sample is identified from the dataset; (iii) a current FIV set is generated based on the new sample and the model; (iv) determine whether a result of a comparison between the current and aggregated FIV sets satisfies criteria; repeating (i)-(iv) until the result of the last comparison satisfies the criteria.

    METHOD FOR TRAINING MULTICHANNEL DATA RECEIVER TIMING

    公开(公告)号:US20210224221A1

    公开(公告)日:2021-07-22

    申请号:US17221580

    申请日:2021-04-02

    Abstract: An apparatus includes a first device having a clock signal and configured to communicate, via a data bus, with a second device configured to assert a data strobe signal and a plurality of data bit signals on the data bus. The first device may include a control circuit configured, during a training phase, to determine relative timing between the clock signal, the plurality of data bit signals, and the data strobe signal. The first device may determine, using a first set of sampling operations, a first timing relationship of the plurality of data bit signals relative to the data strobe signal, and determine, using a second set of sampling operations, a second timing relationship of the plurality of data bit signals and the data strobe signal relative to the clock signal. During an operational phase, the control circuit may be configured to use delays based on the first and second timing relationships to sample data from the second device on the data bus.

Patent Agency Ranking