GRADIENT-BASED AUTO-TUNING FOR MACHINE LEARNING AND DEEP LEARNING MODELS

    Publication number: US20220027746A1

    Publication date: 2022-01-27

    Application number: US17499945

    Application date: 2021-10-13

    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy without requiring informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. Each tuple contains a distinct combination of values, each of which is contained in the value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct; all values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on the intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from the repeatedly narrowed value ranges of its hyperparameters. The configured algorithm is invoked to obtain a result.
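
    The narrowing step can be pictured with a minimal Python sketch. The rule of fitting one line through the first pair of (value, score) samples and another through the last pair, and the 50% shrink factor, are illustrative assumptions, not details taken from the patent.

        # One range-narrowing pass for a single hyperparameter. `values`
        # must be sorted ascending; `scores` are the per-tuple scores.
        def narrow_range(values, scores, shrink=0.5):
            # First line: through the first two sampled points.
            m1 = (scores[1] - scores[0]) / (values[1] - values[0])
            b1 = scores[0] - m1 * values[0]
            # Second line: through the last two sampled points.
            m2 = (scores[-1] - scores[-2]) / (values[-1] - values[-2])
            b2 = scores[-1] - m2 * values[-1]
            if m1 == m2:                     # parallel lines: keep range as-is
                return min(values), max(values)
            x = (b2 - b1) / (m1 - m2)        # intersection point of the lines
            half = (max(values) - min(values)) * shrink / 2
            return max(min(values), x - half), min(max(values), x + half)

        # Scores peak near 0.45, so the range contracts around that point.
        vals = [0.1, 0.3, 0.5, 0.7, 0.9]
        scs = [0.60, 0.82, 0.88, 0.75, 0.55]
        print(narrow_range(vals, scs))       # approximately (0.257, 0.657)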

    ASYMMETRIC ALLOCATION OF SRAM AND DATA LAYOUT FOR EFFICIENT MATRIX MULTIPLICATION

    Publication number: US20190095399A1

    Publication date: 2019-03-28

    Application number: US15716225

    Application date: 2017-09-26

    Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches, using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles of each of the two matrices as well as for a dot product matrix. The system then performs dot-product matrix multiplication involving the tiles of the left and right matrices, storing the resulting dot-product values in the corresponding allocated dot product matrix tiles. The system then writes the stored dot-product values from the scratchpad memory into main memory.
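
    A minimal Python sketch of the tiling scheme follows. The asymmetric tile shapes (TM, TK, TN) stand in for unequal scratchpad allocations to the left-matrix, right-matrix, and dot product tiles; the specific sizes are illustrative assumptions, not values from the patent.

        import numpy as np

        def tiled_matmul(A, B, TM=4, TK=8, TN=2):
            # TM x TK (left tile), TK x TN (right tile), and TM x TN (output
            # tile) model asymmetric allocations within a fixed SRAM budget.
            M, K = A.shape
            K2, N = B.shape
            assert K == K2
            C = np.zeros((M, N))                   # result in "main memory"
            for i in range(0, M, TM):              # rows of the left matrix
                for j in range(0, N, TN):          # columns of the right matrix
                    acc = np.zeros((min(TM, M - i), min(TN, N - j)))
                    for k in range(0, K, TK):
                        a = A[i:i+TM, k:k+TK]      # left tile in "scratchpad"
                        b = B[k:k+TK, j:j+TN]      # right tile in "scratchpad"
                        acc += a @ b               # dot products over the tiles
                    C[i:i+TM, j:j+TN] = acc        # write back to main memory
            return C

        A = np.random.rand(8, 16)
        B = np.random.rand(16, 6)
        assert np.allclose(tiled_matmul(A, B), A @ B)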

    Mini-machine learning
    Invention grant

    Publication number: US11790242B2

    Publication date: 2023-10-17

    Application number: US16166039

    Application date: 2018-10-19

    CPC classification number: G06N3/126 G06N20/00

    Abstract: Techniques are described for generating and applying mini-machine learning variants of machine learning algorithms to save computational resources in the tuning and selection of machine learning algorithms. In an embodiment, at least one of the hyper-parameter values of a reference variant is modified to a new hyper-parameter value, thereby generating a new variant of the machine learning algorithm from the reference variant. A performance score is determined for the new variant using a training dataset, the performance score representing the accuracy of the new machine learning model for that dataset. By training the new variant on the training dataset, a cost metric for the new variant is measured as the computing resources consumed by the training. Based on the cost metric of the new variant, and by comparing the performance scores of the new and reference variants, the system determines whether the modified variant is a mini-machine learning algorithm: one that is computationally less costly than the reference variant but closely tracks its accuracy.
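
    The accept/reject decision can be sketched as follows (Python, using scikit-learn for a concrete model). The choice of n_estimators as the modified hyper-parameter, wall-clock training time as the cost metric, and the 2% accuracy tolerance are all illustrative assumptions.

        import time
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=2000, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        def score_and_cost(n_estimators):
            # Train one variant; the cost here is wall-clock training time.
            model = RandomForestClassifier(n_estimators=n_estimators,
                                           random_state=0)
            start = time.perf_counter()
            model.fit(X_tr, y_tr)
            return model.score(X_te, y_te), time.perf_counter() - start

        ref_score, ref_cost = score_and_cost(200)    # reference variant
        mini_score, mini_cost = score_and_cost(20)   # candidate mini variant

        # Accept the mini variant if it is cheaper to train yet closely
        # tracks the reference accuracy (assumed tolerance: 2 points).
        is_mini = mini_cost < ref_cost and mini_score >= ref_score - 0.02
        print(f"reference: {ref_score:.3f} in {ref_cost:.2f}s")
        print(f"mini: {mini_score:.3f} in {mini_cost:.2f}s accepted={is_mini}")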

    ASYMMETRIC ALLOCATION OF SRAM AND DATA LAYOUT FOR EFFICIENT MATRIX-MATRIX MULTIPLICATION

    Publication number: US20210312014A1

    Publication date: 2021-10-07

    Application number: US17349817

    Application date: 2021-06-16

    Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches, using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles of each of the two matrices as well as for a dot product matrix. The system then performs dot-product matrix multiplication involving the tiles of the left and right matrices, storing the resulting dot-product values in the corresponding allocated dot product matrix tiles. The system then writes the stored dot-product values from the scratchpad memory into main memory.

    Scalable and efficient distributed auto-tuning of machine learning and deep learning models

    Publication number: US11120368B2

    Publication date: 2021-09-14

    Application number: US16137719

    Application date: 2018-09-21

    Abstract: Herein are techniques for automatic tuning of hyperparameters of machine learning algorithms. System throughput is maximized by horizontally scaling and asynchronously dispatching the configuration, training, and testing of an algorithm. In an embodiment, a computer stores a best cost achieved by executing a target model based on best values of the target algorithm's hyperparameters. The best values and their cost are updated by epochs that asynchronously execute. Each epoch has asynchronous costing tasks that explore a distinct hyperparameter. Each costing task has a sample of exploratory values that differs from the best values along the distinct hyperparameter. The asynchronous costing tasks of a same epoch have different values for the distinct hyperparameter, which accomplishes an exploration. In an embodiment, an excessive update of best values or best cost creates a major epoch for exploration in a subspace that is more or less unrelated to other epochs, thereby avoiding local optima.
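
    One asynchronous epoch can be sketched as below (Python). The toy cost function, the candidate value grids, and the use of a thread pool are illustrative assumptions standing in for distributed configuration, training, and testing of the target model.

        from concurrent.futures import ThreadPoolExecutor, as_completed

        def cost(config):
            # Stand-in for configuring, training, and testing the model.
            return (config["lr"] - 0.01) ** 2 + (config["depth"] - 6) ** 2

        best = {"lr": 0.1, "depth": 2}
        best_cost = cost(best)
        candidates = {"lr": [0.001, 0.01, 0.05], "depth": [4, 6, 8]}

        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = {}
            for hp, values in candidates.items():  # one epoch per hyperparameter
                for v in values:                   # asynchronous costing tasks
                    trial = dict(best, **{hp: v})  # differs only along hp
                    futures[pool.submit(cost, trial)] = trial
            for fut in as_completed(futures):      # update best as results arrive
                c = fut.result()
                if c < best_cost:
                    best_cost, best = c, futures[fut]

        print(best, best_cost)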

    Gradient-based auto-tuning for machine learning and deep learning models

    Publication number: US11176487B2

    Publication date: 2021-11-16

    Application number: US15885515

    Application date: 2018-01-31

    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy without requiring informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. Each tuple contains a distinct combination of values, each of which is contained in the value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct; all values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on the intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from the repeatedly narrowed value ranges of its hyperparameters. The configured algorithm is invoked to obtain a result.

    Asymmetric allocation of SRAM and data layout for efficient matrix multiplication

    Publication number: US11138291B2

    Publication date: 2021-10-05

    Application number: US15716225

    Application date: 2017-09-26

    Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches, using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles of each of the two matrices as well as for a dot product matrix. The system then performs dot-product matrix multiplication involving the tiles of the left and right matrices, storing the resulting dot-product values in the corresponding allocated dot product matrix tiles. The system then writes the stored dot-product values from the scratchpad memory into main memory.

    ADAPTIVE SAMPLING FOR IMBALANCE MITIGATION AND DATASET SIZE REDUCTION IN MACHINE LEARNING

    Publication number: US20200342265A1

    Publication date: 2020-10-29

    Application number: US16718164

    Application date: 2019-12-17

    Abstract: According to an embodiment, a method includes generating a first dataset sample from a dataset, calculating a first validation score for the first dataset sample and a machine learning model, and determining whether the difference between the first validation score and a second validation score satisfies a first criterion. If the difference in validation score does not satisfy the first criterion, the method includes generating a second dataset sample from the dataset. If the difference in validation score does satisfy the first criterion, the method includes updating a convergence value and determining whether the updated convergence value satisfies a second criterion. If the updated convergence value satisfies the second criterion, the method includes returning the first dataset sample. If the updated convergence value does not satisfy the second criterion, the method includes generating the second dataset sample from the dataset.
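
    The sampling loop can be sketched as follows (Python). The growth factor, score-difference threshold, patience count, and the toy validation score are illustrative assumptions; in practice the score would come from training and validating the machine learning model on the sample.

        import random

        def validation_score(sample):
            # Stand-in for train-then-validate; a toy score (sample mean).
            return sum(sample) / len(sample)

        def adaptive_sample(dataset, start=100, grow=2.0, eps=0.01, patience=3):
            size, prev_score, converged = start, None, 0
            while True:
                sample = random.sample(dataset, min(size, len(dataset)))
                score = validation_score(sample)
                if prev_score is not None and abs(score - prev_score) < eps:
                    converged += 1               # first criterion satisfied
                    if converged >= patience:    # second criterion satisfied
                        return sample
                else:
                    converged = 0                # reset; draw a larger sample
                prev_score, size = score, int(size * grow)

        data = [random.gauss(0.0, 1.0) for _ in range(100_000)]
        print(len(adaptive_sample(data)))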

    USING HYPERPARAMETER PREDICTORS TO IMPROVE ACCURACY OF AUTOMATIC MACHINE LEARNING MODEL SELECTION

    Publication number: US20200334569A1

    Publication date: 2020-10-22

    Application number: US16388830

    Application date: 2019-04-18

    Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by using hyperparameter predictors. In an embodiment, for each mini-machine learning model (MML model) of a plurality of MML models, a respective hyperparameter predictor set that predicts a respective set of hyperparameter settings for a first data set is trained. Each MML model represents a respective reference machine learning model (RML model) of a plurality of RML models. A first plurality of data set samples is generated from the first data set. A first plurality of first meta-feature sets is generated, each first meta-feature set describing a respective first data set sample of said first plurality. A respective target set of hyperparameter settings is generated for each MML model using a hypertuning algorithm. The first plurality of first meta-feature sets and the respective target set of hyperparameter settings are used to train the respective hyperparameter predictor set. Each hyperparameter predictor set is used during training and inference to improve the accuracy of automatically selecting an RML model per data set.
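
    A minimal sketch of one hyperparameter predictor follows (Python, scikit-learn). The meta-features, the regressor choice, and the mocked hypertuning targets are illustrative assumptions; in the described approach the target settings would come from running a hypertuning algorithm on each data set sample.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def meta_features(sample):
            # Describe a data set sample with simple summary statistics.
            return [sample.shape[0], sample.shape[1],
                    sample.mean(), sample.std()]

        rng = np.random.default_rng(0)
        samples = [rng.normal(size=(int(rng.integers(50, 500)), 10))
                   for _ in range(30)]

        X_meta = np.array([meta_features(s) for s in samples])
        # Target hyperparameter settings would come from hypertuning each
        # sample; mocked here as a simple function of the sample size.
        y_hp = np.array([min(1.0, 100.0 / s.shape[0]) for s in samples])

        predictor = RandomForestRegressor(random_state=0).fit(X_meta, y_hp)
        new_sample = rng.normal(size=(200, 10))
        print("predicted setting:",
              predictor.predict([meta_features(new_sample)])[0])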
