SHUFFLING-TYPE GRADIENT METHOD FOR TRAINING MACHINE LEARNING MODELS WITH BIG DATA
Abstract:
A computer-implemented shuffling-type gradient method for training a machine learning model using stochastic gradient descent (SGD) includes uniformly randomly distributing data samples or coordinate updates of the training data, and calculating learning rates for a no-shuffling scheme and a shuffling scheme. A combined operation of the no-shuffling scheme and the shuffling scheme is performed on the training data using an SGD algorithm. Based on one or more predetermined criteria, the combined operation is switched from the no-shuffling scheme to performing only the shuffling scheme. The machine learning model is trained with the training data based on the combined no-shuffling scheme and shuffling scheme.
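The switch between the two schemes can be illustrated with a minimal Python sketch. It is not the claimed implementation, and it rests on several assumptions that the abstract does not state: "no-shuffling" is taken to mean drawing sample indices uniformly at random with replacement, "shuffling" is taken to mean visiting a fresh random permutation of the data each epoch, the predetermined criterion is a fixed epoch count (`switch_epoch`), and the two learning rates are hand-picked constants rather than calculated values. The names `sgd_step`, `train`, `switch_epoch`, `lr_no_shuffle`, and `lr_shuffle` are hypothetical.

```python
import random


def sgd_step(w, x, y, lr):
    # One SGD step on the squared error 0.5 * (w * x - y) ** 2 of a scalar model.
    grad = (w * x - y) * x
    return w - lr * grad


def train(data, epochs, switch_epoch, lr_no_shuffle, lr_shuffle, seed=0):
    rng = random.Random(seed)
    n = len(data)
    w = 0.0
    schemes = []  # records which scheme was used in each epoch
    for epoch in range(epochs):
        if epoch < switch_epoch:
            # Assumed "no-shuffling" scheme: n indices drawn uniformly
            # at random with replacement.
            order = [rng.randrange(n) for _ in range(n)]
            lr = lr_no_shuffle
            schemes.append("no-shuffling")
        else:
            # Assumed "shuffling" scheme: every sample is visited exactly
            # once per epoch, in a freshly permuted order.
            order = list(range(n))
            rng.shuffle(order)
            lr = lr_shuffle
            schemes.append("shuffling")
        for i in order:
            x, y = data[i]
            w = sgd_step(w, x, y, lr)
    return w, schemes


# Noiseless toy data generated from y = 3 * x (an assumption for illustration).
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 11)]
w, schemes = train(data, epochs=200, switch_epoch=50,
                   lr_no_shuffle=0.1, lr_shuffle=0.05)
print(w, schemes[0], schemes[-1])
```

On this toy problem the learned weight approaches 3, and the recorded scheme changes from "no-shuffling" to "shuffling" at epoch 50. A fixed epoch count is only one possible switching criterion; a threshold on the loss or gradient norm would fit the same structure.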