-
Publication number: US20220012584A1
Publication date: 2022-01-13
Application number: US16925178
Filing date: 2020-07-09
Applicant: International Business Machines Corporation
Inventor: Wei Zhang , Xiaodong Cui , Abdullah Kayi , Alper Buyuktosunoglu
Abstract: Embodiments of a method are disclosed. The method includes performing decentralized distributed deep learning training on a batch of training data. Additionally, the method includes determining a training time during which a learner performs the decentralized distributed deep learning training on the batch of training data. Further, the method includes generating a table having the training time and other processing times for corresponding other learners performing the decentralized distributed deep learning training on corresponding other batches of other training data. The method also includes determining that the learner is a straggler based on the table and a threshold for the training time. Additionally, in response to determining that the learner is the straggler, the method includes modifying a processing aspect of the straggler to reduce its future training time for performing the decentralized distributed deep learning training on a new batch of training data.
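The table-and-threshold straggler test described in this abstract can be sketched as follows. The shared timing table, the median-based threshold, and all names are illustrative assumptions, not details taken from the patent.

```python
# Hypothetical sketch: each learner records its batch training time in a
# shared table; a learner whose time exceeds a threshold (here, a multiple
# of the median time) is flagged as a straggler.
from statistics import median

def find_stragglers(timing_table, tolerance=1.5):
    """timing_table: {learner_id: training_time_seconds}."""
    threshold = tolerance * median(timing_table.values())
    return {lid for lid, t in timing_table.items() if t > threshold}

times = {"learner0": 1.9, "learner1": 2.1, "learner2": 2.0, "learner3": 4.2}
print(find_stragglers(times))  # learner3 exceeds 1.5x the median time
```

A relative threshold like this adapts as overall batch times drift, which matters in decentralized training where there is no single coordinator fixing an absolute time budget.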
-
Publication number: US11977986B2
Publication date: 2024-05-07
Application number: US16925161
Filing date: 2020-07-09
Applicant: International Business Machines Corporation
Inventor: Wei Zhang , Xiaodong Cui , Abdullah Kayi , Alper Buyuktosunoglu
CPC classification number: G06N3/098 , G06N3/045 , G06N3/08 , G06N5/043 , G06N20/00 , G06N20/20 , G05B2219/33151 , G06F18/214
Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on multiple batches of training data using corresponding learners. Additionally, the method includes determining training times during which the learners perform the distributed deep learning training on the batches of training data. In response to a centralized control identifying a straggler among the learners, the method also includes modifying a processing aspect of the straggler to reduce its future training time for performing the distributed deep learning training on a new batch of training data.
-
Publication number: US11886969B2
Publication date: 2024-01-30
Application number: US16925192
Filing date: 2020-07-09
Applicant: International Business Machines Corporation
Inventor: Wei Zhang , Xiaodong Cui , Abdullah Kayi , Alper Buyuktosunoglu
Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on a batch of training data. The method also includes determining training times representing an amount of time between a beginning batch time and an end batch time. Further, in response to a centralized parameter server determining that a learner is a communication straggler, the method includes modifying a communication aspect of that learner to reduce a future network communication time for the communication straggler to send a future result of the distributed deep learning training on a new batch of training data.
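The abstract above separates compute time from network time: a parameter server can derive each learner's communication time as the gap between end-of-batch compute and result arrival, then flag learners whose gap is too large. The following is an illustrative sketch under that reading; the fixed threshold and all names are assumptions, not from the patent.

```python
# Hypothetical sketch: communication time per learner is the interval
# between when its batch compute ended and when its result reached the
# centralized parameter server; learners above a threshold are flagged.
def communication_stragglers(batch_end, result_arrival, threshold):
    """All arguments keyed by learner id; times in seconds."""
    comm_time = {lid: result_arrival[lid] - batch_end[lid] for lid in batch_end}
    return {lid for lid, t in comm_time.items() if t > threshold}

end = {"l0": 10.0, "l1": 10.2}
arrive = {"l0": 10.3, "l1": 12.5}
print(communication_stragglers(end, arrive, threshold=1.0))  # l1 is slow to report
```

Distinguishing communication stragglers from compute stragglers matters because the remedies differ: a slow link calls for, e.g., a smaller payload or different routing rather than a smaller batch.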
-
Publication number: US20230186903A1
Publication date: 2023-06-15
Application number: US17549006
Filing date: 2021-12-13
Applicant: International Business Machines Corporation
Inventor: Xiaodong Cui , Brian E. D. Kingsbury , George Andrei Saon , David Haws , Zoltan Tueske
CPC classification number: G10L15/16 , G06N5/04 , G06N3/0454
Abstract: Mechanisms are provided for performing machine learning training of a computer model. A perturbation generator generates modified training data comprising perturbations injected into original training data, where the perturbations cause a data corruption of the original training data. The modified training data is input into a prediction network of the computer model and processed through the prediction network to generate a prediction output. Machine learning training of the prediction network is executed based on the prediction output and the original training data to generate a trained prediction network of a trained computer model. The trained computer model is deployed to an artificial intelligence computing system for performance of an inference operation.
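The key structure in the abstract above is that the network sees corrupted inputs but the loss is computed against the original, clean data. A minimal sketch of that loop, assuming a trivial scalar model, additive-noise corruption, and plain gradient descent (all illustrative stand-ins for the patent's perturbation generator and prediction network):

```python
# Hypothetical sketch: train on PERTURBED inputs, score against ORIGINALS.
import random

def perturb(x, scale=0.5):
    return x + random.uniform(-scale, scale)  # corrupt the original sample

def train(data, lr=0.02, epochs=200):
    w = 0.0  # toy prediction network: y_hat = w * corrupted_x
    for _ in range(epochs):
        for x in data:
            x_tilde = perturb(x)              # modified training data
            y_hat = w * x_tilde               # forward pass on corrupted input
            grad = 2 * (y_hat - x) * x_tilde  # loss measured against clean x
            w -= lr * grad
    return w

random.seed(0)
w = train([1.0, 2.0, 3.0])
print(w)  # approximately 1.0: learns to reconstruct the clean input
```

Training against the clean targets forces the network to become robust to the injected corruption, which is the denoising-style effect the mechanism relies on.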
-
Publication number: US11586919B2
Publication date: 2023-02-21
Application number: US16899681
Filing date: 2020-06-12
Applicant: International Business Machines Corporation
Inventor: Yada Zhu , Di Chen , Xiaodong Cui , Upendra Chitnis , Kumar Bhaskaran , Wei Zhang
Abstract: Task-based learning using a task-directed prediction network can be provided. Training data can be received. Contextual information associated with a task-based criterion can be received. A machine learning model can be trained using the training data. A loss function computed during training of the machine learning model integrates the task-based criterion, and minimizing the loss function during training iterations includes minimizing the task-based criterion.
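The core idea in the abstract above, folding a task-based criterion into the training loss so that minimizing the loss also minimizes the criterion, can be sketched in a few lines. The specific criterion (an asymmetric penalty on under-prediction) and the weighting are illustrative assumptions, not the patent's criterion.

```python
# Hypothetical sketch: combined loss = prediction loss + lambda * task criterion.
def task_loss(y_hat, y, lam=2.0):
    mse = (y_hat - y) ** 2         # standard prediction loss
    task = max(0.0, y - y_hat)     # assumed task criterion: under-prediction is costly
    return mse + lam * task

# symmetric errors get asymmetric losses, steering training toward the task
print(task_loss(0.8, 1.0))  # under-prediction: MSE plus task penalty
print(task_loss(1.2, 1.0))  # over-prediction: MSE only
```

Because the criterion enters the loss directly, gradient descent trades off prediction accuracy against task cost at every iteration rather than post-hoc.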
-
Publication number: US20220358594A1
Publication date: 2022-11-10
Application number: US17315764
Filing date: 2021-05-10
Applicant: International Business Machines Corporation
Inventor: Yada Zhu , Wei Zhang , Xiaodong Cui , Guangnan Ye
Abstract: A machine learning model can be trained to predict one or more financial indicators using earnings call transcripts augmented with counterfactual information. Using a faithful gradient-based method, prediction results with respect to particular counterfactual information can be explained. Based on the explanation, the counterfactual information determined to have the most impact on prediction results can be selected for updating the machine learning model.
-
Publication number: US20220327374A1
Publication date: 2022-10-13
Application number: US17226399
Filing date: 2021-04-09
Applicant: International Business Machines Corporation
Inventor: Abdullah Kayi , Wei Zhang , Xiaodong Cui , Alper Buyuktosunoglu
Abstract: Computer hardware and/or software that performs the following operations: (i) updating a machine learning model by synchronously applying, to the machine learning model, a first set of training results received from a set of trainers having respective training datasets; (ii) receiving, from one or more trainers of the set of trainers, a first set of metrics pertaining to at least some of the training results of the first set of training results; and (iii) based, at least in part, on the first set of metrics, determining to subsequently update the machine learning model via asynchronous application of subsequent training results received from respective trainers of the set of trainers.
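The decision in step (iii) above can be sketched as a simple rule: if the synchronous barrier makes trainers spend too large a fraction of each round blocked and waiting, switch future model updates to asynchronous application. The specific metric (average wait fraction) and threshold are assumptions for illustration.

```python
# Hypothetical sketch: decide sync -> async based on reported wait metrics.
def should_go_async(wait_times, round_time, max_wait_fraction=0.25):
    """wait_times: per-trainer seconds spent blocked at the sync barrier."""
    avg_wait = sum(wait_times) / len(wait_times)
    return avg_wait / round_time > max_wait_fraction

print(should_go_async([0.1, 0.2, 2.5], round_time=3.0))  # True: heavy waiting
print(should_go_async([0.1, 0.1, 0.2], round_time=3.0))  # False: barrier is cheap
```

Starting synchronous and switching to asynchronous only when the metrics justify it preserves the stable early-training dynamics of synchronous updates while avoiding its barrier cost later.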
-
Publication number: US11366874B2
Publication date: 2022-06-21
Application number: US16198945
Filing date: 2018-11-23
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Dennis Newns , Paul Solomon , Xiaodong Cui , Jin Ping Han , Xin Zhang
Abstract: Embodiments for implementing a softmax function in an analog circuit. The analog circuit may comprise a plurality of input nodes to accept voltage inputs; a plurality of diodes connected to each of the plurality of input nodes to perform a current adding function; a log amplifier coupled to the plurality of diodes; a plurality of analog adders coupled to the voltage inputs and an output of the log amplifier; and a plurality of exponential amplifiers, each of the plurality of exponential amplifiers coupled to one of the plurality of analog adders.
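The circuit in the abstract above maps naturally onto the identity softmax(v)_i = exp(v_i − log Σ_j exp(v_j)): the diodes' exponential I-V characteristics sum the exp terms as currents, the log amplifier produces the log of that sum, the analog adders subtract it from each input voltage, and the exponential amplifiers produce the outputs. A numeric sketch of the same computation (illustrative only; this is the mathematical identity, not a circuit simulation):

```python
# Softmax via the log-sum-exp identity realized by the analog stages.
import math

def softmax_via_log(v):
    log_sum = math.log(sum(math.exp(x) for x in v))  # diode sum + log amplifier
    return [math.exp(x - log_sum) for x in v]        # adders + exponential amplifiers

out = softmax_via_log([1.0, 2.0, 3.0])
print(out)
print(sum(out))  # outputs form a probability distribution (sums to 1)
```

Factoring the computation this way is what lets the circuit share one log stage across all outputs instead of building a per-output divider.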
-
Publication number: US20220012629A1
Publication date: 2022-01-13
Application number: US16925161
Filing date: 2020-07-09
Applicant: International Business Machines Corporation
Inventor: Wei Zhang , Xiaodong Cui , Abdullah Kayi , Alper Buyuktosunoglu
Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on multiple batches of training data using corresponding learners. Additionally, the method includes determining training times during which the learners perform the distributed deep learning training on the batches of training data. In response to a centralized control identifying a straggler among the learners, the method also includes modifying a processing aspect of the straggler to reduce its future training time for performing the distributed deep learning training on a new batch of training data.
-
Publication number: US10204620B2
Publication date: 2019-02-12
Application number: US15258799
Filing date: 2016-09-07
Applicant: International Business Machines Corporation
Inventor: Xiaodong Cui , Vaibhava Goel
Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
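The transformation step in the abstract above, mapping labeled speech features through a linear transform before using them to adjust the DNN acoustic model, can be sketched as a per-frame affine map x' = Wx + b. The transform values below are illustrative placeholders, not MLLR estimates.

```python
# Hypothetical sketch: apply a linear transform (as MLLR would estimate for
# the speaker-dependent model) to each labeled speech feature frame.
def transform_features(frames, W, b):
    """Apply x' = W @ x + b to each feature frame (plain-Python matvec)."""
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]
    return [[wx + bi for wx, bi in zip(matvec(W, x), b)] for x in frames]

W = [[1.0, 0.1], [0.0, 0.9]]   # illustrative 2x2 transform matrix
b = [0.05, -0.05]              # illustrative bias vector
print(transform_features([[1.0, 2.0]], W, b))
```

The transformed features then stand in for speaker-adapted data when fine-tuning the DNN acoustic model, which is what lets a speaker-independent network benefit from an MLLR-style speaker transform.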