-
Publication No.: US10319069B2
Publication Date: 2019-06-11
Application No.: US15842615
Filing Date: 2017-12-14
Applicant: International Business Machines Corporation
Inventor: Shiyu Chang , Liana L. Fong , Wei Tan
Abstract: Techniques that facilitate matrix factorization associated with graphics processing units are provided. In one example, a computer-implemented method is provided. The computer-implemented method can comprise loading, by a graphics processing unit operatively coupled to a processor, item features from a data matrix into a shared memory. The data matrix can be a matrix based on one or more user features and item features. The computer-implemented method can further comprise tiling and aggregating, by the graphics processing unit, outer products of the data matrix tiles to generate an aggregate value and approximating, by the graphics processing unit, an update to a user feature of the data matrix based on the aggregate value and the loaded item features.
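As a rough illustration of the update described in the abstract above, the sketch below accumulates outer products of item feature vectors one tile at a time and then solves approximately for a single user's feature vector. It is plain Python on the CPU, not GPU code. The function names, the tile size, the ridge term `lam`, and the use of a few conjugate-gradient steps for the approximate solve are all assumptions made for illustration, not details taken from the patent.

```python
def aggregate_outer_products(item_feats, ratings, tile=2):
    """Sum theta*theta^T and r*theta over the rated items, one tile at a time."""
    f = len(item_feats[0])
    A = [[0.0] * f for _ in range(f)]
    b = [0.0] * f
    for start in range(0, len(item_feats), tile):
        tile_feats = item_feats[start:start + tile]
        tile_ratings = ratings[start:start + tile]
        for theta, r in zip(tile_feats, tile_ratings):
            for i in range(f):
                b[i] += r * theta[i]
                for j in range(f):
                    A[i][j] += theta[i] * theta[j]
    return A, b


def approx_user_update(A, b, lam=0.1, steps=10):
    """Approximately solve (A + lam*I) x = b with conjugate gradient (assumed solver)."""
    f = len(b)
    M = [[A[i][j] + (lam if i == j else 0.0) for j in range(f)] for i in range(f)]
    x = [0.0] * f
    r = b[:]          # residual for the starting guess x = 0
    p = r[:]          # search direction
    rs = sum(v * v for v in r)
    for _ in range(steps):
        if rs < 1e-12:  # residual is negligible, stop early
            break
        Mp = [sum(M[i][j] * p[j] for j in range(f)) for i in range(f)]
        alpha = rs / sum(p[i] * Mp[i] for i in range(f))
        x = [x[i] + alpha * p[i] for i in range(f)]
        r = [r[i] - alpha * Mp[i] for i in range(f)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs) * p[i] for i in range(f)]
        rs = rs_new
    return x
```

Capping `steps` below the feature dimension trades accuracy for speed, which is one plausible reading of "approximating" the update.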
-
Publication No.: US11854562B2
Publication Date: 2023-12-26
Application No.: US16411614
Filing Date: 2019-05-14
Applicant: International Business Machines Corporation
Inventor: Yang Zhang , Shiyu Chang
IPC: G10L21/003 , G10L21/013 , G10L19/00 , G06N20/20 , G06N3/08 , G06N3/045
CPC classification number: G10L21/013 , G06N3/045 , G06N3/08 , G06N20/20 , G10L19/00 , G10L2021/0135
Abstract: A method (and structure and computer product) to permit zero-shot voice conversion with non-parallel data includes receiving source speaker speech data as input data into a content encoder of a style transfer autoencoder system, the content encoder providing a source speaker disentanglement of the source speaker speech data by reducing speaker style information of the input source speech data while retaining content information, and receiving target speaker input speech as input data into a target speaker encoder. The outputs of the content encoder and the target speaker encoder are combined in a decoder of the style transfer autoencoder, and the output of the decoder provides the content information of the input source speech data in a style of the target speaker speech information.
-
Publication No.: US11551000B2
Publication Date: 2023-01-10
Application No.: US16658120
Filing Date: 2019-10-20
Inventor: Shiyu Chang , Mo Yu , Yang Zhang , Tommi S. Jaakkola
Abstract: A method and system of training a natural language processing network are provided. A corpus of data is received and one or more input features are selected therefrom by a generator network. The one or more selected input features from the generator network are received by a first predictor network and used to predict a first output label. A complement of the selected input features from the generator network is received by a second predictor network and used to predict a second output label.
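The split between the selected features and their complement can be pictured as a binary mask over input tokens. The sketch below is a toy illustration only: the generator network is replaced by a hypothetical keyword-based selector (`toy_generator`), and no predictor networks are trained. The function names and the mask representation are assumptions, not the patented implementation.

```python
def toy_generator(tokens, keywords):
    """Stand-in for the generator network: 1 marks a selected token, 0 an unselected one."""
    return [1 if t.lower() in keywords else 0 for t in tokens]


def split_by_mask(tokens, mask):
    """Return (selected tokens, complement tokens) for a binary mask."""
    selected = [t for t, m in zip(tokens, mask) if m == 1]
    complement = [t for t, m in zip(tokens, mask) if m == 0]
    return selected, complement


tokens = "the beer has a lovely golden color".split()
mask = toy_generator(tokens, keywords={"lovely", "golden"})
selected, complement = split_by_mask(tokens, mask)
# `selected` would be the input to the first predictor network,
# `complement` the input to the second predictor network.
```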
-
Publication No.: US11295762B2
Publication Date: 2022-04-05
Application No.: US16852617
Filing Date: 2020-04-20
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kaizhi Qian , Yang Zhang , Shiyu Chang , Chuang Gan , David Cox
Abstract: A method, a structure, and a computer system for decomposing speech. The exemplary embodiments may include one or more encoders for generating one or more encodings of a speech input comprising rhythm information, pitch information, timbre information, and content information, and a decoder for decoding the one or more encodings.
-
Publication No.: US20220058345A1
Publication Date: 2022-02-24
Application No.: US16997494
Filing Date: 2020-08-19
Applicant: International Business Machines Corporation
Inventor: Xiaoxiao Guo , Mo Yu , Yupeng Gao , Chuang Gan , Shiyu Chang , Murray Scott Campbell
IPC: G06F40/35 , G06N3/08 , G06F40/284 , G06F40/253 , G06F40/295
Abstract: A current observation expressed in natural language is received. Entities in the current observation are extracted. A relevant historical observation is retrieved, which has at least one of the entities in common with the current observation. The current observation and the relevant historical observation are combined as observations. The observations and a template list specifying a list of verb phrases to be filled in with at least some of the entities are input to a neural network, which can output the template list of the verb phrases filled in with said at least some of the entities. The neural network can include an attention mechanism. A reward associated with the neural network's output can be received and fed back to the neural network for retraining the neural network.
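The retrieval and template-filling steps in the abstract above can be sketched without any neural network. In the toy example below, entity extraction is a lookup against a known vocabulary, retrieval returns the most recent past observation that shares an entity, and every single-slot template is filled with every entity. All function names, the vocabulary, and the `{}` slot syntax are assumptions for illustration. The patented method instead feeds the observations and templates to a neural network that chooses the filled-in output and is retrained from a reward.

```python
def extract_entities(observation, known_entities):
    """Toy entity extractor: keep known entity names that appear as words in the text."""
    cleaned = observation.lower().replace(".", " ").replace(",", " ")
    words = set(cleaned.split())
    return {e for e in known_entities if e in words}


def retrieve_relevant(current_entities, history, known_entities):
    """Return the most recent past observation sharing at least one entity, or None."""
    for past in reversed(history):
        if extract_entities(past, known_entities) & current_entities:
            return past
    return None


def fill_templates(templates, entities):
    """Fill each single-slot verb-phrase template with each entity (sorted for stability)."""
    return [t.format(e) for t in templates for e in sorted(entities)]


known = {"key", "door", "lamp"}
history = ["You see a lamp on the table.", "There is a key under the mat."]
current = "The door is locked. You hold a key."

entities = extract_entities(current, known)
relevant = retrieve_relevant(entities, history, known)
candidates = fill_templates(["open {}", "take {}"], entities)
```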
-
Publication No.: US20210327460A1
Publication Date: 2021-10-21
Application No.: US16852617
Filing Date: 2020-04-20
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kaizhi Qian , Yang Zhang , Shiyu Chang , Chuang Gan , David Cox
IPC: G10L25/90
Abstract: A method, a structure, and a computer system for decomposing speech. The exemplary embodiments may include one or more encoders for generating one or more encodings of a speech input comprising rhythm information, pitch information, timbre information, and content information, and a decoder for decoding the one or more encodings.
-
Publication No.: US11080558B2
Publication Date: 2021-08-03
Application No.: US16360563
Filing Date: 2019-03-21
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Shiyu Chang , Emrah Akin Sisbot , Norma Edith Sosa , Wang Zhou
Abstract: Methods and systems perform incremental learning object detection in images and/or videos without catastrophic forgetting of previously-learned object classes. A two-stage neural network object detector is trained to locate and identify objects pertaining to an additional object class by iteratively updating the two-stage neural network object detector until an overall detection accuracy criterion is met. The updating is performed so as to balance minimizing a loss of an initial ability to locate and identify objects pertaining to the previously-learned object classes and maximizing an ability to additionally locate and identify the objects pertaining to the additional object class. Assessing whether the overall detection accuracy criterion is met compares outputs of an initial version of the two-stage neural network object detector with a current region proposal output by a current version of the two-stage neural network object detector to determine a region proposal distillation loss and a previously-learned-object identification distillation loss.