-
1.
Publication No.: US11972754B2
Publication date: 2024-04-30
Application No.: US17559617
Filing date: 2021-12-22
Applicant: TENCENT AMERICA LLC
Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
CPC classification: G10L15/063, G10L15/10, G10L25/03, G10L25/54
Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
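The two training objectives named in this abstract can be illustrated on toy data. The following is a minimal sketch, not the patented method: it assumes the shared encoder output has already been turned into per-frame label probabilities (for the CTC branch) and per-step decoder probabilities (for the attention branch). The function names and the toy inputs are hypothetical.

```python
import math

def ctc_neg_log_likelihood(frame_probs, target, blank=0):
    """Standard CTC forward algorithm on per-frame label probabilities.

    frame_probs[t][k] is the probability of label k at frame t (assumed input).
    """
    # Interleave blanks around the target labels: [b, y1, b, y2, b, ...]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = frame_probs[0][ext[0]]
    if size > 1:
        alpha[1] = frame_probs[0][ext[1]]
    for t in range(1, len(frame_probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * frame_probs[t][ext[s]]
        alpha = new_alpha
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)

def attention_neg_log_likelihood(step_probs, target):
    """Cross-entropy of the target under per-step decoder distributions."""
    return -sum(math.log(step_probs[u][label]) for u, label in enumerate(target))

# Toy example: 2 frames, labels {0: blank, 1: "a"}, target "a".
frame_probs = [[0.5, 0.5], [0.5, 0.5]]
step_probs = [[0.5, 0.5]]
ctc_loss = ctc_neg_log_likelihood(frame_probs, [1])
att_loss = attention_neg_log_likelihood(step_probs, [1])
```

Both losses are computed from the same (assumed) encoder output but are returned separately, mirroring the abstract's statement that the two trainings are performed independently; how the losses are scheduled or combined is not specified in the abstract.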
-
2.
Publication No.: US11257481B2
Publication date: 2022-02-22
Application No.: US16169512
Filing date: 2018-10-24
Applicant: TENCENT AMERICA LLC
Inventors: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
Abstract: Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
-
Publication No.: US11803618B2
Publication date: 2023-10-31
Application No.: US17989536
Filing date: 2022-11-17
Applicant: TENCENT AMERICA LLC
Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
IPC classification: G06F18/20, G06V10/70, G10L15/06, G06N3/044, G06N3/045, G06N3/08, G06F18/2415, G06N20/00, G06F40/47
CPC classification: G06F18/24155, G06F18/29, G06F40/47, G06N20/00, G06V10/768, G10L15/063
Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation of the MBR training.
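As a rough illustration of the quantities this abstract refers to, the sketch below computes an expected-risk (MBR) objective over an N-best list. It rests on two assumptions that the abstract does not confirm: that softmax smoothing means scaling logits by a factor `beta` below 1 before normalization, which flattens the distribution used while generating hypotheses, and that risk is measured by token-level edit distance. All names are hypothetical.

```python
import math

def smoothed_softmax(logits, beta=0.8):
    """Softmax over beta-scaled logits; beta < 1 flattens the distribution (assumed form)."""
    scaled = [beta * z for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def edit_distance(hyp, ref):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (h != r)))  # substitution or match
        prev = cur
    return prev[-1]

def expected_risk(nbest, log_scores, ref):
    """Expected edit distance under posteriors renormalized over the N-best list."""
    posteriors = smoothed_softmax(log_scores, beta=1.0)  # plain softmax here
    return sum(p * edit_distance(h, ref) for p, h in zip(posteriors, nbest))

# Toy N-best list with two equally scored hypotheses.
nbest = [["a", "b"], ["a", "c"]]
risk = expected_risk(nbest, [0.0, 0.0], ["a", "b"])
```

In this sketch the smoothing would be applied to the per-token distributions during beam search, so that the N-best list is more diverse, while the risk itself uses the unsmoothed hypothesis scores; that split is a design assumption, not a detail taken from the patent.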
-
Publication No.: US10672382B2
Publication date: 2020-06-02
Application No.: US16160352
Filing date: 2018-10-15
Applicant: TENCENT AMERICA LLC
Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
IPC classification: G10L15/06, G10L15/14, G10L15/183, G10L15/22
Abstract: Methods and apparatuses are provided for end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames; generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames; computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state; performing, by the at least one processor, a decoding operation based on previous embedded label prediction information and previous attentional hidden state information generated based on the attention weights; and generating current embedded label prediction information based on a result of the decoding operation and the attention weights.
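The attention-weight step in this abstract can be sketched as follows. This is a minimal illustration that assumes dot-product scoring between each encoder hidden state and the current decoder hidden state; the abstract does not state which scoring function the patent uses, and the function names are hypothetical.

```python
import math

def attention_weights(encoder_states, decoder_state):
    """Softmax over dot-product scores (the scoring function is an assumption)."""
    scores = [sum(h * d for h, d in zip(state, decoder_state))
              for state in encoder_states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(encoder_states, weights):
    """Attention-weighted sum of the encoder hidden states."""
    dim = len(encoder_states[0])
    return [sum(w * state[i] for w, state in zip(weights, encoder_states))
            for i in range(dim)]

# Toy example: two 2-dimensional encoder states and a zero decoder state,
# so both states receive equal weight.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
weights = attention_weights(encoder_states, [0.0, 0.0])
context = context_vector(encoder_states, weights)
```

In an attention decoder of this kind, the context vector would typically be combined with the decoder state to form the attentional hidden state that feeds the next decoding step.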
-
Publication No.: US20200118547A1
Publication date: 2020-04-16
Application No.: US16160352
Filing date: 2018-10-15
Applicant: TENCENT AMERICA LLC
Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
Abstract: Methods and apparatuses are provided for end-to-end speech recognition training performed by at least one processor. The method includes receiving, by the at least one processor, one or more input speech frames; generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames; computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state; performing, by the at least one processor, a decoding operation based on previous embedded label prediction information and previous attentional hidden state information generated based on the attention weights; and generating current embedded label prediction information based on a result of the decoding operation and the attention weights.
-
Publication No.: US11551136B2
Publication date: 2023-01-10
Application No.: US16191027
Filing date: 2018-11-14
Applicant: TENCENT AMERICA LLC
Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
Abstract: A method and apparatus are provided for analyzing sequence-to-sequence data, such as sequence-to-sequence speech data or sequence-to-sequence machine translation data, by minimum Bayes risk (MBR) training of a sequence-to-sequence model, with softmax smoothing applied to the N-best generation of the MBR training.