-
Publication No.: EP4405941A1
Publication date: 2024-07-31
Application No.: EP22789441.7
Filing date: 2022-09-22
Applicant: Google LLC
Inventors: LI, Bo; SAINATH, Tara N; PANG, Ruoming; CHANG, Shuo-yiin; XU, Qiumin; STROHMAN, Trevor; CHEN, Vince; LIANG, Qiao; LIU, Heguang; HE, Yanzhang; HAGHANI, Parisa; BIDICHANDANI, Sameer
IPC classification: G10L15/16
CPC classification: G10L15/16
-
Publication No.: EP3776530A1
Publication date: 2021-02-17
Application No.: EP19728819.4
Filing date: 2019-05-17
Applicant: Google LLC
Inventors: JIA, Ye; CHEN, Zhifeng; WU, Yonghui; SHEN, Jonathan; PANG, Ruoming; WEISS, Ron J.; MORENO, Ignacio Lopez; REN, Fei; ZHANG, Yu; WANG, Quan; NGUYEN, Patrick An Phu
IPC classification: G10L13/033, G10L13/04, G10L25/30
-
Publication No.: EP4434007A1
Publication date: 2024-09-25
Application No.: EP22859489.1
Filing date: 2022-12-09
Applicant: GOOGLE LLC
Inventors: ZHANG, Bowen; YU, Jiahui; FIFTY, Christopher; HAN, Wei; DAI, Andrew M.; PANG, Ruoming; SHA, Fei
-
Publication No.: EP4307299A3
Publication date: 2024-05-01
Application No.: EP23213490.8
Filing date: 2021-03-17
Applicant: Google LLC
IPC classification: G10L15/26, G10L25/87, G10L15/04, G10L15/16, G10L15/06, G06N3/04, G06N3/044, G06N3/045, G06N3/08
CPC classification: G10L15/26, G10L25/87, G10L15/04, G10L15/16, G10L15/063, G06N3/08, G06N3/044, G06N3/045
Abstract: A method (400) includes receiving a training example (302) that includes audio data (202) representing a spoken utterance (12) and a ground truth transcription (204). For each word (310) in the utterance, the method also includes inserting a placeholder symbol before the word, identifying a respective ground truth alignment (312, 314) for a beginning and an end of the word, and generating a first constrained alignment (330) for a beginning word piece and a second constrained alignment for an ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word, and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder (230) by applying the first and second constrained alignments.
-
Publication No.: EP4218007A1
Publication date: 2023-08-02
Application No.: EP21787149.0
Filing date: 2021-09-09
Applicant: Google LLC
Inventors: YU, Jiahui; CHIU, Chung-Cheng; LI, Bo; CHANG, Shuo-Yiin; SAINATH, Tara N.; HAN, Wei; GULATI, Anmol; HE, Yanzhang; NARAYANAN, Arun; WU, Yonghui; PANG, Ruoming
-
Publication No.: EP4414896A2
Publication date: 2024-08-14
Application No.: EP24184116.2
Filing date: 2021-01-14
Applicant: GOOGLE LLC
IPC classification: G06N3/088
CPC classification: G10L15/16, G10L2015/081, G10L2015/085, G10L15/02, G10L15/32, G06N3/088, G06N3/084, G06N3/044
Abstract: Disclosed herein is a computer-implemented method that, when executed on data processing hardware, causes the data processing hardware to perform operations comprising:
receiving a sequence of audio features characterizing an utterance;
based on the sequence of audio features, generating, using a first-pass decoder model, a plurality of first-pass speech recognition hypotheses, each first-pass speech recognition hypothesis corresponding to a candidate transcription of the utterance;
generating, using a long short-term memory (LSTM) encoder, a first-pass encoding of the plurality of first-pass speech recognition hypotheses; and
based on the sequence of audio features and the first-pass encoding, generating, using a second-pass decoder model, a second-pass hypothesis that rescores the plurality of first-pass speech recognition hypotheses.
-
Publication No.: EP4409570A1
Publication date: 2024-08-07
Application No.: EP22797590.1
Filing date: 2022-09-29
Applicant: Google LLC
Inventors: SAINATH, Tara N; BOTROS, Rami; GULATI, Anmol; CHOROMANSKI, Krzysztof; STROHMAN, Trevor; YU, Jiahua; WANG, Weiran; PANG, Ruoming
CPC classification: G10L15/16, G06N3/0455, G06N3/0442, G06N3/0464, G06N3/047, G06N3/09
-
Publication No.: EP4323928A1
Publication date: 2024-02-21
Application No.: EP22787100.1
Filing date: 2022-09-21
Applicant: Google LLC
-
Publication No.: EP4295355A1
Publication date: 2023-12-27
Application No.: EP21729721.7
Filing date: 2021-05-11
Applicant: Google LLC
-