-
Publication No.: US20220130374A1
Publication date: 2022-04-28
Application No.: US17572238
Filing date: 2022-01-10
Applicant: Google LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US11900915B2
Publication date: 2024-02-13
Application No.: US17572238
Filing date: 2022-01-10
Applicant: Google LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
CPC classification number: G10L15/005, G10L15/07, G10L15/16, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US20220342632A1
Publication date: 2022-10-27
Application No.: US17811793
Filing date: 2022-07-11
Applicant: Google LLC
Inventor: Eugene Weinstein, Ignacio L. Moreno
IPC: G06F3/16, G06F9/451, G06F40/109, G06F3/04817
Abstract: Characteristics of a speaker are estimated using speech processing and machine learning. The characteristics of the speaker are used to automatically customize a user interface of a client device for the speaker.
-
Publication No.: US11238845B2
Publication date: 2022-02-01
Application No.: US16684483
Filing date: 2019-11-14
Applicant: GOOGLE LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US20200160836A1
Publication date: 2020-05-21
Application No.: US16684483
Filing date: 2019-11-14
Applicant: GOOGLE LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US09990925B2
Publication date: 2018-06-05
Application No.: US15842019
Filing date: 2017-12-14
Applicant: Google LLC
Inventor: Eugene Weinstein, Pedro J. Moreno Mengibar
CPC classification number: G10L15/26, G06F21/6254, G06F2221/2111, G10L15/06, G10L15/063, G10L15/28, G10L2015/0636, H04W4/02, H04W4/12
Abstract: The present disclosure relates to training a speech recognition system. A system includes an automated speech recognizer and receives data from a client device. The system determines that at least a portion of the received data is likely sensitive data. Before that portion of the received data is deleted, the system provides it to a model training engine that trains recognition models for the automated speech recognizer. After the portion has been provided, the system deletes it.
-
Publication No.: US12254865B2
Publication date: 2025-03-18
Application No.: US18418246
Filing date: 2024-01-20
Applicant: Google LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US20240161732A1
Publication date: 2024-05-16
Application No.: US18418246
Filing date: 2024-01-20
Applicant: Google LLC
Inventor: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen
CPC classification number: G10L15/005, G10L15/07, G10L15/16, G10L2015/0631
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output scores indicating the likelihood of linguistic units for each of multiple different languages or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.
-
Publication No.: US20210117153A1
Publication date: 2021-04-22
Application No.: US17136069
Filing date: 2020-12-29
Applicant: Google LLC
Inventor: Eugene Weinstein, Ignacio L. Moreno
IPC: G06F3/16, G06F9/451, G06F40/109, G06F3/0481
Abstract: Characteristics of a speaker are estimated using speech processing and machine learning. The characteristics of the speaker are used to automatically customize a user interface of a client device for the speaker.
-
Publication No.: US11620104B2
Publication date: 2023-04-04
Application No.: US17811793
Filing date: 2022-07-11
Applicant: Google LLC
Inventor: Eugene Weinstein, Ignacio L. Moreno
IPC: G06F3/16, G06F9/451, G06F40/109, G06F3/04817
Abstract: Characteristics of a speaker are estimated using speech processing and machine learning. The characteristics of the speaker are used to automatically customize a user interface of a client device for the speaker.