-
Publication No.: US11961976B2
Publication date: 2024-04-16
Application No.: US17194480
Filing date: 2021-03-08
Applicant: Google LLC
Inventor: James Robert Lim, Wei Li, Brian Conner, Brett Wilson
CPC classification number: H01M10/425, H01M10/482, H01M10/486, H01M2010/4271
Abstract: An example outdoor mounted device includes a first battery configured to operate at a low temperature range that at least includes negative 20 Celsius; a second battery configured to operate at a high temperature range; a temperature sensor; and processing circuitry configured to: determine, based on data received from the temperature sensors, a current temperature; responsive to determining that the current temperature is within the low temperature range, cause one or more components of the computing device to operate using electrical energy sourced from the first battery; and responsive to determining that the current temperature is within the high temperature range, cause the one or more components of the computing device to operate using electrical energy sourced from the second battery.
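The selection logic described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the range bounds, the function name `select_battery`, and the behavior outside both ranges are all hypothetical. The abstract states only that the low range includes negative 20 Celsius.

```python
from typing import Optional

# Hypothetical bounds in degrees Celsius. The abstract only requires that the
# low range include -20 C; every other number here is an assumption.
LOW_RANGE_C = (-40.0, 0.0)    # served by the first (low-temperature) battery
HIGH_RANGE_C = (0.0, 60.0)    # served by the second (high-temperature) battery


def select_battery(current_temp_c: float) -> Optional[str]:
    """Return which battery should power the device at the given temperature.

    Returns "first" for the low range, "second" for the high range, and None
    when the temperature is outside both ranges (the abstract does not say
    what happens in that case).
    """
    low_min, low_max = LOW_RANGE_C
    high_min, high_max = HIGH_RANGE_C
    # The low range is treated as half-open so that the shared boundary at
    # 0 C maps to exactly one battery.
    if low_min <= current_temp_c < low_max:
        return "first"
    if high_min <= current_temp_c <= high_max:
        return "second"
    return None


if __name__ == "__main__":
    for temp in (-20.0, 25.0, 100.0):
        print(temp, select_battery(temp))
```

A real device would presumably add hysteresis around the boundary to avoid rapid switching; that is outside what the abstract describes.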
-
Publication No.: US20220285741A1
Publication date: 2022-09-08
Application No.: US17194480
Filing date: 2021-03-08
Applicant: Google LLC
Inventor: James Robert Lim, Wei Li, Brian Conner, Brett Wilson
Abstract: An example outdoor mounted device includes a first battery configured to operate at a low temperature range that at least includes negative 20 Celsius; a second battery configured to operate at a high temperature range; a temperature sensor; and processing circuitry configured to: determine, based on data received from the temperature sensors, a current temperature; responsive to determining that the current temperature is within the low temperature range, cause one or more components of the computing device to operate using electrical energy sourced from the first battery; and responsive to determining that the current temperature is within the high temperature range, cause the one or more components of the computing device to operate using electrical energy sourced from the second battery.
-
Publication No.: US20220238101A1
Publication date: 2022-07-28
Application No.: US17616135
Filing date: 2020-12-03
Applicant: GOOGLE LLC
Inventor: Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Jean Bruguier, Shuo-yiin Chang, Wei Li
Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
-
Publication No.: US11295739B2
Publication date: 2022-04-05
Application No.: US16527487
Filing date: 2019-07-31
Applicant: Google LLC
Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
-
Publication No.: US20200066271A1
Publication date: 2020-02-27
Application No.: US16527487
Filing date: 2019-07-31
Applicant: Google LLC
Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
-
Publication No.: US20240338234A1
Publication date: 2024-10-10
Application No.: US18579756
Filing date: 2022-09-06
Applicant: Google LLC
Inventor: Wei Li
IPC: G06F9/451
CPC classification number: G06F9/453
Abstract: Provided is a framework to reliably build agents capable of user interface (UI) navigation. For example, example implementations create UI navigation agents with the power of neural networks that learn from human demonstrations.
-
Publication No.: US12073824B2
Publication date: 2024-08-27
Application No.: US17616135
Filing date: 2020-12-03
Applicant: GOOGLE LLC
Inventor: Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Jean Bruguier, Shuo-Yiin Chang, Wei Li
CPC classification number: G10L15/16, G06N3/08, G10L15/05, G10L15/063, G10L15/22, G10L2015/0635
Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
-
Publication No.: US20240221750A1
Publication date: 2024-07-04
Application No.: US18610233
Filing date: 2024-03-19
Applicant: Google LLC
Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
CPC classification number: G10L15/22, G10L15/02, G10L15/063, G10L15/18, G10L19/00, G10L2015/025, G10L2015/088, G10L15/142, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
-
Publication No.: US20220270597A1
Publication date: 2022-08-25
Application No.: US17182592
Filing date: 2021-02-23
Applicant: Google LLC
Inventor: David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara Sainath, Ian Mcgraw
Abstract: A method includes receiving a speech recognition result, and using a confidence estimation module (CEM), for each sub-word unit in a sequence of hypothesized sub-word units for the speech recognition result: obtaining a respective confidence embedding that represents a set of confidence features; generating, using a first attention mechanism, a confidence feature vector; generating, using a second attention mechanism, an acoustic context vector; and generating, as output from an output layer of the CEM, a respective confidence output score for each corresponding sub-word unit based on the confidence feature vector and the acoustic feature vector received as input by the output layer of the CEM. For each of the one or more words formed by the sequence of hypothesized sub-word units, the method also includes determining a respective word-level confidence score for the word. The method also includes determining an utterance-level confidence score by aggregating the word-level confidence scores.
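The roll-up from sub-word scores to word-level and utterance-level confidence can be sketched as follows. The abstract does not say how the scores are combined, so every choice here is an assumption made for illustration: a leading "_" marks the first sub-word unit of a word, a word takes the minimum of its sub-word scores, and the utterance score is the arithmetic mean of the word scores. The function names are hypothetical, and the per-unit scores would come from the CEM output layer, which is not modeled.

```python
from typing import List, Tuple


def word_confidences(pieces: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Group sub-word units into words and give each word a confidence score.

    Assumptions (not from the abstract): a leading "_" starts a new word, and
    a word's score is the minimum of its sub-word scores.
    """
    if len(pieces) != len(scores):
        raise ValueError("pieces and scores must have the same length")
    words: List[Tuple[str, float]] = []
    for piece, score in zip(pieces, scores):
        if piece.startswith("_") or not words:
            # Start a new word, dropping the word-boundary marker.
            words.append((piece.lstrip("_"), score))
        else:
            # Continue the current word and keep its weakest sub-word score.
            text, conf = words[-1]
            words[-1] = (text + piece, min(conf, score))
    return words


def utterance_confidence(words: List[Tuple[str, float]]) -> float:
    """Aggregate word-level scores by arithmetic mean (an assumed choice)."""
    if not words:
        return 0.0
    return sum(conf for _, conf in words) / len(words)


if __name__ == "__main__":
    words = word_confidences(["_good", "_morn", "ing"], [0.9, 0.8, 0.6])
    print(words, utterance_confidence(words))
```

Other aggregations, such as a product of word scores or a learned aggregator, would fit the abstract equally well.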
-
Publication No.: US20240420687A1
Publication date: 2024-12-19
Application No.: US18815537
Filing date: 2024-08-26
Applicant: GOOGLE LLC
Inventor: Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Jean Bruguier, Shuo-yiin Chang, Wei Li
Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.