-
Publication No.: US11842045B2
Publication Date: 2023-12-12
Application No.: US17823545
Filing Date: 2022-08-31
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F3/16, G06F1/16, G06F3/023, G06F3/04883, G06F40/166, G06F40/289, G10L15/22
CPC classification number: G06F3/04886, G06F1/1626, G06F3/0233, G06F3/04883, G06F3/167, G06F40/166, G06F40/289, G06F2203/0381, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication No.: US20220308975A1
Publication Date: 2022-09-29
Application No.: US17717822
Filing Date: 2022-04-11
Applicant: GOOGLE LLC
Inventor: Dragan Zivkovic, Harry Bleyan, Tamar Lucassen, Akash Agrawal
Abstract: Implementations disclosed herein are directed to systems and methods for evaluating new feature(s) for client device(s) based on performance measure(s) of the client device(s) and/or the new feature(s). The new feature(s) can include, for example, machine learning (ML) model(s), non-ML software-enabled functionality, non-ML hardware-enabled functionality, and/or ML or non-ML software application features for a given software application utilized by the client device(s). The client device(s) can generate the performance measure(s) by processing a plurality of testing instances for the new feature(s). The performance measure(s) can include, for example, latency measure(s), memory consumption measure(s), CPU usage measure(s), precision and/or recall measure(s), and/or other measures. In some implementations, the new feature(s) may be activated for use locally at the client device(s) based on the performance measure(s), and optionally at other client device(s) that share the same device characteristics. In other implementations, the new feature(s) may be modified based on the performance measure(s).
-
13.
Publication No.: US20210327421A1
Publication Date: 2021-10-21
Application No.: US16973572
Filing Date: 2019-11-08
Applicant: Google LLC
Inventor: Françoise Beaufays, Rajiv Mathews, Dragan Zivkovic, Kurt Partridge, Andrew Hard
IPC: G10L15/22, G10L15/065, G10L15/10, G10L15/30
Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; making a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determining that the decision was incorrect; and in response to determining that the determination was incorrect, generating a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
-
Publication No.: US20200371686A1
Publication Date: 2020-11-26
Application No.: US16989420
Filing Date: 2020-08-10
Applicant: Google LLC
Inventor: Ouais Alsharif, Peter Ciccotto, Francoise Beaufays, Dragan Zivkovic
IPC: G06F3/0488, G06F3/023, G06F40/263, G06F40/274
Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text, and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different: the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
-
Publication No.: US10831366B2
Publication Date: 2020-11-10
Application No.: US15393676
Filing Date: 2016-12-29
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/0488, G06F3/16, G06F1/16, G06F3/023, G06F40/166, G06F40/289, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication No.: US12223952B2
Publication Date: 2025-02-11
Application No.: US17959637
Filing Date: 2022-10-04
Applicant: GOOGLE LLC
Inventor: Rajiv Mathews, Dragan Zivkovic, Khe Chai Sim
Abstract: On-device processor(s) of a client device may store, in on-device storage and in association with a time to live (TTL) in the on-device storage, a correction directed to ASR processing of audio data. The correction may include a portion of a given speech hypothesis that was modified to an alternate speech hypothesis. Further, the on-device processor(s) may cause an on-device ASR model to be personalized based on the correction. Moreover, and based on additional ASR processing of additional audio data, the on-device processor(s) may store, in the on-device storage and in association with an additional TTL in the on-device storage, a pseudo-correction directed to the additional ASR processing. Accordingly, the on-device processor(s) may cause the on-device ASR model to be personalized based on the pseudo-correction to prevent forgetting by the on-device ASR model.
-
17.
Publication No.: US12014739B2
Publication Date: 2024-06-18
Application No.: US18218818
Filing Date: 2023-07-06
Applicant: GOOGLE LLC
Inventor: Françoise Beaufays, Rajiv Mathews, Dragan Zivkovic, Kurt Partridge, Andrew Hard
IPC: G10L15/22, G10L15/065, G10L15/10, G10L15/30
CPC classification number: G10L15/22, G10L15/065, G10L15/10, G10L15/30
Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; making a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determining that the decision was incorrect; and in response to determining that the determination was incorrect, generating a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
-
Publication No.: US20240086063A1
Publication Date: 2024-03-14
Application No.: US18517825
Filing Date: 2023-11-22
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F1/16, G06F3/023, G06F3/04883, G06F3/16, G06F40/166, G06F40/289
CPC classification number: G06F3/04886, G06F1/1626, G06F3/0233, G06F3/04883, G06F3/167, G06F40/166, G06F40/289, G06F2203/0381, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
19.
Publication No.: US11741953B2
Publication Date: 2023-08-29
Application No.: US16973572
Filing Date: 2019-11-08
Applicant: Google LLC
Inventor: Françoise Beaufays, Rajiv Mathews, Dragan Zivkovic, Kurt Partridge, Andrew Hard
IPC: G10L15/22, G10L15/065, G10L15/10, G10L15/30
CPC classification number: G10L15/22, G10L15/065, G10L15/10, G10L15/30
Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; making a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determining that the decision was incorrect; and in response to determining that the determination was incorrect, generating a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
-
Publication No.: US11435898B2
Publication Date: 2022-09-06
Application No.: US17064173
Filing Date: 2020-10-06
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F3/16, G06F1/16, G06F3/023, G06F3/04883, G06F40/166, G06F40/289, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-