-
Publication number: US20240086063A1
Publication date: 2024-03-14
Application number: US18517825
Filing date: 2023-11-22
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F1/16, G06F3/023, G06F3/04883, G06F3/16, G06F40/166, G06F40/289
CPC classification number: G06F3/04886, G06F1/1626, G06F3/0233, G06F3/04883, G06F3/167, G06F40/166, G06F40/289, G06F2203/0381, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
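The flow in this abstract can be pictured with a minimal Python sketch. Everything named here (`InputContext`, `Recognizer`, `transmit`, the term counter standing in for a recognition model) is a hypothetical illustration and not taken from the patent; a real recognizer would decode audio or touch input rather than pass text through.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InputContext:
    """Hypothetical 'input context data structure' referencing a recognized term."""
    term: str
    preceding_words: Tuple[str, ...]
    source_modality: str


class Recognizer:
    """Toy recognizer; a Counter stands in for the recognition model (assumption)."""

    def __init__(self, modality: str) -> None:
        self.modality = modality
        self.term_counts: Counter = Counter()

    def recognize(self, raw_input: str) -> str:
        # Placeholder: treat the raw input as already-transcribed text.
        return raw_input.strip()

    def build_context(self, transcription: str, term: str) -> InputContext:
        # Record the term together with the words that preceded it.
        words = transcription.split()
        idx = words.index(term)
        return InputContext(term, tuple(words[:idx]), self.modality)

    def update_model(self, ctx: InputContext) -> None:
        # Stand-in for updating a recognition model with the new term.
        self.term_counts[ctx.term] += 1

    def transmit(self, ctx: InputContext, other: "Recognizer") -> None:
        # The first-modality recognizer hands the context to the second one.
        other.update_model(ctx)

    def knows(self, term: str) -> bool:
        return self.term_counts[term] > 0


voice = Recognizer("voice")
keyboard = Recognizer("keyboard")
text = voice.recognize("  meet me in Milpitas  ")
context = voice.build_context(text, "Milpitas")
voice.transmit(context, keyboard)
```

After the last line, the keyboard recognizer has seen a term that was only ever entered by voice, which is the cross-modality effect the abstract describes.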
-
Publication number: US20230289134A1
Publication date: 2023-09-14
Application number: US18148582
Filing date: 2022-12-30
Applicant: GOOGLE LLC
Inventor: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
IPC: G06F3/16, G06T7/73, G06F3/01, G06T7/20, G06T11/20, G06V30/262, G06V40/20, G06F18/214, G06V30/19, G06V10/82, G06V20/20, G06V40/10
CPC classification number: G06F3/167, G06F3/017, G06F3/16, G06F18/214, G06T7/20, G06T7/73, G06T11/20, G06V10/82, G06V20/20, G06V30/19173, G06V30/274, G06V40/113, G06V40/28, G06T2210/12
Abstract: A method can perform a process with a method including capturing an image, determining an environment that a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine learned model, an intent of a user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
-
Publication number: US11543888B2
Publication date: 2023-01-03
Application number: US16946532
Filing date: 2020-06-25
Applicant: GOOGLE LLC
Inventor: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
IPC: G09G5/00, G06F3/01, G06T7/73, G06F3/16, G06K9/62, G06T7/20, G06T11/20, G06V30/262, G06V40/20
Abstract: A method can perform a process with a method including capturing an image, determining an environment that a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine learned model, an intent of a user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
-
Publication number: US11435898B2
Publication date: 2022-09-06
Application number: US17064173
Filing date: 2020-10-06
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F3/16, G06F1/16, G06F3/023, G06F3/04883, G06F40/166, G06F40/289, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication number: US11238848B2
Publication date: 2022-02-01
Application number: US16709132
Filing date: 2019-12-10
Applicant: Google LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
IPC: G10L15/08, G06F21/32, G10L17/06, G06F16/635, G10L15/22, G10L17/00, G06K9/00, G10L15/07, G10L15/26
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.
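As a rough illustration of the query described in this abstract, the Python sketch below builds the payload a device might send when the speaker is not recognized. The names `ServerQuery`, `classify_speaker`, and `build_unknown_speaker_query`, and the exact-match voiceprint lookup, are assumptions made for the example and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServerQuery:
    """Hypothetical query payload; field names are illustrative only."""
    auth_tokens: List[str]
    utterance_repr: str
    speaker_is_known_user: bool


def classify_speaker(voiceprint: str, enrolled: Dict[str, str]) -> Optional[str]:
    """Return the enrolled user whose voiceprint matches, else None.

    Exact string matching is a placeholder for a real speaker-ID model.
    """
    for user, known_print in enrolled.items():
        if known_print == voiceprint:
            return user
    return None


def build_unknown_speaker_query(
    tokens_by_user: Dict[str, str], utterance_repr: str
) -> ServerQuery:
    """Bundle every stored token with the utterance and an 'unknown speaker' flag."""
    return ServerQuery(
        auth_tokens=sorted(tokens_by_user.values()),
        utterance_repr=utterance_repr,
        speaker_is_known_user=False,
    )


# Tokens for known users are stored on the device (values are made up).
tokens = {"alice": "tok-a", "bob": "tok-b"}
enrolled_prints = {"alice": "print-a", "bob": "print-b"}

speaker = classify_speaker("print-guest", enrolled_prints)
if speaker is None:
    query = build_unknown_speaker_query(tokens, "what time is it")
```

The sketch covers only the unknown-speaker path; how the server uses the tokens and the flag to shape its response is outside what the abstract specifies.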
-
Publication number: US20210019046A1
Publication date: 2021-01-21
Application number: US17064173
Filing date: 2020-10-06
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/0488, G06F3/16, G06F1/16, G06F3/023, G06F40/166, G06F40/289
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication number: US20200342668A1
Publication date: 2020-10-29
Application number: US16395505
Filing date: 2019-04-26
Applicant: GOOGLE LLC
Inventor: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
-
Publication number: US20180308491A1
Publication date: 2018-10-25
Application number: US15956350
Filing date: 2018-04-18
Applicant: Google LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.