-
Publication No.: US11960793B2
Publication date: 2024-04-16
Application No.: US18148582
Filing date: 2022-12-30
Applicant: GOOGLE LLC
Inventor: Archana Kannan, Roza Chojnacka, Jamieson Kerns, Xiyang Luo, Meltem Oktem, Nada Elassal
IPC: G09G5/00, G06F3/01, G06F3/16, G06F18/214, G06T7/20, G06T7/73, G06T11/20, G06V10/82, G06V20/20, G06V30/19, G06V30/262, G06V40/10, G06V40/20
CPC classification number: G06F3/167, G06F3/017, G06F3/16, G06F18/214, G06T7/20, G06T7/73, G06T11/20, G06V10/82, G06V20/20, G06V30/19173, G06V30/274, G06V40/113, G06V40/28, G06T2210/12
Abstract: A method including capturing an image, determining an environment in which a user is operating a computing device, detecting a hand gesture based on an object in the image, determining, using a machine learned model, an intent of the user based on the hand gesture and the environment, and executing a task based at least on the determined intent.
-
Publication No.: US20230335116A1
Publication date: 2023-10-19
Application No.: US18210963
Filing date: 2023-06-16
Applicant: GOOGLE LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
CPC classification number: G10L15/08, G06F16/636, G06F21/32, G06V40/10, G10L15/07, G10L15/22, G10L17/00, G10L17/06, G10L2015/088, G10L15/26
Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device responsive to the utterance, or can cause an action to be performed by the user device responsive to the utterance.
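The decision described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Utterance`, `Outcome`, and `decide` are hypothetical, the two boolean fields stand in for the outputs of a speaker-identification step and a request classifier, and only the "unknown speaker plus non-personal request" branch comes from the abstract. The other two branches are assumptions added so the function is total.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Outcome(Enum):
    RESPOND = auto()  # provide a response or perform the requested action
    DECLINE = auto()  # hypothetical fallback, not described in the abstract


@dataclass(frozen=True)
class Utterance:
    speaker_is_known: bool  # assumed result of speaker identification
    is_personal: bool       # assumed result of request classification


def decide(utterance: Utterance) -> Outcome:
    """Decide how a shared device handles an utterance."""
    # Branch stated in the abstract: unknown speaker, non-personal request.
    if not utterance.speaker_is_known and not utterance.is_personal:
        return Outcome.RESPOND
    # Assumption: known users are served for any kind of request.
    if utterance.speaker_is_known:
        return Outcome.RESPOND
    # Assumption: an unknown speaker making a personal request is declined.
    return Outcome.DECLINE
```

For example, `decide(Utterance(speaker_is_known=False, is_personal=False))` yields `Outcome.RESPOND`, so a guest asking a general question would still be served under these assumptions.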
-
Publication No.: US20250028570A1
Publication date: 2025-01-23
Application No.: US18909624
Filing date: 2024-10-08
Applicant: GOOGLE LLC
Inventor: Mark Sander Urbanus, Patrick Plunkett, Charlie Gengzao Wang, Meltem Oktem, James Carr
Abstract: A method including initiating a computing process on a wearable device, the computing process including a plurality of tasks, identifying a companion device and determining that the companion device is available to perform at least one task of the plurality of tasks, causing the companion device to perform the at least one task including communicating data generated by the wearable device to the companion device, receiving, by the wearable device, a result associated with a completion of the at least one task by the companion device, and completing, by the wearable device, the computing process based on the result associated with the completion of the at least one task.
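The offloading flow in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `CompanionDevice`, `run_process`, and the use of `sum` as the "heavy" task are all hypothetical, the transport between devices is replaced by a direct function call, and the local fallback when no companion is available is an assumption the abstract does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CompanionDevice:
    """Hypothetical stand-in for a paired phone or similar device."""
    available: bool

    def perform(self, task: Callable[[List[int]], int], data: List[int]) -> int:
        # A real system would send the data over a wireless link here.
        return task(data)


def run_process(sensor_data: List[int], companion: Optional[CompanionDevice]) -> str:
    """Run a process on the wearable, offloading one task when possible."""
    heavy_task = sum  # placeholder for an expensive computation

    if companion is not None and companion.available:
        # Communicate wearable-generated data and receive the result.
        result = companion.perform(heavy_task, sensor_data)
        where = "companion"
    else:
        # Assumed fallback: compute locally on the wearable.
        result = heavy_task(sensor_data)
        where = "wearable"

    # The wearable completes the process based on the result.
    return f"total={result} (computed on {where})"
```

Calling `run_process([1, 2, 3], CompanionDevice(available=True))` returns `"total=6 (computed on companion)"`.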
-
Publication No.: US20220413696A1
Publication date: 2022-12-29
Application No.: US17823545
Filing date: 2022-08-31
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F3/16, G06F1/16, G06F3/023, G06F3/04883, G06F40/166, G06F40/289
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
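A minimal Python sketch of the cross-modality hand-off may help make the abstract concrete. Everything named here is an assumption for illustration: the abstract does not say which modalities are involved, so `VoiceRecognizer` and `KeyboardRecognizer` are hypothetical, the second recognizer's "model" is reduced to a vocabulary set, and a "particular term" is taken to mean any word the second recognizer does not already know.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class InputContext:
    """Hypothetical input context data structure referencing one term."""
    term: str
    source_modality: str


@dataclass
class KeyboardRecognizer:
    """Second modality recognizer; its model is simplified to a vocabulary."""
    vocabulary: Set[str] = field(default_factory=set)

    def update_model(self, context: InputContext) -> None:
        self.vocabulary.add(context.term)


@dataclass
class VoiceRecognizer:
    """First modality recognizer that shares new terms with its peer."""
    peer: KeyboardRecognizer

    def recognize(self, transcription: str) -> str:
        # The transcription is passed in directly; real recognition is out of scope.
        for term in transcription.split():
            if term not in self.peer.vocabulary:
                # Generate the context structure and transmit it to the peer.
                self.peer.update_model(InputContext(term, "voice"))
        return transcription
```

After `VoiceRecognizer(kb).recognize("hello Milpitas")` with `kb = KeyboardRecognizer({"hello"})`, the keyboard vocabulary also contains `"Milpitas"`, so a term learned from speech becomes available to typing.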
-
Publication No.: US20220148577A1
Publication date: 2022-05-12
Application No.: US17584866
Filing date: 2022-01-26
Applicant: GOOGLE LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to a server. A response based on the query is received at the device from the server.
-
Publication No.: US10522137B2
Publication date: 2019-12-31
Application No.: US15956350
Filing date: 2018-04-18
Applicant: Google LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
IPC: G10L15/08, G06F21/32, G10L17/06, G06F16/635, G06K9/00, G10L15/07, G10L17/00, G10L15/22, G10L15/26
Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The utterance is classified as spoken by a particular known user of the known users. A query that includes a representation of the utterance and an indication of the particular known user as the speaker is provided using the authentication token of the particular known user.
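The two token-handling cases in this listing (the known-speaker case in this abstract and the unknown-speaker case in the abstract for US20220148577A1) can be sketched together in Python. The names `Query` and `build_query`, the dictionary of tokens keyed by user, and the use of `None` to mean "speaker not recognized" are all hypothetical choices for illustration and do not reflect the actual query format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Query:
    """Hypothetical query payload sent from the device to a server."""
    auth_tokens: List[str]
    utterance: str
    speaker_known: bool
    speaker_id: Optional[str] = None


def build_query(tokens_by_user: Dict[str, str], utterance: str,
                speaker: Optional[str]) -> Query:
    """Build a query from on-device tokens and the speaker classification."""
    if speaker is None or speaker not in tokens_by_user:
        # Unknown speaker: include the tokens of all known users
        # plus an indication that the speaker was not recognized.
        return Query(sorted(tokens_by_user.values()), utterance, False)
    # Known speaker: use only that user's token and identify the speaker.
    return Query([tokens_by_user[speaker]], utterance, True, speaker)
```

With `tokens = {"alice": "tok-a", "bob": "tok-b"}`, `build_query(tokens, "what time is it", None)` carries both tokens and `speaker_known=False`, while `build_query(tokens, "play my playlist", "alice")` carries only `"tok-a"`.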
-
Publication No.: US11842045B2
Publication date: 2023-12-12
Application No.: US17823545
Filing date: 2022-08-31
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/04886, G06F3/16, G06F1/16, G06F3/023, G06F3/04883, G06F40/166, G06F40/289, G10L15/22
CPC classification number: G06F3/04886, G06F1/1626, G06F3/0233, G06F3/04883, G06F3/167, G06F40/166, G06F40/289, G06F2203/0381, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
-
Publication No.: US11721326B2
Publication date: 2023-08-08
Application No.: US17584866
Filing date: 2022-01-26
Applicant: GOOGLE LLC
Inventor: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
IPC: G10L15/08, G06F21/32, G10L17/06, G06F16/635, G10L15/22, G10L17/00, G06V40/10, G10L15/07, G10L15/26
CPC classification number: G10L15/08, G06F16/636, G06F21/32, G06V40/10, G10L15/07, G10L15/22, G10L17/00, G10L17/06, G10L15/26, G10L2015/088
Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device responsive to the utterance, or can cause an action to be performed by the user device responsive to the utterance.
-
Publication No.: US11158128B2
Publication date: 2021-10-26
Application No.: US16395505
Filing date: 2019-04-26
Applicant: GOOGLE LLC
Inventor: Roza Chojnacka, Meltem Oktem, Rajan Patel, Uday Idnani, Xiyang Luo
Abstract: A system and method may provide for spatial and semantic auto-completion of an augmented or mixed reality environment. The system may detect physical objects in a physical environment based on analysis of image frames captured by an image sensor of a computing device. The system may detect spaces in the physical environment that are occupied by the detected physical objects, and may detect spaces that are unoccupied in the physical environment. Based on the identification of the detected physical objects, the system may gain a semantic understanding of the physical environment, and may determine suggested objects for placement in the physical environment based on the semantic understanding. The system may place virtual representations of the suggested objects in a mixed reality scene of the physical environment for user consideration.
-
Publication No.: US10831366B2
Publication date: 2020-11-10
Application No.: US15393676
Filing date: 2016-12-29
Applicant: Google LLC
Inventor: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
IPC: G06F3/0488, G06F3/16, G06F1/16, G06F3/023, G06F40/166, G06F40/289, G10L15/22
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.