-
Publication No.: US20220188321A1
Publication Date: 2022-06-16
Application No.: US17676615
Filing Date: 2022-02-21
Applicant: Google LLC
Inventor: Matthew Sharifi, David Petrou, Abhanshu Sharma
IPC: G06F16/2457, G06F16/583, G06F16/58, G06F16/2452, G06F16/903
Abstract: Methods, systems, and apparatus for receiving a query image, receiving one or more entities that are associated with the query image, identifying, for one or more of the entities, one or more candidate search queries that are pre-associated with the one or more entities, generating a respective relevance score for each of the candidate search queries, selecting, as a representative search query for the query image, a particular candidate search query based at least on the generated respective relevance scores and providing the representative search query for output in response to receiving the query image.
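The selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function name `representative_query`, the entity confidences, the per-query popularity values, and the scoring rule (confidence multiplied by popularity) are all assumptions made for illustration.

```python
from typing import Dict, Optional

def representative_query(
    entities: Dict[str, float],
    candidates: Dict[str, Dict[str, float]],
) -> Optional[str]:
    """Pick the candidate query with the highest relevance score.

    entities:   entity name -> confidence that it appears in the image (assumed input)
    candidates: entity name -> {pre-associated query: popularity} (assumed input)
    """
    best_query, best_score = None, float("-inf")
    for entity, confidence in entities.items():
        for query, popularity in candidates.get(entity, {}).items():
            # Hypothetical relevance score: entity confidence times query popularity.
            score = confidence * popularity
            if score > best_score:
                best_query, best_score = query, score
    return best_query

entities = {"Eiffel Tower": 0.9, "Paris": 0.4}
candidates = {
    "Eiffel Tower": {"eiffel tower height": 0.5, "eiffel tower tickets": 0.7},
    "Paris": {"paris weather": 0.9},
}
print(representative_query(entities, candidates))
```

With these made-up numbers the scores are 0.45, 0.63, and 0.36, so the sketch returns "eiffel tower tickets".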
-
Publication No.: US20220180866A1
Publication Date: 2022-06-09
Application No.: US17111467
Filing Date: 2020-12-03
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: A method for decaying speech processing includes receiving, at a voice-enabled device, an indication of a microphone trigger event indicating a possible interaction with the device through speech where the device has a microphone that, when open, is configured to capture speech for speech recognition. In response to receiving the indication of the microphone trigger event, the method also includes instructing the microphone to open or remain open for a duration window to capture an audio stream in an environment of the device and providing the audio stream captured by the open microphone to a speech recognition system. During the duration window, the method further includes decaying a level of the speech recognition processing based on a function of the duration window and instructing the speech recognition system to use the decayed level of speech recognition processing over the audio stream captured by the open microphone.
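One way to picture "decaying a level of the speech recognition processing based on a function of the duration window" is a simple schedule over elapsed time. The sketch below assumes a linear decay with a minimum floor; the abstract does not specify the function, and `processing_level`, `floor`, and the 0.0 to 1.0 scale are hypothetical.

```python
def processing_level(elapsed_s: float, window_s: float, floor: float = 0.2) -> float:
    """Return an assumed processing level in [0, 1] for the open-microphone window.

    1.0 means full speech recognition processing; the level decays linearly
    (an assumption) to `floor`, and is 0.0 outside the window, when the
    microphone is treated as closed.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    if elapsed_s < 0 or elapsed_s >= window_s:
        return 0.0
    return max(floor, 1.0 - elapsed_s / window_s)

# Sample the schedule across a 10-second window.
for t in (0, 5, 9, 10):
    print(t, processing_level(t, 10))
```

A recognizer could map the returned level to a cheaper model or a lower sampling rate; that mapping is outside what the abstract states.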
-
Publication No.: US11341954B2
Publication Date: 2022-05-24
Application No.: US16717518
Filing Date: 2019-12-17
Applicant: Google LLC
Inventor: Matthew Sharifi, Kevin Kilgour, Dominik Roblek, James Lin
Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.
-
Publication No.: US20220157318A1
Publication Date: 2022-05-19
Application No.: US17098013
Filing Date: 2020-11-13
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: Implementations are directed to dynamically adapting which assistant on-device model(s) are locally stored at assistant devices of an assistant device group and/or dynamically adapting the assistant processing role(s) of the assistant device(s) of the assistant device group. In some of those implementations, the corresponding on-device model(s) and/or corresponding processing role(s), for each of the assistant devices of the group, is determined based on collectively considering individual processing capabilities of the assistant devices of the group. Implementations are additionally or alternatively directed to cooperatively utilizing assistant devices of a group, and their associated post-adaptation on-device model(s) and/or post-adaptation processing role(s), in cooperatively processing assistant requests that are directed to any one of the assistant devices of the group.
-
Publication No.: US20220157312A1
Publication Date: 2022-05-19
Application No.: US17650173
Filing Date: 2022-02-07
Applicant: Google LLC
Inventor: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.
-
Publication No.: US20220120573A1
Publication Date: 2022-04-21
Application No.: US17422664
Filing Date: 2019-06-25
Applicant: Matthew SHARIFI, Aleksandar KRACUN, Google LLC
Inventor: Matthew Sharifi, Aleksandar Kracun
Abstract: The present disclosure is directed to a system and method for providing dynamic grouping and regrouping for users in a joint positional tracking session. The method includes receiving positional data associated with a first user and at least one other user in the plurality of users in the joint positional tracking session. The method includes determining that a separation parameter associated with the first user has exceeded a threshold separation value, the separation parameter associated with the first user representing a distance between the first user and one other user in the plurality of users. The method includes automatically generating navigational data for reducing the separation parameter between the first user and one other user in the joint positional tracking session to below the threshold separation value. The method includes transmitting the navigational data to at least the first user in the joint positional tracking session.
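The threshold check described in this abstract can be sketched in a few lines of Python. Several simplifications are assumed here and are not taken from the patent: positions are planar (x, y) coordinates in metres, the separation parameter is the distance to the nearest other user, and the "navigational data" is reduced to a target user, a distance, and a heading measured counterclockwise from the x-axis. The name `regroup_hint` is hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]  # (x, y) in metres; a simplifying assumption

def regroup_hint(
    first: str, positions: Dict[str, Position], threshold_m: float
) -> Optional[dict]:
    """Return simple navigation data if `first` has drifted past the threshold."""
    fx, fy = positions[first]
    others = {user: pos for user, pos in positions.items() if user != first}
    if not others:
        return None
    # Assumed separation parameter: distance to the nearest other user.
    nearest = min(
        others, key=lambda u: math.hypot(others[u][0] - fx, others[u][1] - fy)
    )
    dx, dy = others[nearest][0] - fx, others[nearest][1] - fy
    distance = math.hypot(dx, dy)
    if distance <= threshold_m:
        return None  # still within the group; nothing to send
    return {
        "user": first,
        "toward": nearest,
        "distance_m": distance,
        "heading_deg": math.degrees(math.atan2(dy, dx)) % 360,
    }

positions = {"a": (0.0, 0.0), "b": (30.0, 40.0), "c": (300.0, 400.0)}
print(regroup_hint("a", positions, threshold_m=20.0))
```

A real system would use geodesic distances and a routing service; the sketch only shows where the threshold comparison sits.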
-
Publication No.: US20220115011A1
Publication Date: 2022-04-14
Application No.: US17081645
Filing Date: 2020-10-27
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: Techniques are described herein for identifying a failed hotword attempt. A method includes: receiving first audio data; processing the first audio data to generate a first predicted output; determining that the first predicted output satisfies a secondary threshold but does not satisfy a primary threshold; receiving second audio data; processing the second audio data to generate a second predicted output; determining that the second predicted output satisfies the secondary threshold but does not satisfy the primary threshold; in response to the first predicted output and the second predicted output satisfying the secondary threshold but not satisfying the primary threshold, and in response to the first spoken utterance and the second spoken utterance satisfying one or more temporal criteria relative to one another, identifying a failed hotword attempt; and in response to identifying the failed hotword attempt, providing a hint that is responsive to the failed hotword attempt.
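The two-threshold logic in this abstract can be sketched as follows. This is an illustrative reading, not the claimed implementation: the threshold values, the interpretation of "satisfies" as score >= threshold, the single temporal criterion (a maximum gap between two consecutive near misses), and the name `failed_hotword_attempt` are all assumptions.

```python
from typing import List, Tuple

def failed_hotword_attempt(
    detections: List[Tuple[float, float]],
    primary: float = 0.8,
    secondary: float = 0.5,
    max_gap_s: float = 10.0,
) -> bool:
    """detections: (timestamp_s, score) pairs in chronological order.

    A near miss satisfies the secondary threshold but not the primary one.
    Two consecutive near misses within `max_gap_s` of each other are treated
    as a failed hotword attempt.
    """
    near_misses = [t for t, score in detections if secondary <= score < primary]
    return any(
        later - earlier <= max_gap_s
        for earlier, later in zip(near_misses, near_misses[1:])
    )

if failed_hotword_attempt([(0.0, 0.6), (4.0, 0.7)]):
    # Stand-in for the "hint" the abstract mentions.
    print("Hint: try speaking the hotword more clearly.")
```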
-
Publication No.: US11257498B2
Publication Date: 2022-02-22
Application No.: US17100109
Filing Date: 2020-11-20
Applicant: Google LLC
Inventor: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.
-
Publication No.: US11256472B2
Publication Date: 2022-02-22
Application No.: US17010694
Filing Date: 2020-09-02
Applicant: Google LLC
Inventor: Dominik Roblek, Blaise Hilary Aguera-Arcas, Thomas W. Hume, Marvin Karl Ritter, Brandon Charles Barbello, Kevin I. Kilgour, Mihajlo Velimirović, Christopher Thornton, Gabriel Oak Taubman, James David Lyon, Jan Heinrich Althaus, Katsiaryna Naliuka, Julian James Odell, Matthew Sharifi, Beat Gfeller
IPC: G06F3/16, G06F16/635, G06F16/683, G06N3/08, G06N20/00
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular reference song. The computing device then outputs an indication of the particular reference song.
-
Publication No.: US20220020365A1
Publication Date: 2022-01-20
Application No.: US16947030
Filing Date: 2020-07-15
Applicant: Google LLC
Inventor: Victor Carbune, Matthew Sharifi
IPC: G10L15/22, G10L15/26, G10L15/183, G10L15/16, G06F3/16
Abstract: User interaction may be supported with an audio presentation by an automated assistant, and in particular with the spoken content of such an audio presentation that is presented at particular points within the audio presentation. Analysis of an audio presentation may be performed to identify one or more entities addressed by, mentioned by, or otherwise associated with the audio presentation, and utterance classification may be performed to determine whether an utterance received during playback of the audio presentation is directed to the audio presentation, and in some instances, to a particular entity and/or point of playback in the audio presentation, thereby enabling a suitable response to be generated to the utterance.