-
Publication No.: US10275464B2
Publication date: 2019-04-30
Application No.: US14133791
Filing date: 2013-12-19
Applicant: Google LLC
Inventor: Matthew Sharifi
Abstract: Methods, systems, and apparatus for receiving a request that includes a user identifier of a user that submitted a search query and an entity identifier of an entity that is referenced by the search query, identifying a plurality of knowledge elements that are related to the entity, identifying, in a consumption database, one or more items that have been indicated as consumed by the user and that are associated with the entity that is referenced by the search query, assigning rank scores to the plurality of knowledge elements, based at least on identifying the one or more items, selecting one or more of the knowledge elements from among the knowledge elements based at least on the rank scores assigned to the knowledge elements, and providing, in response to the request, information associated with the entity and the one or more selected knowledge elements.
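The ranking step in this abstract can be illustrated with a minimal sketch. The abstract does not state how rank scores are computed, so everything here is an assumption: the `KnowledgeElement` type, the `base_score` and `related_items` fields, and the additive `boost` per consumed item are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeElement:
    name: str
    base_score: float          # hypothetical prior relevance of the element
    related_items: frozenset   # hypothetical: item ids this element refers to


def rank_knowledge_elements(elements, consumed_items, boost=1.0, top_k=2):
    """Score each element, boosting those tied to items the user has consumed."""
    scored = []
    for element in elements:
        overlap = len(element.related_items & consumed_items)
        scored.append((element.base_score + boost * overlap, element.name))
    # Highest score first; the name breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


elements = [
    KnowledgeElement("discography", 0.5, frozenset({"album_a", "album_b"})),
    KnowledgeElement("biography", 0.9, frozenset()),
    KnowledgeElement("tour_dates", 0.4, frozenset({"concert_x"})),
]
consumed = frozenset({"album_a", "album_b"})  # stand-in for the consumption database
selected = rank_knowledge_elements(elements, consumed)
```

With no consumed items the order falls back to `base_score` alone, which is one simple way to make the consumption signal purely additive.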
-
Publication No.: US20190108840A1
Publication date: 2019-04-11
Application No.: US16216752
Filing date: 2018-12-11
Applicant: Google LLC
Inventor: Matthew Sharifi
CPC classification number: G10L15/22, G06F3/04842, G06F3/167, G10L15/063, G10L15/08, G10L15/18, G10L15/265, G10L15/30, G10L2015/0631, G10L2015/0638, G10L2015/088, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
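The lookup half of this abstract (identifying pre-computed models for a candidate hotword) might look roughly like the sketch below. Training is not shown. The dictionary of models, the whitespace tokenization, and the function name `models_for_hotword` are assumptions made for illustration only.

```python
def models_for_hotword(candidate, precomputed_models):
    """Return the pre-computed models for each known word in the candidate hotword.

    precomputed_models maps a lowercase word (or sub-word) to an opaque model
    object; here plain strings stand in for trained models.
    """
    words = candidate.lower().split()
    return [precomputed_models[w] for w in words if w in precomputed_models]


# Hypothetical store of models trained ahead of time, keyed by word.
precomputed_models = {"ok": "model_ok", "computer": "model_computer"}
found = models_for_hotword("OK Computer", precomputed_models)
```

Words without a pre-computed model are simply skipped here; a real system would more likely fall back to sub-word models, which the abstract also mentions.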
-
Publication No.: US10248440B1
Publication date: 2019-04-02
Application No.: US15783390
Filing date: 2017-10-13
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, David Petrou
IPC: G06F9/451, G06F3/0484, G06F17/00
Abstract: Systems and methods are provided for automating user input using onscreen content. For example, a method includes receiving a selection of a first screen capture image representing a screen captured on a mobile device associated with a user, the first image having a first timestamp. The method also includes determining, using a data store of images of previously captured screens of the mobile device, a reference image from the data store that has a timestamp prior to the first timestamp, identifying a plurality of images in the data store that have respective timestamps between the timestamp for the reference image and the first timestamp, and providing the reference image, the plurality of images, and the first image to the mobile device.
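The timestamp-based selection described above can be sketched as follows. The abstract does not say how the reference image is chosen, so this sketch assumes a caller-supplied predicate `is_reference` (for example, "this capture shows a home screen") and picks the most recent matching capture before the selected one. The tuple layout and all names are hypothetical.

```python
from bisect import bisect_left


def select_capture_sequence(captures, first_ts, is_reference):
    """Find a reference capture before first_ts and the captures in between.

    captures: list of (timestamp, image_id) tuples sorted by timestamp.
    Returns (reference, intermediate) or (None, []) if no reference exists.
    """
    timestamps = [ts for ts, _ in captures]
    end = bisect_left(timestamps, first_ts)  # first index with timestamp >= first_ts
    # Walk backwards to the most recent capture that qualifies as a reference.
    for i in range(end - 1, -1, -1):
        if is_reference(captures[i][1]):
            return captures[i], captures[i + 1:end]
    return None, []


captures = [(1, "home"), (2, "a"), (3, "home"), (4, "b"), (5, "c"), (6, "selected")]
reference, intermediate = select_capture_sequence(
    captures, first_ts=6, is_reference=lambda image_id: image_id == "home"
)
```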
-
Publication No.: US10170112B2
Publication date: 2019-01-01
Application No.: US15593278
Filing date: 2017-05-11
Applicant: Google LLC
Inventor: Alexander H. Gruenstein, Aleksandar Kracun, Matthew Sharifi
Abstract: A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.
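A rough sketch of the volume check and common-query grouping follows. It is heavily simplified: normalized query strings stand in for audio, a SHA-256 digest stands in for an acoustic fingerprint, and the threshold and `min_share` parameters are assumptions, not values from the patent.

```python
from collections import Counter
import hashlib


def fingerprint(query):
    """Stand-in fingerprint: a digest of the normalized query text."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def find_common_query_fingerprints(voice_requests, window_start, window_end,
                                   volume_threshold, min_share=0.5):
    """If request volume in the window exceeds the threshold, fingerprint
    every query that accounts for at least min_share of those requests."""
    in_window = [q for ts, q in voice_requests if window_start <= ts < window_end]
    if len(in_window) <= volume_threshold:
        return set()  # volume criterion not satisfied, so no analysis is triggered
    counts = Counter(in_window)
    return {fingerprint(q) for q, n in counts.items() if n / len(in_window) >= min_share}


# Hypothetical traffic: (timestamp, normalized query) pairs.
voice_requests = [(t, "ok order pizza") for t in range(8)] + [(3, "weather"), (4, "timer")]
blocklist = find_common_query_fingerprints(voice_requests, 0, 10, volume_threshold=5)
```

A later request would then be checked by computing `fingerprint(query)` and testing membership in `blocklist`.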
-
Publication No.: US20180254045A1
Publication date: 2018-09-06
Application No.: US15909519
Filing date: 2018-03-01
Applicant: Google LLC
Inventor: Matthew Sharifi, Jakob Nicolaus Foerster
CPC classification number: G10L17/02, G06F21/32, G07C9/00071, G07C9/00158, G10L15/1815, G10L15/22, G10L15/285, G10L17/22, G10L19/018, G10L25/51, G10L2015/088
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.
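The comparison step can be illustrated with a small sketch. It assumes fingerprints are fixed-width bit strings compared by Hamming similarity, which the abstract does not specify; the 64-bit width, the 0.95 threshold, and the function names are all hypothetical. One plausible reading of the abstract is that a near-identical match to an earlier utterance suggests a replayed recording, which is why a match leads to access being disabled.

```python
def hamming_similarity(a, b, bits=64):
    """Fraction of matching bits between two fixed-width integer fingerprints."""
    mask = (1 << bits) - 1
    differing = bin((a ^ b) & mask).count("1")
    return 1.0 - differing / bits


def matches_stored_fingerprint(candidate, stored_fingerprints, threshold=0.95):
    """True if the candidate is at least `threshold` similar to any stored fingerprint."""
    return any(hamming_similarity(candidate, s) >= threshold for s in stored_fingerprints)


stored = [0xFFFF0000FFFF0000]          # fingerprint of an earlier hotword utterance
replayed = 0xFFFF0000FFFF0001          # differs from the stored one in a single bit
fresh = 0x0000FFFF0000FFFF             # differs in every bit

disable_for_replayed = matches_stored_fingerprint(replayed, stored)
disable_for_fresh = matches_stored_fingerprint(fresh, stored)
```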
-
Publication No.: US20180166078A1
Publication date: 2018-06-14
Application No.: US15875996
Filing date: 2018-01-19
Applicant: Google LLC
Inventor: Matthew Sharifi
CPC classification number: G10L15/22, G06F3/04842, G06F3/167, G10L15/063, G10L15/08, G10L15/18, G10L15/265, G10L15/30, G10L2015/0631, G10L2015/0638, G10L2015/088, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
-
Publication No.: US20250166628A1
Publication date: 2025-05-22
Application No.: US19035299
Filing date: 2025-01-23
Applicant: Google LLC
Inventor: Victor Carbune, Matthew Sharifi
IPC: G10L15/22, G06F16/9032, G10L15/26, G10L25/78
Abstract: A method includes instructing an always-on first processor to operate in a follow-on query detection mode, and while the always-on first processor operates in the follow-on query detection mode: receiving follow-on audio data captured by the assistant-enabled device; determining, using a voice activity detection (VAD) model executing on the always-on first processor, whether or not the VAD model detects voice activity in the follow-on audio data; performing, using a speaker identification (SID) model executing on the always-on first processor, speaker verification on the follow-on audio data to determine whether the follow-on audio data includes an utterance spoken by the same user. The method also includes initiating a wake-up process on a second processor to determine whether the utterance includes a follow-on query.
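A minimal sketch of the two-stage gating follows, assuming the second processor is woken only when both checks pass (the abstract implies but does not state this). The VAD and SID models are represented by caller-supplied functions; `vad_detects_voice`, `speaker_score`, and the 0.8 threshold are hypothetical.

```python
def should_wake_second_processor(audio, vad_detects_voice, speaker_score,
                                 sid_threshold=0.8):
    """Gate the wake-up: voice activity first, then speaker verification."""
    if not vad_detects_voice(audio):
        return False  # no speech detected, so stay in the low-power mode
    # Speaker verification: is this the same user who issued the initial query?
    return speaker_score(audio) >= sid_threshold


# Toy stand-ins for the models: audio is a dict carrying precomputed values.
def vad(audio):
    return audio["energy"] > 0.1


def sid(audio):
    return audio["similarity_to_enrolled_user"]


same_user = {"energy": 0.6, "similarity_to_enrolled_user": 0.93}
other_user = {"energy": 0.6, "similarity_to_enrolled_user": 0.40}
silence = {"energy": 0.01, "similarity_to_enrolled_user": 0.93}
```

Running the cheaper VAD check first means the speaker model only runs when speech is present, which fits the low-power role of an always-on processor.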
-
Publication No.: US20250162168A1
Publication date: 2025-05-22
Application No.: US19029255
Filing date: 2025-01-17
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Matthew Sharifi
Abstract: Implementations set forth herein relate to a robotic computing device that can perform certain operations, such as communicating between users in a common space, according to certain preferences of the users. When interacting with a particular user, the robotic computing device can perform an operation at a preferred location relative to the particular user based on an express or implied preference of that particular user. For instance, certain types of operations can be performed at a first location within a room, and other types of operations can be performed at a second location within the room. When an operation involves following or guiding a user, parameters for driving the robotic computing device can be selected based on preferences of the user and/or a context in which the robotic computing device is interacting with the user (e.g., whether or not the context indicates some amount of urgency).
-
Publication No.: US20250149022A1
Publication date: 2025-05-08
Application No.: US18837723
Filing date: 2023-02-13
Applicant: Google LLC
Inventor: Zalán Borsos, Marco Tagliasacchi, Matthew Sharifi
Abstract: Provided are systems, methods, and machine learning models for filling in gaps (e.g., of up to one second) in speech samples by leveraging an auxiliary textual input. Example machine learning models described herein can perform speech inpainting with the appropriate content, while maintaining speaker identity, prosody and recording environment conditions, and generalizing to unseen speakers. This approach significantly outperforms baselines constructed using adaptive TTS, as judged by human raters in side-by-side preference and MOS tests.
-
Publication No.: US20250131909A1
Publication date: 2025-04-24
Application No.: US19007920
Filing date: 2025-01-02
Applicant: Google LLC
Inventor: Matthew Sharifi, Jakob Nicolaus Foerster
IPC: G10L13/00, G06F40/253, G06F40/289, G10L13/08
Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.