-
Publication number: US20230352010A1
Publication date: 2023-11-02
Application number: US18221274
Application date: 2023-07-12
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G10L15/197, G10L15/22, G06F16/27, G06N20/00, G06N5/043
CPC classification number: G10L15/197, G10L15/22, G06F16/27, G06N20/00, G06N5/043, G10L15/1815
Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers. A method includes: executing a first instance of an automated assistant in an inactive state at least in part on a first computing device operated by a user; while in the inactive state, receiving, via one or more microphones of the first computing device, audio data that captures a spoken utterance of the user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, performing arbitration with at least one other computing device that is executing at least in part at least one other instance of the automated assistant; and in response to performing arbitration with the at least one other computing device, initiating synchronization of user data or configuration data between the first instance of the automated assistant on the first computing device and the at least one other instance of the automated assistant on the at least one other computing device, the user data comprising data that is based on one or more interactions with the user at the first computing device, the one or more interactions occurring prior to the receiving of the audio data.
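The flow in this abstract (threshold check, arbitration, then synchronization) can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold value, the highest-score arbitration rule, the dictionary-merge synchronization, and every name below (AssistantInstance, handle_audio, and so on) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

HOTWORD_THRESHOLD = 0.8  # assumed value; the abstract does not specify one


@dataclass
class AssistantInstance:
    """Hypothetical stand-in for one assistant instance on one device."""
    device_id: str
    user_data: dict = field(default_factory=dict)
    active: bool = False


def hotword_detected(probability, threshold=HOTWORD_THRESHOLD):
    """True when the model's predicted hotword probability satisfies the threshold."""
    return probability >= threshold


def arbitrate(scores):
    """Assumed rule: highest probability wins; ties go to the smallest device id."""
    return max(sorted(scores), key=lambda device_id: scores[device_id])


def synchronize(instances):
    """Assumed sync strategy: merge every instance's user data and share the result."""
    merged = {}
    for instance in instances:
        merged.update(instance.user_data)
    for instance in instances:
        instance.user_data = dict(merged)


def handle_audio(instances, scores):
    """Arbitrate among devices whose hotword score passed, then sync them."""
    triggered = [i for i in instances if hotword_detected(scores[i.device_id])]
    if len(triggered) < 2:
        return None  # no simultaneous trigger, so nothing to arbitrate or sync
    winner = arbitrate({i.device_id: scores[i.device_id] for i in triggered})
    synchronize(triggered)
    for instance in triggered:
        instance.active = instance.device_id == winner
    return winner
```

In this sketch, synchronization is a side effect of arbitration: only devices that heard the hotword at the same time exchange data, which mirrors the ordering described in the abstract.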
-
Publication number: US11775324B2
Publication date: 2023-10-03
Application number: US17665643
Application date: 2022-02-07
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Matthew Sharifi
Abstract: Automated content switching rules may be generated and/or utilized for automatically switching away from certain interactive content during presentation of that interactive content when one or more switch conditions are met. In some instances, automated content switching rules may define one or more non-temporal switch conditions, e.g., based upon reaching certain points or milestones in certain interactive content, that may be used to initiate actions that switch away from the interactive content. In addition, in some instances, automated content switching rules may be used to not only switch away from particular interactive content, but additionally switch to other interactive content, thereby enabling a user to effectively schedule a workflow across different interactive content, applications and/or other computer-related tasks.
-
Publication number: US11756530B2
Publication date: 2023-09-12
Application number: US17640579
Application date: 2020-09-25
Applicant: GOOGLE LLC
Inventor: Marco Tagliasacchi, Mihajlo Velimirovic, Matthew Sharifi, Dominik Roblek, Christian Frank, Beat Gfeller
IPC: G10L15/06, G10L21/013, G10L25/30, G10L25/90
CPC classification number: G10L15/063, G10L21/013, G10L25/30, G10L25/90
Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
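The training signal and the calibration step described here can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the patented method: an "audio sample" is reduced to its fundamental frequency in Hz, the encoder is a fixed function rather than a trained network, the loss is a plain absolute error, and all function names are hypothetical.

```python
import math


def pitch_shift(freq_hz, semitones):
    """Shift a fundamental frequency by a known number of semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)


def toy_encoder(freq_hz):
    """Toy encoder: pitch in semitones, correct only up to an unknown offset."""
    return 12.0 * math.log2(freq_hz)


def relative_pitch_loss(encoder, freq_hz, shift_a, shift_b):
    """Compare the predicted pitch difference with the known shift difference."""
    predicted = encoder(pitch_shift(freq_hz, shift_a)) - encoder(pitch_shift(freq_hz, shift_b))
    return abs(predicted - (shift_a - shift_b))


def calibrate(encoder, labeled_samples):
    """Estimate the offset mapping relative outputs to absolute pitch values."""
    offsets = [absolute - encoder(freq) for freq, absolute in labeled_samples]
    return sum(offsets) / len(offsets)


# One labeled sample (440 Hz is MIDI note 69) is enough to calibrate the toy encoder.
offset = calibrate(toy_encoder, [(440.0, 69.0)])
absolute_pitch_880 = toy_encoder(880.0) + offset  # close to MIDI note 81
```

Because the loss depends only on the difference between two predictions, training needs no pitch labels; the labeled samples are used only afterwards to fix the absolute offset.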
-
Publication number: US11727925B2
Publication date: 2023-08-15
Application number: US17115484
Application date: 2020-12-08
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
CPC classification number: G10L15/197, G06F16/27, G06N5/043, G06N20/00, G10L15/22, G10L15/1815, G10L2015/088, G10L2015/223
Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers. A method includes: executing a first instance of an automated assistant in an inactive state at least in part on a first computing device operated by a user; while in the inactive state, receiving, via one or more microphones of the first computing device, audio data that captures a spoken utterance of the user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, performing arbitration with at least one other computing device that is executing at least in part at least one other instance of the automated assistant; and in response to performing arbitration with the at least one other computing device, initiating synchronization of user data or configuration data between the first instance of the automated assistant on the first computing device and the at least one other instance of the automated assistant on the at least one other computing device, the user data comprising data that is based on one or more interactions with the user at the first computing device, the one or more interactions occurring prior to the receiving of the audio data.
-
Publication number: US11722731B2
Publication date: 2023-08-08
Application number: US17103908
Application date: 2020-11-24
Applicant: Google LLC
Inventor: Victor Carbune, Matthew Sharifi
IPC: H04N21/422, H04N21/4223, H04N21/433, H04N21/439, H04N21/442, H04N21/458, G06F3/16, G06N3/08, H04N21/45, H04N21/466, H04N21/472
CPC classification number: H04N21/458, H04N21/4396, H04N21/44218, H04N21/4532, H04N21/466, H04N21/47202, H04N21/47217
Abstract: While an assistant-enabled device is playing back media content, a method includes receiving a contextual signal from an environment of the assistant-enabled device and executing an event recognition routine to determine whether the received contextual signal is indicative of an event that conflicts with the playback of the media content from the assistant-enabled device. When the event recognition routine determines that the received contextual signal is indicative of the event that conflicts with the playback of the media content, the method also includes adjusting content playback settings of the assistant-enabled device.
-
Publication number: US20230230578A1
Publication date: 2023-07-20
Application number: US17579949
Application date: 2022-01-20
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Victor Carbune
CPC classification number: G10L15/05, G10L15/22, G10L15/26, G10L2015/227
Abstract: A personalized endpointing measure can be used to determine whether a user has finished speaking a spoken utterance. Various implementations include using the personalized endpointing measure to determine whether a candidate endpoint indicates a user has finished speaking the spoken utterance or whether the user has paused and has not finished speaking the spoken utterance. Various implementations include determining the personalized endpointing measure based on a portion of a text representation of the spoken utterance immediately preceding the candidate endpoint and a user-specific measure. Additionally or alternatively, the user-specific measure can be based on the text representation immediately preceding the candidate endpoint and one or more historical interactions with the user. In various implementations, each of the historical interactions is specific to the text representation and the user, and indicates whether a previous instance of the text representation was a previous endpoint for the user.
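One way to picture the personalized endpointing measure is the Python sketch below. It is illustrative only: the smoothed frequency estimate, the linear blend with a generic score, the threshold, and the names EndpointHistory and is_endpoint are assumptions, since the abstract does not say how the measures are computed or combined.

```python
from collections import defaultdict


class EndpointHistory:
    """Hypothetical per-user record of whether a given text ended an utterance."""

    def __init__(self):
        # text -> [times the text was an endpoint, times the text was seen]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, text, was_endpoint):
        counts = self._counts[text.lower()]
        counts[0] += int(was_endpoint)
        counts[1] += 1

    def user_measure(self, text, prior=0.5):
        """Smoothed fraction of past interactions in which this text was an endpoint."""
        ended, seen = self._counts.get(text.lower(), (0, 0))
        return (ended + prior) / (seen + 1)


def is_endpoint(history, preceding_text, generic_score, threshold=0.5, weight=0.5):
    """Blend the user-specific measure with a generic endpoint score (assumed rule)."""
    personalized = (weight * history.user_measure(preceding_text)
                    + (1.0 - weight) * generic_score)
    return personalized >= threshold


history = EndpointHistory()
for _ in range(3):
    history.record("play some music", True)  # this user usually stops here
    history.record("call", False)            # this user usually keeps talking
```

With this history, the same generic score can yield different decisions for different preceding texts, which is the personalization the abstract describes.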
-
Publication number: US20230223014A1
Publication date: 2023-07-13
Application number: US18188238
Application date: 2023-03-22
Applicant: Google LLC
Inventor: Matthew Sharifi, Aleksandar Kracun
CPC classification number: G10L15/16, G10L15/22, G10L15/28, G10L25/90, G10L2015/088, G10L2025/783
Abstract: A method for optimizing speech recognition includes receiving a first acoustic segment characterizing a hotword detected by a hotword detector in streaming audio captured by a user device, extracting one or more hotword attributes from the first acoustic segment, and adjusting, based on the one or more hotword attributes extracted from the first acoustic segment, one or more speech recognition parameters of an automated speech recognition (ASR) model. After adjusting the speech recognition parameters of the ASR model, the method also includes processing, using the ASR model, a second acoustic segment to generate a speech recognition result. The second acoustic segment characterizes a spoken query/command that follows the first acoustic segment in the streaming audio captured by the user device.
-
Publication number: US11688392B2
Publication date: 2023-06-27
Application number: US17115742
Application date: 2020-12-08
Applicant: Google LLC
Inventor: Matthew Sharifi, Aleksandar Kracun
CPC classification number: G10L15/16, G10L15/05, G10L2015/088
Abstract: A method for detecting freeze words includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device associated with the user. The method also includes processing, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation. The speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data. Before the predetermined duration of non-speech, the method includes detecting a freeze word in the audio data. In response to detecting the freeze word in the audio data, the method also includes triggering a hard microphone closing event at the user device. The hard microphone closing event prevents the user device from capturing any audio subsequent to the freeze word.
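The difference between ordinary endpointing and a freeze-word hard close can be shown with a small Python sketch. It rests on simplifying assumptions: audio is modeled as a list of already-recognized words with None for a non-speech frame, the silence limit is counted in frames, and the function name and return labels are hypothetical.

```python
def capture_utterance(frames, freeze_words, max_silent_frames=3):
    """Collect words until silence-based endpointing or a freeze word closes the mic.

    frames: iterable of recognized words, with None marking a non-speech frame.
    Returns (captured_words, reason).
    """
    captured = []
    silent = 0
    for frame in frames:
        if frame is None:
            silent += 1
            if silent >= max_silent_frames:
                return captured, "endpoint"  # predetermined non-speech duration reached
            continue
        silent = 0
        if frame in freeze_words:
            # Hard microphone close: nothing after the freeze word is captured.
            return captured, "hard_close"
        captured.append(frame)
    return captured, "stream_ended"


words, reason = capture_utterance(
    ["set", "timer", None, "ten", "minutes", "done", "ignored"],
    freeze_words={"done"},
)
```

Here the freeze word closes the microphone immediately, without waiting for the silence limit, so the trailing word is never captured.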
-
Publication number: US11677995B2
Publication date: 2023-06-13
Application number: US17373040
Application date: 2021-07-12
Applicant: Google LLC
Inventor: Matthew Sharifi
IPC: H04N21/235, H04N21/234, H04N21/25, H04N21/466, H04N21/84
CPC classification number: H04N21/2353, H04N21/23418, H04N21/251, H04N21/4668, H04N21/84
Abstract: Systems and methods for matching live media content are disclosed. At a server, obtaining first media content from a client device, wherein the first media content corresponds to a portion of media content being played on the client device, and the first media content is associated with a predefined expiration time; obtaining second media content from one or more content feeds, wherein the second media content also corresponds to a portion of the media content being played on the client device; in accordance with a determination that the second media content corresponds to a portion of the media content that has been played on the client device: before the predefined expiration time, obtaining third media content corresponding to the media content being played on the client device, from the one or more content feeds; and comparing the first media content with the third media content.
-
Publication number: US20230168101A1
Publication date: 2023-06-01
Application number: US17057074
Application date: 2020-08-18
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Matthew Sharifi
IPC: G01C21/36
CPC classification number: G01C21/3647, G01C21/3641, G01C21/3629
Abstract: To present a navigation directions preview, a server device receives a request for navigation directions from a starting location to a destination location and generates a set of navigation directions in response to the request. The set of navigation directions includes a set of route segments for traversing from the starting location to the destination location. The server device selects a subset of the route segments based on characteristics of each route segment in the set of route segments. For each selected route segment, the server device provides a preview of the route segment to be displayed on a client device. The preview of the route segment includes panoramic street level imagery depicting the route segment.