-
201.
Publication No.: US20220189474A1
Publication Date: 2022-06-16
Application No.: US17122875
Filing Date: 2020-12-15
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G10L15/22, G10L15/30, G10L15/18, G06F3/0488, G10L25/51, G06F3/16, G06F3/0481
Abstract: Implementations described herein receive audio data that captures a spoken utterance, generate, based on processing the audio data, a recognition that corresponds to the spoken utterance, and determine, based on processing the recognition, that the spoken utterance is ambiguous (i.e., is interpretable as requesting performance of a first particular action exclusively and is also interpretable as requesting performance of a second particular action exclusively). In response to determining that the spoken utterance is ambiguous, implementations determine to provide an enhanced clarification prompt that renders output that is in addition to natural language. The enhanced clarification prompt solicits further user interface input for disambiguating between the first particular action and the second particular action. Determining to provide the enhanced clarification prompt includes a current or prior determination to provide the enhanced clarification prompt instead of a natural language (NL) only clarification prompt that is restricted to rendering natural language.
-
Publication No.: US20220180868A1
Publication Date: 2022-06-09
Application No.: US17247334
Filing Date: 2020-12-08
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: A method for streaming action fulfillment receives audio data corresponding to an utterance where the utterance includes a query to perform an action that requires performance of a sequence of sub-actions in order to fulfill the action. While receiving the audio data, but before receiving an end of speech condition, the method processes the audio data to generate intermediate automated speech recognition (ASR) results, performs partial query interpretation on the intermediate ASR results to determine whether the intermediate ASR results identify an application type needed to perform the action and, when the intermediate ASR results identify a particular application type, performs a first sub-action in the sequence of sub-actions by launching a first application to execute on the user device where the first application is associated with the particular application type. The method, in response to receiving an end of speech condition, fulfills performance of the action.
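A minimal sketch of the streaming flow described in this abstract, assuming the intermediate ASR results arrive as a sequence of growing partial transcripts. The keyword table `APP_TYPE_KEYWORDS`, the function name `stream_fulfillment`, and the string event log are hypothetical names chosen for illustration and do not come from the patent.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from a trigger word to an application type.
APP_TYPE_KEYWORDS: Dict[str, str] = {
    "play": "media_player",
    "navigate": "maps",
    "message": "messaging",
}


def stream_fulfillment(partial_results: Iterable[str]) -> List[str]:
    """Return the ordered events produced while an utterance streams in.

    Each item of partial_results is an intermediate ASR transcript.
    Exhausting the iterable stands in for the end of speech condition.
    """
    events: List[str] = []
    launched: Optional[str] = None
    final_transcript = ""
    for partial in partial_results:
        final_transcript = partial
        if launched is not None:
            continue  # the first sub-action has already been performed
        # Partial query interpretation: look for an application type.
        for word in partial.lower().split():
            app_type = APP_TYPE_KEYWORDS.get(word)
            if app_type is not None:
                launched = app_type
                events.append(f"launch:{app_type}")
                break
    # End of speech: fulfill the full action with the final transcript.
    events.append(f"fulfill:{final_transcript}")
    return events
```

In this sketch the application launch happens as soon as a partial transcript names an application type, so the launch latency overlaps with the remainder of the utterance.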
-
Publication No.: US11354342B2
Publication Date: 2022-06-07
Application No.: US16608628
Filing Date: 2018-10-18
Applicant: Google LLC
Inventor: Victor Carbune, Pedro Gonnet Anders
IPC: G06F16/93, G06F16/33, G06F16/332, G06F16/338, G06F40/40, H04L51/02, G06V30/418, G06F3/0482
Abstract: Techniques are described herein for determining an information gain score for one or more documents of interest to the user and presenting information from the documents based on the information gain score. An information gain score for a given document is indicative of additional information that is included in the document beyond information contained in documents that were previously viewed by the user. In some implementations, the information gain score may be determined for one or more documents by applying data from the documents across a machine learning model to generate an information gain score. Based on the information gain scores of a set of documents, the documents can be provided to the user in a manner that reflects the likely information gain that can be attained by the user if the user were to view the documents.
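The abstract obtains the score from a machine learning model. As a rough illustration of the idea only, the sketch below swaps in a much simpler stand-in: the fraction of a candidate document's distinct terms that did not appear in any previously viewed document. The function names and the whitespace tokenization are assumptions made for this example.

```python
from typing import List, Sequence


def information_gain_score(candidate: str, viewed: Sequence[str]) -> float:
    """Fraction of the candidate's distinct terms not seen in viewed documents.

    This term-novelty ratio is a simple stand-in for the machine learning
    model mentioned in the abstract, not the patented method.
    """
    seen = set()
    for doc in viewed:
        seen.update(doc.lower().split())
    terms = set(candidate.lower().split())
    if not terms:
        return 0.0
    return len(terms - seen) / len(terms)


def rank_by_information_gain(candidates: Sequence[str],
                             viewed: Sequence[str]) -> List[str]:
    """Order candidates so the most novel document comes first."""
    return sorted(candidates,
                  key=lambda doc: information_gain_score(doc, viewed),
                  reverse=True)
```

A document that repeats what the user has already read scores near 0, while one made up of unseen terms scores near 1, which is the ordering signal the abstract describes.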
-
204.
Publication No.: US20220171813A1
Publication Date: 2022-06-02
Application No.: US17107286
Filing Date: 2020-11-30
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G06F16/903, G06F21/62
Abstract: Implementations are directed to receiving a search query from a user, obtaining environmental signal(s) associated with an environment in which the user is located when the search query is received, processing the environmental signal(s) to generate a privacy measure associated with submission of the search query, obtaining additional environmental signal(s) associated with the environment in which the user is located when user input directed to a search interface is received, processing the additional environmental signal(s) to generate an additional privacy measure associated with the user input, selecting, from a superset of historical search queries of the user, a subset of the historical search queries based on at least the privacy measure and the additional privacy measure, and causing the subset of the historical search queries to be presented to the user in response to receiving the user input directed to the search interface.
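One way to picture the selection step, under stated assumptions: each historical query stores the privacy measure computed when it was submitted (here a float in [0, 1], where 1.0 means a fully private environment), and a query is shown only when the current environment is at least as private. The class `HistoricalQuery`, the function `select_history`, and the comparison rule are hypothetical; the abstract does not specify how the two measures are combined.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class HistoricalQuery:
    text: str
    # Assumed scale: 0.0 = public setting, 1.0 = fully private setting.
    privacy_at_submission: float


def select_history(history: Sequence[HistoricalQuery],
                   current_privacy: float) -> List[HistoricalQuery]:
    """Keep queries whose submission-time privacy measure does not exceed
    the privacy measure of the environment where the search interface is
    being used now (the "additional privacy measure" in the abstract)."""
    return [q for q in history if q.privacy_at_submission <= current_privacy]
```

Under this rule, a query typed while alone at home would not be suggested when the same user opens the search interface in a crowded room.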
-
Publication No.: US20220165253A1
Publication Date: 2022-05-26
Application No.: US17103878
Filing Date: 2020-11-24
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G10L15/06, G10L21/0232, G10L15/22, G10L15/20, G10L15/30, G10L15/187
Abstract: A method of training a speech model includes receiving, at a voice-enabled device, a fixed set of training utterances where each training utterance in the fixed set of training utterances includes a transcription paired with a speech representation of the corresponding training utterance. The method also includes sampling noisy audio data from an environment of the voice-enabled device. For each training utterance in the fixed set of training utterances, the method further includes augmenting, using the noisy audio data sampled from the environment of the voice-enabled device, the speech representation of the corresponding training utterance to generate noisy audio samples and pairing each of the noisy audio samples with the corresponding transcription of the corresponding training utterance. The method additionally includes training a speech model on the noisy audio samples generated for each speech representation in the fixed set of training utterances.
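The augmentation step can be sketched as follows, assuming speech representations are plain lists of audio samples and that noise is mixed in at a chosen signal-to-noise ratio. The function names, the SNR-based mixing rule, and the default SNR values are assumptions for illustration; the abstract does not state how the noisy audio is combined with the speech.

```python
import math
from typing import List, Sequence, Tuple


def mix_at_snr(speech: Sequence[float], noise: Sequence[float],
               snr_db: float) -> List[float]:
    """Add noise to speech, scaling the noise to reach the target SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Repeat the sampled noise so it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in tiled) / len(tiled)
    if noise_power == 0.0:
        return list(speech)  # silent noise leaves the speech unchanged
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, tiled)]


def augment_training_set(
    utterances: Sequence[Tuple[str, Sequence[float]]],
    noise: Sequence[float],
    snrs_db: Sequence[float] = (0.0, 10.0, 20.0),
) -> List[Tuple[List[float], str]]:
    """Pair every noisy variant with the transcription of its source utterance."""
    pairs: List[Tuple[List[float], str]] = []
    for transcription, speech in utterances:
        for snr_db in snrs_db:
            pairs.append((mix_at_snr(speech, noise, snr_db), transcription))
    return pairs
```

The returned pairs would then be fed to whatever training loop the device uses; that loop is outside the scope of this sketch.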
-
Publication No.: US20220156130A1
Publication Date: 2022-05-19
Application No.: US17588637
Filing Date: 2022-01-31
Applicant: Google LLC
Inventor: Sandro Feuz, Victor Carbune
IPC: G06F9/54, H04L51/224
Abstract: Implementations set forth herein relate to intervening notifications provided by an application for mitigating computationally wasteful application launching behavior that is exhibited by some users. A state of a module of a target application can be identified by emulating user inputs previously provided by the user to the target application. In this way, the state of the module can be determined without visibly launching the target application. When the state of the module is determined to satisfy criteria for providing a notification to the user, the application can render a notification for the user. The application can provide intervening notifications for a variety of different target applications in order to reduce a frequency at which the user launches and closes applications to check for variations in target application content.
-
Publication No.: US20220139388A1
Publication Date: 2022-05-05
Application No.: US17086296
Filing Date: 2020-10-30
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
-
Publication No.: US11315575B1
Publication Date: 2022-04-26
Application No.: US17069565
Filing Date: 2020-10-13
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
-
Publication No.: US20220122599A1
Publication Date: 2022-04-21
Application No.: US17100013
Filing Date: 2020-11-20
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G10L15/22, G06F3/0488, G06F3/16, G10L25/78
Abstract: Implementations set forth herein relate to suggesting an alternate interface modality when an automated assistant and/or a user is expected to not understand a particular interaction between the user and the automated assistant. In some instances, the automated assistant can pre-emptively determine that a forthcoming and/or ongoing interaction between a user and an automated assistant may experience interference. Based on this determination, the automated assistant can provide an indication that the interaction may not be successful and/or that the user should interact with the automated assistant through a different modality. For example, the automated assistant can render a keyboard interface at a portable computing device when the automated assistant determines that an audio interface of the portable computing device is experiencing interference.
-
Publication No.: US11256992B2
Publication Date: 2022-02-22
Application No.: US16622555
Filing Date: 2019-06-25
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Sandro Feuz
IPC: G06N5/02, G06F40/35, G06F40/205, G06F40/295, H04L51/046
Abstract: Techniques and a framework are described herein for constructing and/or updating, e.g., on top of a general-purpose knowledge graph, an “event-specific provisional knowledge graph.” In various implementations, live data stream(s) may be analyzed to identify entity(s) associated with a developing event. The entity(s) may form part of a general-purpose knowledge graph that includes entity nodes and edges between the entity nodes. Based on the identified one or more entities, an event-specific provisional knowledge graph may be constructed or updated in association with the developing event. In some implementations, the event-specific provisional knowledge graph may be queried for new information about the developing event. Computing devices may be caused to render, as output, the new information.
-