-
31.
Publication No.: US20220366911A1
Publication date: 2022-11-17
Application No.: US17337804
Filing date: 2021-06-03
Applicant: GOOGLE LLC
Inventors: Victor Carbune, Krishna Sapkota, Behshad Behzadi, Julia Proskurnia, Jacopo Sannazzaro Natta, Justin Lu, Magali Boizot-Roche, Márius Sajgalík, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations, without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
Publication No.: US20220355814A1
Publication date: 2022-11-10
Application No.: US17273673
Filing date: 2020-11-18
Applicant: GOOGLE LLC
Inventors: Matthew Sharifi, Victor Carbune
Abstract: To identify driving event sounds during navigation, a client device in a vehicle provides a set of navigation directions for traversing from a starting location to a destination location along a route. During navigation to the destination location, the client device identifies audio that includes a driving event sound from within the vehicle or an area surrounding the vehicle. In response to determining that the audio includes the driving event sound, the client device determines whether the driving event sound is artificial. In response to determining that the driving event sound is artificial, the client device presents a notification to the driver indicating that the driving event sound is artificial or masks the driving event sound to prevent the driver from hearing the driving event sound.
-
Publication No.: US11462219B2
Publication date: 2022-10-04
Application No.: US17086296
Filing date: 2020-10-30
Applicant: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
IPC classification: G10L15/00, G10L15/22, G10L15/02, G10L21/0208, G10L25/78, G10L25/87, G10L21/0272
Abstract: A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
-
34.
Publication No.: US11366812B2
Publication date: 2022-06-21
Application No.: US16621109
Filing date: 2019-06-25
Applicant: Google LLC
Inventors: Victor Carbune, Sandro Feuz
IPC classification: G06F17/00, G06F16/2455, G06F16/953, G06N20/00, G06F16/901
Abstract: Techniques and a framework are described herein for gathering information about developing events from multiple live data streams and pushing new pieces of information to interested individuals as those pieces of information are learned. In various implementations, a plurality of live data streams may be monitored. Based on the monitoring, a data structure that models diffusion of information through a population may be generated and applied as input across a machine learning model to generate output. The output may be indicative of a likelihood of occurrence of a developing event and/or a predicted measure of relevancy of the developing event to a particular user. Based on a determination that the likelihood and/or measure of relevancy satisfies a criterion, one or more computing devices may render, as output, information about the developing event.
-
Publication No.: US20220189465A1
Publication date: 2022-06-16
Application No.: US17117799
Filing date: 2020-12-10
Applicant: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
Abstract: A method includes receiving audio data corresponding to an utterance spoken by a user that includes a command for a digital assistant to perform a long-standing operation, activating a set of one or more warm words associated with a respective action for controlling the long-standing operation, and associating the activated set of one or more warm words with only the user. While the digital assistant is performing the long-standing operation, the method includes receiving additional audio data corresponding to an additional utterance, identifying one of the warm words from the activated set of warm words, and performing speaker verification on the additional audio data. The method further includes performing the respective action associated with the identified one of the warm words for controlling the long-standing operation when the additional utterance was spoken by the same user that is associated with the activated set of one or more warm words.
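The warm-word flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `WarmWordSession`, the use of cosine similarity between speaker embeddings as the verification step, the 0.8 threshold, and the action names are all assumptions made for illustration.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class WarmWordSession:
    """Hypothetical holder for warm words tied to the user who started an operation."""

    def __init__(
        self,
        owner_embedding: Sequence[float],
        actions: Dict[str, str],
        threshold: float = 0.8,  # assumed verification threshold
    ) -> None:
        self.owner_embedding = owner_embedding
        self.actions = actions  # warm word -> action name
        self.threshold = threshold

    def handle(self, transcript: str, speaker_embedding: Sequence[float]) -> Optional[str]:
        """Return the action for the first warm word found, only if the speaker matches."""
        for word in transcript.lower().split():
            action = self.actions.get(word)
            if action is None:
                continue
            # Assumed speaker verification: compare embeddings against the owner.
            if cosine_similarity(self.owner_embedding, speaker_embedding) >= self.threshold:
                return action
            return None
        return None


session = WarmWordSession([1.0, 0.0], {"pause": "pause_timer", "stop": "stop_timer"})
print(session.handle("please pause", [0.9, 0.1]))
print(session.handle("pause", [0.0, 1.0]))
```

In this sketch an utterance from a different speaker is ignored even when it contains an active warm word, which mirrors the abstract's condition that the action runs only for the associated user.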
-
Publication No.: US20220180866A1
Publication date: 2022-06-09
Application No.: US17111467
Filing date: 2020-12-03
Applicant: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
Abstract: A method for decaying speech processing includes receiving, at a voice-enabled device, an indication of a microphone trigger event indicating a possible interaction with the device through speech where the device has a microphone that, when open, is configured to capture speech for speech recognition. In response to receiving the indication of the microphone trigger event, the method also includes instructing the microphone to open or remain open for a duration window to capture an audio stream in an environment of the device and providing the audio stream captured by the open microphone to a speech recognition system. During the duration window, the method further includes decaying a level of the speech recognition processing based on a function of the duration window and instructing the speech recognition system to use the decayed level of speech recognition processing over the audio stream captured by the open microphone.
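As a rough illustration of "decaying a level of the speech recognition processing based on a function of the duration window", the Python sketch below uses a linear decay. The abstract does not say which function is used, so the linear shape, the function name `decayed_level`, and the `full_level` and `floor` parameters are assumptions.

```python
def decayed_level(
    elapsed_s: float,
    window_s: float,
    full_level: float = 1.0,
    floor: float = 0.0,
) -> float:
    """Linearly decay a processing level from full_level to floor across the window.

    Linear decay is an assumption; the source only says the level is a
    function of the duration window.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    # Clamp the elapsed fraction to [0, 1] so the level never leaves its range.
    fraction = min(max(elapsed_s / window_s, 0.0), 1.0)
    return full_level - (full_level - floor) * fraction


# Level at the start, midpoint, and end of a 10-second open-microphone window.
for t in (0.0, 5.0, 10.0):
    print(t, decayed_level(t, 10.0))
```

A caller could map the returned level to a concrete setting, such as a smaller recognition model or a lower sampling rate, but that mapping is outside what the abstract describes.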
-
Publication No.: US11355125B2
Publication date: 2022-06-07
Application No.: US16618589
Filing date: 2018-08-06
Applicant: Google LLC
Abstract: Implementing and applying an adaptive and self-training CAPTCHA (“Completely Automated Public Turing test to tell Computers and Humans Apart”) assistant that distinguishes between a computer-generated communication (e.g., speech and/or typed) and communication that originates from a human. The CAPTCHA assistant utilizes a generative adversarial network that is self-training and includes a generator to generate synthetic answers and a discriminator to distinguish between human answers and synthetic answers. The trained discriminator is applied to potentially malicious remote entities, which are provided challenge phrases. Answers from the remote entities are provided to the discriminator to predict whether the answer originated from a human or was computer-generated.
-
Publication No.: US20220157318A1
Publication date: 2022-05-19
Application No.: US17098013
Filing date: 2020-11-13
Applicant: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
Abstract: Implementations are directed to dynamically adapting which assistant on-device model(s) are locally stored at assistant devices of an assistant device group and/or dynamically adapting the assistant processing role(s) of the assistant device(s) of the assistant device group. In some of those implementations, the corresponding on-device model(s) and/or corresponding processing role(s), for each of the assistant devices of the group, is determined based on collectively considering individual processing capabilities of the assistant devices of the group. Implementations are additionally or alternatively directed to cooperatively utilizing assistant devices of a group, and their associated post-adaptation on-device model(s) and/or post-adaptation processing role(s), in cooperatively processing assistant requests that are directed to any one of the assistant devices of the group.
-
Publication No.: US20220147227A1
Publication date: 2022-05-12
Application No.: US17581390
Filing date: 2022-01-21
Applicant: GOOGLE LLC
Inventors: Victor Carbune, Daniel Keysers, Thomas Deselaers
IPC classification: G06F3/04817, G06F3/0481, G06F3/0482, G06V40/10, G06Q30/06, G06Q50/12, G06F9/451
Abstract: Systems and methods enable a computing system to recognize a sequence of repeated actions and offer to automatically repeat any such recognized actions. An example method includes determining a current sequence of user actions is similar to a previous sequence of user actions, determining whether the previous sequence is reproducible and, when reproducible, initiating display of a prompt that requests approval for completing the current sequence based on the previous sequence and, responsive to receiving an indication of approval, completing the previous sequence. Another example method includes determining that a first current sequence of user interactions is complete and is not similar to any saved sequence of user interactions, saving the first current sequence as a previous sequence, identifying a second sequence as satisfying a similarity threshold with the previous sequence, and initiating display of a prompt that requests approval for saving the previous sequence as a shortcut.
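The first example method in this abstract (detect a similar in-progress sequence, check reproducibility, then offer to complete it) can be sketched in Python as follows. This is an illustrative sketch only: the abstract does not specify a similarity measure, so the use of `difflib.SequenceMatcher`, the 0.8 threshold, the function names, and the action labels are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Sequence


def sequence_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1] between two action sequences (assumed measure)."""
    return SequenceMatcher(None, list(a), list(b)).ratio()


def propose_completion(
    current: Sequence[str],
    previous: Sequence[str],
    reproducible: bool,
    threshold: float = 0.8,  # assumed similarity threshold
) -> Optional[List[str]]:
    """Return the remaining actions to offer the user, or None if no offer applies."""
    if not reproducible or not current or len(current) >= len(previous):
        return None
    # Compare the in-progress sequence with the same-length start of the saved one.
    prefix = previous[: len(current)]
    if sequence_similarity(current, prefix) < threshold:
        return None
    return list(previous[len(current):])


saved = ["open_app", "search", "select", "add_to_cart", "checkout"]
print(propose_completion(["open_app", "search", "select"], saved, reproducible=True))
```

In a real system the returned actions would only be executed after the user approves the prompt, as the abstract requires; the sketch stops at producing the proposal.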
-
40.
Publication No.: US11314930B2
Publication date: 2022-04-26
Application No.: US16730377
Filing date: 2019-12-30
Applicant: Google LLC
Inventors: Victor Carbune, Thomas Deselaers
IPC classification: G06F40/169, G06F16/93, G06F40/20, G06N3/04
Abstract: Implementations described herein determine, for a given document generated by a given source, one or more portions of content (e.g., phrase(s), image(s), paragraph(s), etc.) of the given document that may be influenced by a source perspective of the given source. Further, implementations determine one or more additional resources that are related to the given source and that are related to the portion(s) of content of the given document. Yet further, implementations utilize the additional resource(s) to determine additional content that provides context for the portion(s) that may be influenced by a source perspective. A relationship, between the additional resource(s) and the portions of the given document, can be defined. Based on the relationship being defined, the additional content can be caused to be rendered at a client device in response to the client device accessing the given document.
-