-
Publication No.: US11948572B2
Publication date: 2024-04-02
Application No.: US17971997
Filing date: 2022-10-24
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets
IPC classification: G10L15/22 , G06F40/205 , G10L13/027 , G10L15/08 , G10L15/18 , G10L15/30 , G10L21/003 , G10L21/0316 , H04L65/1069
CPC classification: G10L15/22 , G06F40/205 , G10L13/027 , G10L15/1815 , G10L15/1822 , G10L15/30 , G10L21/003 , G10L21/0316 , H04L65/1069 , G10L2015/088 , G10L2015/223
Abstract: Modulating packetized audio signals in a voice activated data packet based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword, and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal, and transmit the modulated output signal to the device.
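The two-portion output signal in this abstract can be illustrated with a minimal sketch. The abstract does not say what kind of modulation is applied, so the amplitude scaling below is an assumption, and `OutputSignal`, `apply_gain`, and `modulate_second_portion` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutputSignal:
    # Hypothetical container: samples are floats in the range [-1.0, 1.0].
    first_portion: List[float]   # audio for the action data structure response
    second_portion: List[float]  # audio for the content item object


def apply_gain(samples: List[float], gain: float) -> List[float]:
    """Scale each sample by `gain`, clipping the result to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def modulate_second_portion(signal: OutputSignal, gain: float = 0.5) -> OutputSignal:
    """Return a new signal whose second portion is amplitude-scaled.

    Assumption: modulation is modeled as a simple gain change; the first
    portion is copied through unchanged.
    """
    return OutputSignal(
        first_portion=list(signal.first_portion),
        second_portion=apply_gain(signal.second_portion, gain),
    )
```

Modulating only one portion leaves the other untouched, so a listener could tell the two parts of the response apart by how they sound.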
-
Publication No.: US20230274205A1
Publication date: 2023-08-31
Application No.: US18135579
Filing date: 2023-04-17
Applicant: Google LLC
IPC classification: G06Q10/0631 , G10L15/22
CPC classification: G06Q10/063112 , G10L15/22 , G10L2015/223 , G06F16/951
Abstract: An example method includes receiving, by one or more processors, a representation of an utterance spoken at a computing device; identifying, by a first computational agent from a plurality of computational agents and based on the utterance, a multi-element task to be performed, wherein the plurality of computational agents includes one or more first-party computational agents and a plurality of third-party computational agents; and performing, by the first computational agent, a first sub-set of elements of the multi-element task, wherein performing the first sub-set of elements comprises selecting a second computational agent from the plurality of computational agents to perform a second sub-set of elements of the multi-element task.
-
Publication No.: US20230111040A1
Publication date: 2023-04-13
Application No.: US17971997
Filing date: 2022-10-24
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets
IPC classification: G10L15/22 , H04L65/1069 , G10L15/18 , G10L15/30 , G10L21/003 , G10L21/0316 , G10L13/027 , G06F40/205
Abstract: Modulating packetized audio signals in a voice activated data packet based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword, and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal, and transmit the modulated output signal to the device.
-
Publication No.: US11625402B2
Publication date: 2023-04-11
Application No.: US16915231
Filing date: 2020-06-29
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets
IPC classification: G10L15/00 , G06F16/2455 , G10L15/18 , G10L15/30 , G10L15/22 , G06F16/242 , G10L15/08
Abstract: Systems and methods of voice activated thread management in a voice activated data packet based environment are provided. A natural language processor (“NLP”) component can receive and parse data packets comprising a first input audio signal to identify a first request and a first trigger keyword. A direct action application programming interface (“API”) can generate a first action data structure with a parameter defining a first action. The NLP component can receive and parse a second input audio signal to identify a second request and a second trigger keyword, and can generate a second action data structure with a parameter defining a second action. A pooling component can combine the first and second action data structures into a pooled data structure, and can transmit the pooled data structure to a service provider computing device to cause the service provider computing device to perform an operation defined by the pooled data structure.
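The pooling step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify how the pooled data structure is laid out, so the field names, the `pool` function, and the idea of recording parameters that both actions share are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActionDataStructure:
    # Hypothetical fields; the abstract only mentions a trigger keyword,
    # a request, and a parameter defining the action.
    trigger_keyword: str
    request: str
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class PooledDataStructure:
    actions: List[ActionDataStructure]
    shared_parameters: Dict[str, str]


def pool(first: ActionDataStructure, second: ActionDataStructure) -> PooledDataStructure:
    """Combine two action data structures into one pooled structure.

    Assumption: parameters with the same key and value in both actions
    are recorded once as shared parameters.
    """
    shared = {
        key: value
        for key, value in first.parameters.items()
        if key in second.parameters and second.parameters[key] == value
    }
    return PooledDataStructure(actions=[first, second], shared_parameters=shared)
```

For example, a ride request and a dinner reservation with the same `location` parameter would yield a pooled structure with both actions and `location` listed as shared.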
-
Publication No.: US11194893B2
Publication date: 2021-12-07
Application No.: US15862963
Filing date: 2018-01-05
Applicant: Google LLC
Inventors: Ken Krieger , Andrew Joseph Alexander Gildfind , Nicholas Salvatore Arini , Simon Michael Rowe , Raimundo Mirisola , Gaurav Bhaya , Robert Stets
IPC classification: H04N21/422 , H04N21/436 , G06F21/32 , H04L29/06 , G06K9/00 , G10L17/24 , G06F21/34 , G06F21/31 , G06F21/35 , H04N21/4223 , H04N21/442 , G10L17/00
Abstract: The present disclosure is generally directed to a data processing system for authenticating packetized audio signals in a voice activated computer network environment. The data processing system can improve the efficiency and effectiveness of auditory data packet transmission over one or more computer networks by, for example, disabling malicious transmissions prior to their transmission across the network. The present solution can also improve computational efficiency by disabling remote computer processes possibly affected by or caused by the malicious audio signal transmissions. By disabling the transmission of malicious audio signals, the system can reduce bandwidth utilization by not transmitting the data packets carrying the malicious audio signal across the networks.
-
Publication No.: US20210365628A1
Publication date: 2021-11-25
Application No.: US17393250
Filing date: 2021-08-03
Applicant: Google LLC
Inventors: Boon-Lock Yeo , Xuemei Gu , Gangjiang Li , Gaurav Bhaya , Robert Stets
IPC classification: G06F40/134 , G06K9/00 , G06F16/432 , G06F16/583 , G06F40/279
Abstract: Systems and methods for extracting audiovisual features from images and other digital components. A data processing system can extract image data and image features from an input image. The data processing system can match the image features to the image features of a plurality of images to identify candidate images. A second image can be selected from the candidate images based on a request that the data processing system received with the input image.
-
Publication No.: US11087760B2
Publication date: 2021-08-10
Application No.: US16696622
Filing date: 2019-11-26
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets
IPC classification: G10L15/00 , G10L15/22 , H04L12/825 , G10L15/26 , G06F3/16 , G10L15/14 , G10L15/18 , G10L15/30 , G10L15/08
Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
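The interface selection step in this abstract can be sketched in a few lines. The abstract says only that the selection is based on the resource utilization values; choosing the candidate with the lowest value is an assumption made here, and `CandidateInterface` and `select_interface` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateInterface:
    # Hypothetical record for one candidate interface.
    name: str
    modality: str                # e.g. "audio" or "display"
    resource_utilization: float  # lower means cheaper to use (assumption)


def select_interface(candidates: List[CandidateInterface]) -> CandidateInterface:
    """Pick the candidate with the lowest resource utilization value.

    On a tie, the earliest candidate in the list is returned.
    """
    if not candidates:
        raise ValueError("at least one candidate interface is required")
    return min(candidates, key=lambda c: c.resource_utilization)
```

The modality of the selected interface would then determine how the content item is converted before it is transmitted.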
-
Publication No.: US20210097997A1
Publication date: 2021-04-01
Application No.: US17104645
Filing date: 2020-11-25
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets
IPC classification: G10L15/22 , G10L15/26 , G06F3/16 , G06F16/332 , G06F16/33 , G06F40/40 , G06F40/284 , G10L15/00 , G10L15/14 , G10L15/18
Abstract: Optimization of sequence dependent operations in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. A prediction component can determine a thread based on the trigger keyword and the request that includes a first action, a second action subsequent to the first action, and a third action subsequent to the second action. A content selector component can select, based on the third action and the trigger keyword, a content item. An audio signal generator component can generate an output signal comprising the content item. An interface can transmit the output signal to cause a client computing device to drive a speaker to generate an acoustic wave corresponding to the output signal prior to occurrence of at least one of the first action and the second action.
-
Publication No.: US10853747B2
Publication date: 2020-12-01
Application No.: US15815353
Filing date: 2017-11-16
Applicant: Google LLC
Inventors: Bo Wang , Lei Zhong , Barnaby John James , Saisuresh Krishnakumaran , Robert Stets , Bogdan Caprita , Valerie Nygaard
IPC classification: G10L15/22 , G06Q10/06 , G10L15/08 , G06F16/951 , G10L13/00
Abstract: An example method includes receiving, by a computational assistant executing at one or more processors, a representation of an utterance spoken at a computing device; identifying, based on the utterance, a task to be performed; determining a capability level of a first party (1P) agent to perform the task; determining capability levels of respective third party (3P) agents of a plurality of 3P agents to perform the task; responsive to determining that the capability level of the 1P agent does not satisfy a threshold capability level, that a capability level of a particular 3P agent of the plurality of 3P agents is a greatest of the determined capability levels, and that the capability level of the particular 3P agent satisfies the threshold capability level, selecting the particular 3P agent to perform the task; and performing one or more actions determined by the selected agent to perform the task.
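The selection rule in this abstract can be written out as a short sketch. Several details are assumptions rather than statements from the abstract: "satisfies a threshold" is read as greater than or equal to, the 1P agent is chosen whenever it meets the threshold, and `None` is returned when no agent qualifies. The function name and the `"1P"` return label are hypothetical.

```python
from typing import Dict, Optional


def select_agent(
    capability_1p: float,
    capabilities_3p: Dict[str, float],
    threshold: float,
) -> Optional[str]:
    """Return the identifier of the agent selected to perform the task.

    The 3P branch follows the abstract: a 3P agent is selected only when
    the 1P agent falls short of the threshold, the 3P agent has the
    greatest capability level among the 3P agents, and that level meets
    the threshold.
    """
    if capability_1p >= threshold:
        return "1P"  # assumption: the abstract does not describe this case
    if not capabilities_3p:
        return None
    best = max(capabilities_3p, key=lambda name: capabilities_3p[name])
    if capabilities_3p[best] >= threshold:
        return best
    return None
```

With a threshold of 0.5, a 1P capability of 0.3, and 3P capabilities of 0.6 and 0.9, the agent scored 0.9 is selected.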
-
Publication No.: US10748541B2
Publication date: 2020-08-18
Application No.: US16666780
Filing date: 2019-10-29
Applicant: Google LLC
Inventors: Gaurav Bhaya , Robert Stets , Umesh Patil
Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.