-
1.
Publication No.: US20220366911A1
Publication Date: 2022-11-17
Application No.: US17337804
Filing Date: 2021-06-03
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Krishna Sapkota , Behshad Behzadi , Julia Proskurnia , Jacopo Sannazzaro Natta , Justin Lu , Magali Boizot-Roche , Márius Sajgalík , Nicolo D'Ercole , Zaheed Sabur , Luv Kothari
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
2.
Publication No.: US20220253277A1
Publication Date: 2022-08-11
Application No.: US17619414
Filing Date: 2019-12-13
Applicant: GOOGLE LLC
Inventor: Srikanth Pandiri , Luv Kothari , Behshad Behzadi , Zaheed Sabur , Domenico Carbotta , Akshay Kannan , Qi Wang , Gokay Baris Gultekin , Angana Ghosh , Xu Liu , Yang Lu , Steve Cheng
IPC: G06F3/16 , G06F3/0481 , G06F40/174 , G10L15/26 , G06F3/0484 , G06F3/04886 , G06F40/30 , G06F40/143 , G06F40/117 , G10L15/22
Abstract: Implementations set forth herein relate to an automated assistant that can selectively determine whether to incorporate a verbatim interpretation of portions of spoken utterances into an entry field and/or incorporate synonymous content into the entry field. For instance, a user can be accessing an interface that provides an entry field (e.g., address field) for receiving user input. In order to provide input for the entry field, the user can select the entry field and/or access a GUI keyboard to initialize an automated assistant for assisting with filling the entry field. Should the user provide a spoken utterance, the user can elect to provide a spoken utterance that embodies the intended input (e.g., an actual address) or a reference to the intended input (e.g., a name). In response to the spoken utterance, the automated assistant can fill the entry field with the intended input without necessitating further input from the user.
-
3.
Publication No.: US11922209B2
Publication Date: 2024-03-05
Application No.: US17898205
Filing Date: 2022-08-29
Applicant: GOOGLE LLC
Inventor: Jason Douglas , Carey Radebaugh , Ilya Firman , Ulas Kirazci , Luv Kothari
CPC classification number: G06F9/4843 , G10L15/1822 , G10L15/22 , G10L15/30 , G06F2209/482 , G10L2015/223 , G10L2015/228 , G10L15/34 , H04L12/281 , H04L12/2816
Abstract: Systems and methods of invoking functions of agents via digital assistant applications are provided. Each action-inventory can have an address template for an action by an agent. The address template can include a portion having an input variable used to execute the action. A data processing system can parse an input audio signal from a client device to identify a request and a parameter to be executed by the agent. The data processing system can select an action-inventory for the action corresponding to the request. The data processing system can generate, using the address template, an address. The address can include a substring having the parameter used to control execution of the action. The data processing system can direct an action data structure including the address to the agent to cause the agent to execute the action and to provide output for presentation.
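The address-template step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the inventory contents, the `{param}` placeholder syntax, the `agent.example` host, and the function names are all assumptions made for illustration. The sketch selects an action-inventory entry for a request, substitutes the parsed parameter into the template, and wraps the resulting address in an action data structure.

```python
from urllib.parse import quote

# Hypothetical action-inventory: request name -> address template.
# "{param}" marks the input variable portion of the template (assumed syntax).
ACTION_INVENTORY = {
    "order_ride": "https://agent.example/rides?destination={param}",
}


def build_address(request: str, parameter: str) -> str:
    """Select the template for the request and fill in the encoded parameter."""
    template = ACTION_INVENTORY[request]
    # Percent-encode the parameter so it is safe inside the address substring.
    return template.replace("{param}", quote(parameter, safe=""))


def build_action(request: str, parameter: str) -> dict:
    """Build an illustrative action data structure carrying the address."""
    return {"request": request, "address": build_address(request, parameter)}
```

Calling `build_action("order_ride", "San Jose Airport")` yields a dictionary whose `address` contains the encoded parameter as a substring, which is the part the agent would use to control execution of the action.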
-
4.
Publication No.: US20240022809A1
Publication Date: 2024-01-18
Application No.: US18446381
Filing Date: 2023-08-08
Applicant: GOOGLE LLC
Inventor: Felix Weissenberger , Balint Miklos , Victor Carbune , Matthew Sharifi , Domenico Carbotta , Ray Chen , Kevin Fu , Bogdan Prisacari , Fo Lee , Mucun Lu , Neha Garg , Jacopo Sannazzaro Natta , Barbara Poblocka , Jae Seo , Matthew Miao , Thomas Qian , Luv Kothari
IPC: H04N23/60 , G06N20/00 , G10L15/22 , G10L25/51 , H04N5/92 , H04N23/61 , H04N23/62 , H04N23/66 , H04N23/80
CPC classification number: H04N23/64 , G06N20/00 , G10L15/22 , G10L25/51 , H04N5/9201 , H04N23/61 , H04N23/62 , H04N23/66 , H04N23/80 , G10L15/1822
Abstract: Implementations set forth herein relate to an automated assistant that can control a camera according to one or more conditions specified by a user. A condition can be satisfied when, for example, the automated assistant detects a particular environment feature is apparent. In this way, the user can rely on the automated assistant to identify and capture certain moments without necessarily requiring the user to constantly monitor a viewing window of the camera. In some implementations, a condition for the automated assistant to capture media data can be based on application data and/or other contextual data that is associated with the automated assistant. For instance, a relationship between content in a camera viewing window and other content of an application interface can be a condition upon which the automated assistant captures certain media data using a camera.
-
5.
Publication No.: US20220413901A1
Publication Date: 2022-12-29
Application No.: US17898205
Filing Date: 2022-08-29
Applicant: GOOGLE LLC
Inventor: Jason Douglas , Carey Radebaugh , Ilya Firman , Ulas Kirazci , Luv Kothari
Abstract: Systems and methods of invoking functions of agents via digital assistant applications are provided. Each action-inventory can have an address template for an action by an agent. The address template can include a portion having an input variable used to execute the action. A data processing system can parse an input audio signal from a client device to identify a request and a parameter to be executed by the agent. The data processing system can select an action-inventory for the action corresponding to the request. The data processing system can generate, using the address template, an address. The address can include a substring having the parameter used to control execution of the action. The data processing system can direct an action data structure including the address to the agent to cause the agent to execute the action and to provide output for presentation.
-
6.
Publication No.: US20240380970A1
Publication Date: 2024-11-14
Application No.: US18784226
Filing Date: 2024-07-25
Applicant: GOOGLE LLC
Inventor: Felix Weissenberger , Balint Miklos , Victor Carbune , Matthew Sharifi , Domenico Carbotta , Ray Chen , Kevin Fu , Bogdan Prisacari , Fo Lee , Mucun Lu , Neha Garg , Jacopo Sannazzaro Natta , Barbara Poblocka , Jae Seo , Matthew Miao , Thomas Qian , Luv Kothari
IPC: H04N23/60 , G06N20/00 , G10L15/18 , G10L15/22 , G10L25/51 , H04N5/92 , H04N23/61 , H04N23/62 , H04N23/66 , H04N23/80
Abstract: Implementations set forth herein relate to an automated assistant that can control a camera according to one or more conditions specified by a user. A condition can be satisfied when, for example, the automated assistant detects a particular environment feature is apparent. In this way, the user can rely on the automated assistant to identify and capture certain moments without necessarily requiring the user to constantly monitor a viewing window of the camera. In some implementations, a condition for the automated assistant to capture media data can be based on application data and/or other contextual data that is associated with the automated assistant. For instance, a relationship between content in a camera viewing window and other content of an application interface can be a condition upon which the automated assistant captures certain media data using a camera.
-
7.
Publication No.: US12106758B2
Publication Date: 2024-10-01
Application No.: US17322765
Filing Date: 2021-05-17
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Alvin Abdagic , Behshad Behzadi , Jacopo Sannazzaro Natta , Julia Proskurnia , Krzysztof Andrzej Goj , Srikanth Pandiri , Viesturs Zarins , Nicolo D'Ercole , Zaheed Sabur , Luv Kothari
CPC classification number: G10L15/26 , G06F3/0488 , G06N20/00 , G10L15/18 , G10L15/22 , G10L2015/223
Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
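The decision described in this abstract, whether recognized text is dictated content or an assistant command, can be sketched as a small routing function. The abstract names the signal types (touch input directed to the transcription, the state of the transcription, audio-based characteristics) but not how they are combined, so the command prefixes, the one-second pause threshold, and the rule ordering below are purely illustrative assumptions.

```python
# Hypothetical command prefixes; the abstract does not list any.
COMMAND_PREFIXES = ("send", "delete", "undo")


def route_utterance(text: str,
                    touch_on_transcription: bool,
                    transcription_empty: bool,
                    pause_before_s: float) -> str:
    """Return "command" or "dictate" for recognized text (assumed rules)."""
    looks_like_command = text.lower().startswith(COMMAND_PREFIXES)
    if not looks_like_command:
        return "dictate"
    if transcription_empty:
        # State of the transcription: nothing exists yet for a command to act on.
        return "dictate"
    # Touch input, or a pause before speaking (an audio-based characteristic),
    # is treated here as evidence that the user is addressing the assistant.
    if touch_on_transcription or pause_before_s >= 1.0:
        return "command"
    return "dictate"
```

Under these assumed rules, "send it" spoken after a noticeable pause is routed to the assistant, while "send my regards to Anna" spoken mid-flow is incorporated into the transcription.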
-
8.
Publication No.: US12033637B2
Publication Date: 2024-07-09
Application No.: US17337804
Filing Date: 2021-06-03
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Krishna Sapkota , Behshad Behzadi , Julia Proskurnia , Jacopo Sannazzaro Natta , Justin Lu , Magali Boizot-Roche , Márius Šajgalík , Nicolo D'Ercole , Zaheed Sabur , Luv Kothari
CPC classification number: G10L15/26 , G10L15/22 , G10L2015/223
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
9.
Publication No.: US11948576B2
Publication Date: 2024-04-02
Application No.: US18136189
Filing Date: 2023-04-18
Applicant: GOOGLE LLC
Inventor: Daniel Cotting , Zaheed Sabur , Lan Huo , Bryan Christopher Horling , Behshad Behzadi , Lucas Mirelmann , Michael Golikov , Denis Burakov , Steve Cheng , Bohdan Vlasyuk , Sergey Nazarov , Mario Bertschler , Luv Kothari
CPC classification number: G10L15/22 , G06F3/165 , G06F3/167 , G10L15/1815 , G10L15/30 , H04L67/568 , G10L2015/223 , H04L67/01
Abstract: Implementations can reduce the time required to obtain responses from an automated assistant through proactive caching, locally at a client device, of proactive assistant cache entries—and through on-device utilization of the proactive assistant cache entries. Different proactive cache entries can be provided to different client devices, and various implementations relate to technique(s) utilized in determining which proactive cache entries to provide to which client devices. In some of those implementations, in determining which proactive cache entries to provide (proactively or in response to a request) to a given client device, a remote system selects, from a superset of candidate proactive cache entries, a subset of the cache entries for providing to the given client device.
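The subset-selection step in this abstract can be sketched as follows. This is a minimal illustration, not the method claimed in the patent: the `CacheEntry` fields, the per-device `score`, and the fixed `budget` are assumptions, since the abstract does not say how the remote system ranks candidates. The remote side picks the highest-scoring candidates for a device, and the device answers from its local cache when it can.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CacheEntry:
    query: str      # assistant request this entry can answer
    response: str   # pre-computed assistant response
    score: float    # hypothetical relevance of the entry to a given device


def select_entries(candidates: List[CacheEntry], budget: int) -> List[CacheEntry]:
    """Remote side: choose a subset of the candidate superset for one device."""
    ranked = sorted(candidates, key=lambda e: e.score, reverse=True)
    return ranked[:budget]


def build_local_cache(entries: List[CacheEntry]) -> Dict[str, str]:
    """Client side: store the provided entries for on-device lookup."""
    return {e.query: e.response for e in entries}


def answer_locally(cache: Dict[str, str], query: str) -> Optional[str]:
    """Return the cached response, or None if a server round trip is needed."""
    return cache.get(query)
```

A cache hit avoids the network round trip, which is the latency saving the abstract describes; a miss returns `None` so the caller can fall back to the remote system.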
-
10.
Publication No.: US20230252989A1
Publication Date: 2023-08-10
Application No.: US18136189
Filing Date: 2023-04-18
Applicant: GOOGLE LLC
Inventor: Daniel Cotting , Zaheed Sabur , Lan Huo , Bryan Christopher Horling , Behshad Behzadi , Lucas Mirelmann , Michael Golikov , Denis Burakov , Steve Cheng , Bohdan Vlasyuk , Sergey Nazarov , Mario Bertschler , Luv Kothari
IPC: G10L15/22 , G06F3/16 , G10L15/18 , G10L15/30 , H04L67/568
CPC classification number: G10L15/22 , G06F3/165 , G06F3/167 , G10L15/1815 , G10L15/30 , H04L67/568 , G10L2015/223 , H04L67/01
Abstract: Implementations can reduce the time required to obtain responses from an automated assistant through proactive caching, locally at a client device, of proactive assistant cache entries—and through on-device utilization of the proactive assistant cache entries. Different proactive cache entries can be provided to different client devices, and various implementations relate to technique(s) utilized in determining which proactive cache entries to provide to which client devices. In some of those implementations, in determining which proactive cache entries to provide (proactively or in response to a request) to a given client device, a remote system selects, from a superset of candidate proactive cache entries, a subset of the cache entries for providing to the given client device.