Adapting automated assistant based on detected mouth movement and/or gaze

    Publication Number: US11614794B2

    Publication Date: 2023-03-28

    Application Number: US16606030

    Filing Date: 2018-05-04

    Applicant: Google LLC

    Abstract: Adapting an automated assistant based on detecting: movement of a mouth of a user; and/or that a gaze of the user is directed at an assistant device that provides an automated assistant interface (graphical and/or audible) of the automated assistant. The detecting of the mouth movement and/or the directed gaze can be based on processing of vision data from one or more vision components associated with the assistant device, such as a camera incorporated in the assistant device. The mouth movement that is detected can be movement that is indicative of a user (to whom the mouth belongs) speaking.
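
    The abstract above describes gating assistant adaptation on vision-derived signals. A minimal Python sketch of that gating idea follows, assuming upstream models already emit per-frame probabilities for speech-indicative mouth movement and device-directed gaze; the VisionSignals container, the thresholds, and the consecutive-frame requirement are illustrative assumptions, not the claimed method.

```python
from dataclasses import dataclass

@dataclass
class VisionSignals:
    """Hypothetical per-frame outputs of vision models running on the device."""
    mouth_moving_prob: float    # likelihood the mouth movement indicates speech
    gaze_at_device_prob: float  # likelihood the gaze is directed at the device

def should_adapt_assistant(frames,
                           mouth_threshold=0.7,
                           gaze_threshold=0.8,
                           min_consecutive_frames=5):
    """True when enough consecutive frames show speech-like mouth movement
    and/or a gaze directed at the assistant device."""
    consecutive = 0
    for f in frames:
        hit = (f.mouth_moving_prob >= mouth_threshold
               or f.gaze_at_device_prob >= gaze_threshold)
        consecutive = consecutive + 1 if hit else 0
        if consecutive >= min_consecutive_frames:
            return True
    return False

# Example: adapt (e.g., start on-device speech processing) only when gated in.
frames = [VisionSignals(0.9, 0.85)] * 6 + [VisionSignals(0.1, 0.2)] * 4
if should_adapt_assistant(frames):
    print("Adapt assistant: begin audio capture / reduce wake-word reliance")
```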

    INVOKING AUTOMATED ASSISTANT FUNCTION(S) BASED ON DETECTED GESTURE AND GAZE

    Publication Number: US20230053873A1

    Publication Date: 2023-02-23

    Application Number: US17981181

    Filing Date: 2022-11-04

    Applicant: GOOGLE LLC

    Abstract: Invoking one or more previously dormant functions of an automated assistant in response to detecting, based on processing of vision data from one or more vision components: (1) a particular gesture (e.g., of one or more “invocation gestures”) of a user; and/or (2) that a gaze of the user is directed at an assistant device that provides an automated assistant interface (graphical and/or audible) of the automated assistant. For example, the previously dormant function(s) can be invoked in response to detecting the particular gesture, detecting that the gaze of the user is directed at the assistant device for at least a threshold amount of time, and optionally that the particular gesture and the directed gaze of the user co-occur or occur within a threshold temporal proximity of one another.
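
    One way to read the invocation condition above is as a gesture event and a sustained directed gaze that must land within a temporal window of each other. The sketch below encodes that reading; the dwell and co-occurrence thresholds, the InvocationMonitor class, and the event callbacks are assumptions made for illustration, not the claimed mechanism.

```python
GAZE_DWELL_SECONDS = 1.0     # assumed dwell needed to count as a directed gaze
CO_OCCURRENCE_WINDOW = 2.0   # assumed max gap between gesture and gaze (seconds)

class InvocationMonitor:
    """Toy monitor: invoke dormant assistant functions when an invocation
    gesture and a sustained device-directed gaze occur close together."""

    def __init__(self):
        self.last_gesture_ts = None
        self.gaze_start_ts = None

    def on_gesture(self, ts):
        """Called when the vision pipeline reports an invocation gesture."""
        self.last_gesture_ts = ts
        return self._should_invoke(ts)

    def on_gaze(self, directed, ts):
        """Called per frame with whether the gaze is directed at the device."""
        if directed:
            if self.gaze_start_ts is None:
                self.gaze_start_ts = ts
        else:
            self.gaze_start_ts = None
        return self._should_invoke(ts)

    def _should_invoke(self, now):
        gaze_ok = (self.gaze_start_ts is not None
                   and now - self.gaze_start_ts >= GAZE_DWELL_SECONDS)
        gesture_ok = (self.last_gesture_ts is not None
                      and now - self.last_gesture_ts <= CO_OCCURRENCE_WINDOW)
        return gaze_ok and gesture_ok

monitor = InvocationMonitor()
monitor.on_gaze(directed=True, ts=0.0)
monitor.on_gaze(directed=True, ts=1.2)    # gaze held past the dwell threshold
if monitor.on_gesture(ts=1.5):            # gesture lands inside the window
    print("Invoke previously dormant functions (e.g., full speech processing)")
```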

    AUTOMATED ASSISTANT INTERACTION PREDICTION USING FUSION OF VISUAL AND AUDIO INPUT

    Publication Number: US20220310094A1

    Publication Date: 2022-09-29

    Application Number: US17211409

    Filing Date: 2021-03-24

    Applicant: Google LLC

    Abstract: Techniques are described herein for detecting and/or enrolling (or commissioning) new “hot commands” that are usable to cause an automated assistant to perform responsive action(s) without having to be first explicitly invoked. In various implementations, an automated assistant may be transitioned from a limited listening state into a full speech recognition state in response to a trigger event. While in the full speech recognition state, the automated assistant may receive and perform speech recognition processing on a spoken command from a user to generate a textual command. The textual command may be determined to satisfy a frequency threshold in a corpus of textual commands. Consequently, data indicative of the textual command may be enrolled as a hot command. Subsequent utterance of another textual command that is semantically consistent with the textual command may trigger performance of a responsive action by the automated assistant, without requiring explicit invocation.
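
    The enrollment flow in the abstract (count repeated textual commands and promote those that clear a frequency threshold) can be sketched roughly as below; HotCommandRegistry, the threshold value, and the exact-match normalization are illustrative stand-ins for whatever corpus statistics and semantic matching the actual system uses.

```python
from collections import Counter

class HotCommandRegistry:
    """Enroll frequently repeated textual commands as 'hot commands' that no
    longer require explicit invocation (rough sketch of the abstract's flow)."""

    def __init__(self, frequency_threshold=3):
        self.frequency_threshold = frequency_threshold
        self.command_counts = Counter()
        self.hot_commands = set()

    def record_command(self, text):
        """Record a recognized textual command; enroll it once frequent enough."""
        normalized = text.strip().lower()
        self.command_counts[normalized] += 1
        if self.command_counts[normalized] >= self.frequency_threshold:
            self.hot_commands.add(normalized)

    def is_hot(self, text):
        # A real system would test semantic consistency (embeddings, NLU intents);
        # this sketch only checks exact matches after normalization.
        return text.strip().lower() in self.hot_commands

registry = HotCommandRegistry(frequency_threshold=3)
for _ in range(3):
    registry.record_command("Turn off the kitchen lights")
print(registry.is_hot("turn off the kitchen lights"))   # True once enrolled
```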

    METHODS AND SYSTEMS FOR ATTENDING TO A PRESENTING USER

    Publication Number: US20210334070A1

    Publication Date: 2021-10-28

    Application Number: US17370656

    Filing Date: 2021-07-08

    Applicant: GOOGLE LLC

    Abstract: The various implementations described herein include methods, devices, and systems for attending to a presenting user. In one aspect, a method is performed at an electronic device that includes an image sensor, microphones, a display, processor(s), and memory. The device (1) obtains audio signals by concurrently receiving audio data at each microphone; (2) determines based on the obtained audio signals that a person is speaking in a vicinity of the device; (3) obtains video data from the image sensor; (4) determines via the video data that the person is not within a field of view of the image sensor; (5) reorients the electronic device based on differences in the received audio data; (6) after reorienting the electronic device, obtains second video data from the image sensor and determines that the person is within the field of view; and (7) attends to the person by directing the display toward the person.
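
    Step (5), reorienting based on differences in the received audio data, is commonly done with a time-difference-of-arrival estimate. The sketch below shows that general technique for a two-microphone pair under a far-field assumption; the microphone spacing, function names, and the simple asin model are assumptions, not the patented procedure.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def estimate_bearing(delay_seconds, mic_spacing_m):
    """Far-field bearing (degrees off broadside) from the time-difference-of-
    arrival between two microphones: sin(theta) = delay * c / spacing."""
    ratio = max(-1.0, min(1.0, delay_seconds * SPEED_OF_SOUND / mic_spacing_m))
    return math.degrees(math.asin(ratio))

def reorient_if_needed(person_in_view, delay_seconds, mic_spacing_m=0.1):
    """Rotation (degrees) to apply so the camera/display faces the speaker;
    zero when the person is already within the field of view."""
    if person_in_view:
        return 0.0
    return estimate_bearing(delay_seconds, mic_spacing_m)

# ~0.18 ms of inter-mic delay across a 10 cm pair ≈ a speaker ~38° off axis.
print(round(reorient_if_needed(person_in_view=False, delay_seconds=0.00018), 1))
```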

    SELECTIVE DETECTION OF VISUAL CUES FOR AUTOMATED ASSISTANTS

    Publication Number: US20210232231A1

    Publication Date: 2021-07-29

    Application Number: US17229285

    Filing Date: 2021-04-13

    Applicant: Google LLC

    Abstract: Techniques are described herein for reducing false positives in vision sensor-equipped assistant devices. In various implementations, initial image frame(s) may be obtained from vision sensor(s) of an assistant device and analyzed to classify a particular region of the initial image frames as being likely to contain visual noise. Subsequent image frame(s) obtained from the vision sensor(s) may then be analyzed to detect actionable user-provided visual cue(s), in a manner that reduces or eliminates false positives. In some implementations, no analysis may be performed on the particular region of the subsequent image frame(s). Additionally or alternatively, in some implementations, a first candidate visual cue detected within the particular region may be weighted less heavily than a second candidate visual cue detected elsewhere in the one or more subsequent image frames. An automated assistant may then take responsive action based on the detected actionable visual cue(s).
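
    A rough sketch of the down-weighting behavior described above, assuming region classification has already tagged parts of the frame as likely visual noise; the CandidateCue fields, region identifiers, weights, and threshold are illustrative choices rather than the claimed implementation.

```python
from dataclasses import dataclass

@dataclass
class CandidateCue:
    label: str        # e.g., "wave" or "thumbs_up"
    confidence: float
    region: str       # hypothetical region id assigned by the noise classifier

def filter_cues(cues, noisy_regions, noise_weight=0.25, threshold=0.6):
    """Down-weight candidate visual cues detected inside regions previously
    classified as visual noise (e.g., a TV playing in the background)."""
    actionable = []
    for cue in cues:
        weight = noise_weight if cue.region in noisy_regions else 1.0
        if cue.confidence * weight >= threshold:
            actionable.append(cue)
    return actionable

cues = [CandidateCue("wave", 0.9, "tv_area"),
        CandidateCue("wave", 0.8, "foreground")]
print(filter_cues(cues, noisy_regions={"tv_area"}))  # only the foreground wave survives
```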

    Server-Provided Visual Output at a Voice Interface Device

    Publication Number: US20190385418A1

    Publication Date: 2019-12-19

    Application Number: US16460648

    Filing Date: 2019-07-02

    Applicant: GOOGLE LLC

    Abstract: A method at an electronic device with an array of indicator lights includes: obtaining first visual output instructions stored at the electronic device, where the first visual output instructions control operation of the array of indicator lights based on operating state of the electronic device; receiving a voice input; obtaining from a remote system a response to the voice input and second visual output instructions, where the second visual output instructions are provided by the remote system along with the response in accordance with a determination that the voice input satisfies one or more criteria; executing the response; and displaying visual output on the array of indicator lights in accordance with the second visual output instructions, where otherwise in absence of the second visual output instructions the electronic device displays visual output on the array of indicator lights in accordance with the first visual output instructions.
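
    The precedence rule in the abstract (server-supplied visual output instructions override the device's state-based defaults) might look roughly like this; the instruction format, state names, and LED patterns are invented for illustration.

```python
from typing import List, Optional

# Assumed local defaults stored on the device, keyed by operating state.
LOCAL_LED_DEFAULTS = {
    "idle": ["off", "off", "off", "off"],
    "listening": ["white", "white", "white", "white"],
    "responding": ["blue", "blue", "off", "off"],
}

def render_leds(operating_state: str,
                server_instructions: Optional[dict]) -> List[str]:
    """Server-provided visual output instructions win when present; otherwise
    fall back to the device's own state-based defaults."""
    if server_instructions and "colors" in server_instructions:
        return server_instructions["colors"]
    return LOCAL_LED_DEFAULTS.get(operating_state, ["off"] * 4)

# A server response might bundle a custom pattern with its answer.
print(render_leds("responding", {"colors": ["yellow", "yellow", "blue", "blue"]}))
print(render_leds("listening", None))   # no second instructions: local defaults
```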

    Transmitter and receiver tracking techniques for user devices in a MIMO network

    Publication Number: US09973330B1

    Publication Date: 2018-05-15

    Application Number: US15412234

    Filing Date: 2017-01-23

    Applicant: Google LLC

    Abstract: A technique includes (i) receiving a first pilot signal from a base station via a receiver of a client device, or (ii) transmitting a second pilot signal from the client device to the base station via a transmitter of the client device. First time differences and signal quality values for N samples of N respective packets in the first pilot signal are determined. Second time differences and signal quality values are received via the receiver. The second time differences and signal quality values are generated for M samples of M respective packets in the second pilot signal. An offset value is determined based on (i) the first time differences and signal quality values, or (ii) the second time differences and signal quality values. Activation or deactivation times of the receiver or the transmitter or transmission times of the transmitter are adjusted based on the offset value.
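
    The abstract does not spell out how the offset is derived from the per-packet time differences and signal quality values; one plausible (and purely assumed) reading is a quality-weighted average, sketched below together with the timing adjustment it would drive.

```python
def compute_offset(time_diffs, quality_values):
    """Quality-weighted average of per-packet time differences, used here as
    the timing offset (an assumed formula, not the claimed computation)."""
    if not time_diffs or len(time_diffs) != len(quality_values):
        raise ValueError("need matched, non-empty sample lists")
    total_weight = sum(quality_values)
    if total_weight == 0:
        return 0.0
    return sum(t * q for t, q in zip(time_diffs, quality_values)) / total_weight

def adjust_activation_time(scheduled_time_us, offset_us):
    """Shift a receiver/transmitter activation (or transmission) time by the offset."""
    return scheduled_time_us + offset_us

offset = compute_offset([1.2, 0.8, 1.0], [0.9, 0.4, 0.7])  # e.g., microseconds
print(adjust_activation_time(500.0, offset))
```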

    Generating and/or adapting automated assistant content according to a distance between user(s) and an automated assistant interface

    Publication Number: US12277259B2

    Publication Date: 2025-04-15

    Application Number: US18375876

    Filing Date: 2023-10-02

    Applicant: GOOGLE LLC

    Abstract: Methods, apparatus, systems, and computer-readable media are provided for generating and/or adapting automated assistant content according to a distance of a user relative to an automated assistant interface that renders the automated assistant content. For instance, the automated assistant can provide data for a client device to render. The client device can request additional data when the user relocates closer to, or further from, the client device. In some implementations, a request for additional data can identify a distance between the user and the client device. In this way, the additional data can be generated or selected according to the distance in the request. Other implementations can allow an automated assistant to determine an active user from a group of users in an environment, and determine a distance between the active user and the client device in order that any rendered content can be tailored for the active user.
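
    A minimal sketch of distance-conditioned rendering, assuming the client reports an estimated user distance with its request for additional data; the distance bands, modality choices, and request shape are illustrative assumptions rather than the claimed behavior.

```python
def select_content_for_distance(distance_m):
    """Choose how assistant content is rendered given the estimated distance
    between the active user and the client device (bands are illustrative)."""
    if distance_m < 1.0:
        return {"modality": "graphical", "detail": "full", "font_scale": 1.0}
    if distance_m < 3.0:
        return {"modality": "graphical", "detail": "summary", "font_scale": 1.6}
    return {"modality": "audible", "detail": "summary", "font_scale": None}

# A client request for additional data might carry the measured distance.
request = {"query": "weather tomorrow", "user_distance_m": 2.4}
print(select_content_for_distance(request["user_distance_m"]))
```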

    Automated assistant interaction prediction using fusion of visual and audio input

    Publication Number: US11842737B2

    Publication Date: 2023-12-12

    Application Number: US17211409

    Filing Date: 2021-03-24

    Applicant: Google LLC

    Abstract: Techniques are described herein for detecting and/or enrolling (or commissioning) new “hot commands” that are usable to cause an automated assistant to perform responsive action(s) without having to be first explicitly invoked. In various implementations, an automated assistant may be transitioned from a limited listening state into a full speech recognition state in response to a trigger event. While in the full speech recognition state, the automated assistant may receive and perform speech recognition processing on a spoken command from a user to generate a textual command. The textual command may be determined to satisfy a frequency threshold in a corpus of textual commands. Consequently, data indicative of the textual command may be enrolled as a hot command. Subsequent utterance of another textual command that is semantically consistent with the textual command may trigger performance of a responsive action by the automated assistant, without requiring explicit invocation.
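
    Complementing the enrollment sketch shown under the US20220310094A1 entry above, the snippet below illustrates the last step of this abstract: triggering a responsive action when a later utterance is semantically consistent with an enrolled hot command. The token-overlap score is a crude, assumed stand-in for whatever semantic matching the actual system performs.

```python
def token_overlap(a, b):
    """Jaccard overlap of word sets — a crude stand-in for a real semantic
    similarity model (embeddings, NLU intent matching, etc.)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def matching_hot_command(utterance, hot_commands, min_similarity=0.6):
    """Return the enrolled hot command this utterance is treated as consistent
    with, or None; a match triggers the responsive action without invocation."""
    best = max(hot_commands, key=lambda c: token_overlap(utterance, c), default=None)
    if best is not None and token_overlap(utterance, best) >= min_similarity:
        return best
    return None

print(matching_hot_command("turn the lights off",
                           {"turn off the lights", "play some music"}))
```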
