Simultaneous acoustic event detection across multiple assistant devices

    Publication No.: US12217736B2

    Publication date: 2025-02-04

    Application No.: US18367859

    Filing date: 2023-09-13

    Applicant: GOOGLE LLC

    Abstract: Implementations can detect respective audio data that captures an acoustic event at multiple assistant devices in an ecosystem that includes a plurality of assistant devices, process the respective audio data locally at each of the multiple assistant devices to generate respective measures that are associated with the acoustic event using respective event detection models, process the respective measures to determine whether the detected acoustic event is an actual acoustic event, and cause an action associated with the actual acoustic event to be performed in response to determining that the detected acoustic event is the actual acoustic event. In some implementations, the multiple assistant devices that detected the respective audio data are anticipated to detect the respective audio data that captures the actual acoustic event based on a plurality of historical acoustic events being detected at each of the multiple assistant devices.
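The combination step in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how the per-device measures are combined, so the averaging rule, the `threshold` value, and the names `is_actual_event` and `expected_devices` are all assumptions made for the example.

```python
from statistics import mean


def is_actual_event(
    measures: dict[str, float],
    expected_devices: set[str],
    threshold: float = 0.7,  # hypothetical cut-off, not from the source
) -> bool:
    """Decide whether a detected acoustic event is treated as actual.

    `measures` maps a device id to the probability produced locally by that
    device's event detection model. Only devices that are anticipated to hear
    the event (for example, based on historical co-detections) are counted.
    """
    relevant = [m for device, m in measures.items() if device in expected_devices]
    if not relevant:
        return False
    # Assumed combination rule: a simple average compared against a threshold.
    return mean(relevant) >= threshold


expected = {"kitchen", "hall"}
print(is_actual_event({"kitchen": 0.9, "hall": 0.8}, expected))
print(is_actual_event({"kitchen": 0.9, "hall": 0.2}, expected))
```

Averaging is only one possible rule; a weighted vote or a learned combiner would fit the same description.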

    Detecting and suppressing voice queries

    Publication No.: US12205588B2

    Publication date: 2025-01-21

    Application No.: US17749892

    Filing date: 2022-05-20

    Applicant: GOOGLE LLC

    Abstract: A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.
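The flow described here (a volume check, then grouping requests that share a query, then fingerprinting) might look like the sketch below. It is an illustration under stated assumptions: the abstract fingerprints the voice query itself, whereas this example hashes a normalized transcript as a stand-in, and the function names and the `volume_threshold` and `min_common` parameters are hypothetical.

```python
import hashlib
from collections import Counter


def fingerprint(query_text: str) -> str:
    """Stand-in fingerprint: SHA-256 of the lower-cased, whitespace-normalized text."""
    normalized = " ".join(query_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def suspicious_fingerprints(
    queries: list[str],
    volume_threshold: int,  # hypothetical: requests per interval that triggers analysis
    min_common: int,        # hypothetical: how many requests must share a query
) -> set[str]:
    """Return fingerprints of queries that recur during a high-volume interval."""
    if len(queries) < volume_threshold:
        return set()  # traffic is normal, so no analysis is triggered
    counts = Counter(fingerprint(q) for q in queries)
    return {fp for fp, n in counts.items() if n >= min_common}


interval = ["OK buy  soap"] * 5 + ["what time is it"]
blocked = suspicious_fingerprints(interval, volume_threshold=5, min_common=3)
print(fingerprint("ok buy soap") in blocked)
```

A later request whose fingerprint is in `blocked` could then be suppressed instead of processed.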

    Generating and/or utilizing voice authentication biasing parameters for assistant devices

    Publication No.: US12183348B2

    Publication date: 2024-12-31

    Application No.: US18382735

    Filing date: 2023-10-23

    Applicant: GOOGLE LLC

    Abstract: Implementations are directed to biasing speaker authentication on a per-user basis and on a device-by-device basis and/or contextual feature(s) basis. In some of those implementations, in performing speaker authentication based on a spoken utterance, different biasing parameters are determined for each of multiple different registered users of an assistant device at which the spoken utterance was detected. In those implementations, each of the biasing parameters can be used to make it more likely or less likely (depending on the biasing parameter) that a corresponding registered user will be verified using the speaker authentication. Through utilization of biasing parameter(s) in performing speaker authentication, accuracy and/or robustness of speaker authentication can be increased.

    System and method for identifying places using contextual information

    Publication No.: US12164584B2

    Publication date: 2024-12-10

    Application No.: US17013954

    Filing date: 2020-09-08

    Applicant: Google LLC

    Abstract: The present disclosure provides a computing device and method for providing personal specific information based on semantic queries. The semantic queries may be input in a natural language form, and may include user specific context, such as by referring to prior or future events related to a place the user is searching for. With the user's authorization, data associated with prior or planned activities of the user may be accessed and information from the accessed data may be identified, wherein the information is correlated with the user specific context. One or more query results are determined based on the identified information and provided for output to the user.

    Inferring semantic label(s) for assistant device(s) based on device-specific signal(s)

    Publication No.: US12164572B2

    Publication date: 2024-12-10

    Application No.: US18531015

    Filing date: 2023-12-06

    Applicant: GOOGLE LLC

    Abstract: Implementations can identify a given assistant device from among a plurality of assistant devices in an ecosystem, obtain device-specific signal(s) that are generated by the given assistant device, process the device-specific signal(s) to generate candidate semantic label(s) for the given assistant device, select a given semantic label for the given assistant device from among the candidate semantic label(s), and assign, in a device topology representation of the ecosystem, the given semantic label to the given assistant device. Implementations can optionally receive a spoken utterance that includes a query or command at the assistant device(s), determine that a semantic property of the query or command matches the given semantic label assigned to the given assistant device, and cause the given assistant device to satisfy the query or command.

    Smart suggestions for image zoom regions

    Publication No.: US12164556B2

    Publication date: 2024-12-10

    Application No.: US17336000

    Filing date: 2021-06-01

    Applicant: GOOGLE LLC

    Abstract: Techniques are described herein for providing smart suggestions for image zoom regions. A method includes: receiving a search query; performing a search using the search query to identify search results that include image search results including a plurality of images that are responsive to the search query; for a given image of the plurality of images included in the image search results, determining at least one zoom region in the given image; and providing the search results including the image search results, including providing the given image and an indication of the at least one zoom region in the given image.

    Voice-based scene selection for video content on a computing device

    Publication No.: US12149773B2

    Publication date: 2024-11-19

    Application No.: US17902601

    Filing date: 2022-09-02

    Applicant: GOOGLE LLC

    Abstract: Voice-based interaction with video content being presented by a media player application is enhanced through the use of an automated assistant capable of identifying when a spoken utterance by a user is a request to play back a specific scene in the video content. A query identified in a spoken utterance may be used to access stored scene metadata associated with video content being presented in the vicinity of the user to identify one or more locations in the video content that correspond to the query, such that a media control command may be issued to the media player application to cause the media player application to seek to a particular location in the video content that satisfies the query.

    Condition-Aware Generation of Panoramic Imagery

    Publication No.: US20240378700A1

    Publication date: 2024-11-14

    Application No.: US18781937

    Filing date: 2024-07-23

    Applicant: GOOGLE LLC

    Abstract: Systems and methods are provided for generating panoramic imagery. An example method may be performed by one or more processors and includes obtaining first panoramic imagery depicting a geographic area. The method also includes obtaining an image depicting one or more physical objects absent from the first panoramic imagery. Further, the method includes transforming the first panoramic imagery into second panoramic imagery depicting the one or more physical objects and including at least a portion of the first panoramic imagery.

    Training keyword spotters

    Publication No.: US12136412B2

    Publication date: 2024-11-05

    Application No.: US17662021

    Filing date: 2022-05-04

    Applicant: Google LLC

    Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.

    Warm word arbitration between automated assistant devices

    Publication No.: US12106755B2

    Publication date: 2024-10-01

    Application No.: US17573418

    Filing date: 2022-01-11

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/22 G10L15/30 G10L15/32

    Abstract: Techniques are described herein for warm word arbitration between automated assistant devices. A method includes: determining that warm word arbitration is to be initiated between a first assistant device and one or more additional assistant devices, including a second assistant device; broadcasting, by the first assistant device, to the one or more additional assistant devices, an active set of warm words for the first assistant device; for each of the one or more additional assistant devices, receiving, from the additional assistant device, an active set of warm words for the additional assistant device; identifying a matching warm word included in the active set of warm words for the first assistant device and included in the active set of warm words for the second assistant device; and enabling or disabling detection of the matching warm word by the first assistant device, in response to identifying the matching warm word.
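The matching step in this abstract is essentially a set intersection over the broadcast warm-word sets, as in the sketch below. The abstract does not state how the first device chooses between enabling and disabling a matching warm word, so the score comparison here (for example, a proximity or audio-quality score) is an assumption, as are the names `arbitrate`, `first_score`, and `peer_scores`.

```python
def arbitrate(
    first_words: set[str],
    peers: dict[str, set[str]],
    first_score: float,
    peer_scores: dict[str, float],
) -> dict[str, bool]:
    """Return {warm_word: enabled_on_first_device} for every matching warm word.

    `first_words` is the active warm-word set of the first device, and `peers`
    maps each additional device id to the active set it sent back.
    """
    decisions: dict[str, bool] = {}
    for peer_id, peer_words in peers.items():
        for word in first_words & peer_words:  # warm words active on both devices
            # Assumed rule: the first device keeps detection only if its score
            # is at least as high as that of every peer sharing the word.
            wins = first_score >= peer_scores[peer_id]
            decisions[word] = decisions.get(word, True) and wins
    return decisions


print(arbitrate({"stop", "pause"}, {"speaker": {"stop", "next"}}, 0.4, {"speaker": 0.9}))
```

Warm words that appear on only one device never enter `decisions`, so their detection state is left unchanged.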
