Patent search ap:("Google LLC") AND inv:"Aleksandar Kracun" Page 1

1.

Granted invention patent
Hotword-aware speech synthesis (Active)

Publication number: US12067997B2

Publication date: 2024-08-20

Application number: US17444557

Application date: 2021-08-05

Applicant: Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G10L13/027 , G06F18/21 , G10L13/08 , G10L17/24 , G10L21/00 , G10L25/87

CPC classification number: G10L21/00 , G06F18/217 , G10L13/027 , G10L13/086 , G10L17/24 , G10L25/87

Abstract: A method includes receiving text input data for conversion into synthesized speech and determining, using a hotword-aware model trained to detect a presence of a hotword assigned to a user device, whether a pronunciation of the text input data includes the hotword. The hotword is configured to initiate a wake-up process on the user device for processing the hotword and/or one or more other terms following the hotword in the audio input data. When the pronunciation of the text input data includes the hotword, the method also includes generating an audio output signal from the text input data and providing the audio output signal to an audio output device to output the audio output signal. The audio output signal, when captured by an audio capture device of the user device, is configured to prevent initiation of the wake-up process on the user device.
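
The flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the hotword list, the substring check standing in for the trained hotword-aware model, and the `suppress_wake_up` flag standing in for whatever property of the audio signal prevents wake-up are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical hotword phrases; a real system would use a trained model.
HOTWORDS = ("ok google", "hey google")


@dataclass
class AudioOutput:
    text: str               # stand-in for the synthesized waveform
    suppress_wake_up: bool  # stand-in for a marker carried by the audio signal


def contains_hotword(text: str) -> bool:
    """Crude substitute for the hotword-aware model: normalized substring match."""
    normalized = " ".join(text.lower().split())
    return any(hotword in normalized for hotword in HOTWORDS)


def synthesize(text: str) -> AudioOutput:
    """Mark the output when the pronunciation of the text would include a hotword."""
    return AudioOutput(text=text, suppress_wake_up=contains_hotword(text))


def should_wake(captured: AudioOutput) -> bool:
    """Device-side check: wake only for hotwords that are not marked as synthesized."""
    return contains_hotword(captured.text) and not captured.suppress_wake_up
```

In this sketch, speech synthesized from text containing a hotword is captured by the device but does not start the wake-up process, while an unmarked utterance of the same phrase does.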

2.

Invention publication
ADAPTING HOTWORD RECOGNITION BASED ON PERSONALIZED NEGATIVES (Pending, published)

Publication number: US20230386468A1

Publication date: 2023-11-30

Application number: US18446420

Application date: 2023-08-08

Applicant: Google LLC

Inventor: Aleksandar Kracun , Matthew Sharifi

IPC: G10L15/22 , G10L15/197 , G10L17/06 , G10L17/24 , G10L15/30

CPC classification number: G10L15/22 , G10L15/197 , G10L17/06 , G10L17/24 , G10L15/30 , G10L2015/088

Abstract: A method for adapting hotword recognition includes receiving audio data characterizing a hotword event detected by a first stage hotword detector in streaming audio captured by a user device. The method also includes processing, using a second stage hotword detector, the audio data to determine whether a hotword is detected by the second stage hotword detector in a first segment of the audio data. When the hotword is not detected by the second stage hotword detector, the method includes classifying the first segment of the audio data as containing a negative hotword that caused a false detection of the hotword event in the streaming audio by the first stage hotword detector. Based on the first segment of the audio data classified as containing the negative hotword, the method includes updating the first stage hotword detector to prevent triggering the hotword event in subsequent audio data that contains the negative hotword.
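
The two-stage feedback loop described here can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the method claimed in the patent: audio segments are represented as plain strings, and the class and function names are hypothetical.

```python
from typing import Callable, Iterable, Set


class FirstStageDetector:
    """Cheap, permissive detector that can learn personalized negatives."""

    def __init__(self, trigger_phrases: Iterable[str]) -> None:
        self.trigger_phrases: Set[str] = set(trigger_phrases)
        self.negatives: Set[str] = set()

    def detect(self, segment: str) -> bool:
        # Trigger only on known phrases that were not learned as negatives.
        return segment in self.trigger_phrases and segment not in self.negatives

    def add_negative(self, segment: str) -> None:
        self.negatives.add(segment)


def strict_second_stage(segment: str) -> bool:
    """Stand-in for the more accurate second-stage detector."""
    return segment == "ok google"


def handle_hotword_event(
    segment: str,
    first_stage: FirstStageDetector,
    second_stage: Callable[[str], bool],
) -> bool:
    """Return True if the hotword is confirmed; otherwise learn the negative."""
    if not first_stage.detect(segment):
        return False
    if second_stage(segment):
        return True
    # The first stage fired but the second stage disagreed: a false detection.
    first_stage.add_negative(segment)
    return False
```

After one rejected event, the first stage in this sketch no longer triggers on the same segment, which mirrors the update step in the abstract.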

3.

Invention application
Navigation with Dynamic Regrouping Points (Active)

Publication number: US20220120573A1

Publication date: 2022-04-21

Application number: US17422664

Application date: 2019-06-25

Applicant: Matthew SHARIFI , Aleksandar KRACUN , Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G01C21/34 , G01C21/36 , H04W4/029 , H04W4/02 , H04W4/024

Abstract: The present disclosure is directed to a system and method for providing dynamic grouping and regrouping for users in a joint positional tracking session. The method includes receiving positional data associated with a first user and at least one other user in the plurality of users in the joint positional tracking session. The method includes determining that a separation parameter associated with the first user has exceeded a threshold separation value, the separation parameter associated with the first user representing a distance between the first user and one other user in the plurality of users. The method includes automatically generating navigational data for reducing the separation parameter between the first user and one other user in the joint positional tracking session to below the threshold separation value. The method includes transmitting the navigational data to at least the first user in the joint positional tracking session.
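
A minimal sketch of the separation check follows. It assumes, for illustration only, that the separation parameter is the great-circle distance to the nearest other user and that the "navigational data" is simply the identity of the user to head toward; the function names are hypothetical.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def regroup_targets(
    positions: Dict[str, Tuple[float, float]], threshold_m: float
) -> Dict[str, str]:
    """Map each user who has drifted too far to the nearest other user."""
    targets: Dict[str, str] = {}
    for user, pos in positions.items():
        others = {u: haversine_m(pos, p) for u, p in positions.items() if u != user}
        if not others:
            continue  # a single-user session has nothing to regroup with
        nearest = min(others, key=others.get)
        if others[nearest] > threshold_m:
            targets[user] = nearest
    return targets
```

A routing service would then turn each `(user, target)` pair into directions; that step is outside the scope of this sketch.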

4.

Invention application
Hotword-Aware Speech Synthesis (Active)

Publication number: US20210366459A1

Publication date: 2021-11-25

Application number: US17444557

Application date: 2021-08-05

Applicant: Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G10L13/027 , G06K9/62 , G10L13/08 , G10L17/24 , G10L25/87

Abstract: A method includes receiving text input data for conversion into synthesized speech and determining, using a hotword-aware model trained to detect a presence of a hotword assigned to a user device, whether a pronunciation of the text input data includes the hotword. The hotword is configured to initiate a wake-up process on the user device for processing the hotword and/or one or more other terms following the hotword in the audio input data. When the pronunciation of the text input data includes the hotword, the method also includes generating an audio output signal from the text input data and providing the audio output signal to an audio output device to output the audio output signal. The audio output signal, when captured by an audio capture device of the user device, is configured to prevent initiation of the wake-up process on the user device.

5.

Granted invention patent
Detecting and suppressing voice queries (Active)

Publication number: US10170112B2

Publication date: 2019-01-01

Application number: US15593278

Application date: 2017-05-11

Applicant: Google LLC

Inventor: Alexander H. Gruenstein , Aleksandar Kracun , Matthew Sharifi

IPC: G10L15/08 , G06F17/30 , G10L15/22 , G10L17/00 , G10L15/26 , H04L29/06 , G10L15/06

Abstract: A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.
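
The volume-triggered analysis can be illustrated with the sketch below. It makes strong simplifying assumptions: queries are text rather than audio, the "fingerprint" is a SHA-256 hash of the normalized text rather than an acoustic model, and the thresholds and function names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Iterable, List, Set


def fingerprint(query: str) -> str:
    """Hash of the normalized query text; a stand-in for an audio fingerprint."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_common_fingerprints(
    window: List[str], volume_threshold: int, min_repeats: int
) -> Set[str]:
    """Analyze one time window only if its request volume meets the threshold."""
    if len(window) < volume_threshold:
        return set()  # volume criterion not satisfied, no analysis triggered
    counts = Counter(fingerprint(q) for q in window)
    return {fp for fp, n in counts.items() if n >= min_repeats}


def is_illegitimate(query: str, blocked: Iterable[str]) -> bool:
    """Check a later request against previously generated fingerprints."""
    return fingerprint(query) in set(blocked)
```

In this sketch, a burst of identical queries (for example, triggered by a broadcast) produces one blocked fingerprint, and later requests carrying the same query are flagged while unrelated queries pass.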

6.

Granted invention patent
Adapting automated speech recognition parameters based on hotword properties (Active)

Publication number: US11620990B2

Publication date: 2023-04-04

Application number: US17120033

Application date: 2020-12-11

Applicant: Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G10L15/16 , G10L25/90 , G10L15/22 , G10L15/28 , G10L15/08 , G10L25/78

Abstract: A method for optimizing speech recognition includes receiving a first acoustic segment characterizing a hotword detected by a hotword detector in streaming audio captured by a user device, extracting one or more hotword attributes from the first acoustic segment, and adjusting, based on the one or more hotword attributes extracted from the first acoustic segment, one or more speech recognition parameters of an automated speech recognition (ASR) model. After adjusting the speech recognition parameters of the ASR model, the method also includes processing, using the ASR model, a second acoustic segment to generate a speech recognition result. The second acoustic segment characterizes a spoken query/command that follows the first acoustic segment in the streaming audio captured by the user device.

7.

Granted invention patent
Contextual hotwords (Active)

Publication number: US11430442B2

Publication date: 2022-08-30

Application number: US17068681

Application date: 2020-10-12

Applicant: Google LLC

Inventor: Christopher Thaddeus Hughes , Ignacio Lopez Moreno , Aleksandar Kracun

IPC: G10L15/22 , G10L15/02 , G10L15/08 , G10L15/20

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextual hotwords are disclosed. In one aspect, a method, during a boot process of a computing device, includes the actions of determining, by a computing device, a context associated with the computing device. The actions further include, based on the context associated with the computing device, determining a hotword. The actions further include, after determining the hotword, receiving audio data that corresponds to an utterance. The actions further include determining that the audio data includes the hotword. The actions further include, in response to determining that the audio data includes the hotword, performing an operation associated with the hotword.

8.

Invention application
Voice Query QoS based on Client-Computed Content Metadata (Active)

Publication number: US20220262367A1

Publication date: 2022-08-18

Application number: US17661625

Application date: 2022-05-02

Applicant: Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G10L15/30 , G06F16/63 , G10L15/08 , G10L15/22 , H04L67/568

Abstract: A method includes receiving an automated speech recognition (ASR) request from a user device that includes a speech input captured by the user device and content metadata associated with the speech input. The content metadata is generated by the user device. The method also includes determining a priority score for the ASR request based on the content metadata associated with the speech input and caching the ASR request in a pre-processing backlog of pending ASR requests each having a corresponding priority score. The pending ASR requests in the pre-processing backlog are ranked in order of the priority scores. The method also includes providing, from the pre-processing backlog, one or more of the pending ASR requests to a backend-side ASR module, wherein pending ASR requests associated with higher priority scores are processed before pending ASR requests associated with lower priority scores.
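
The priority-ranked backlog in this abstract maps naturally onto a heap. The sketch below is only an illustration: the metadata fields (`hotword_confidence`, `user_logged_in`) and the scoring rule are hypothetical and are not taken from the patent.

```python
import heapq
import itertools
from typing import Any, Dict, List, Tuple


def priority_score(metadata: Dict[str, Any]) -> float:
    """Hypothetical scoring rule based on client-computed content metadata."""
    score = float(metadata.get("hotword_confidence", 0.0))
    if metadata.get("user_logged_in"):
        score += 0.5
    return score


class AsrBacklog:
    """Pre-processing backlog that releases higher-priority requests first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order

    def add(self, request_id: str, metadata: Dict[str, Any]) -> None:
        # heapq is a min-heap, so the score is negated to pop the highest first.
        entry = (-priority_score(metadata), next(self._counter), request_id)
        heapq.heappush(self._heap, entry)

    def pop_batch(self, n: int) -> List[str]:
        """Hand up to n pending requests to the backend ASR module."""
        batch: List[str] = []
        while self._heap and len(batch) < n:
            batch.append(heapq.heappop(self._heap)[2])
        return batch
```

Negating the score and adding an arrival counter keeps equal-priority requests in first-in, first-out order, which is one reasonable tie-breaking choice among several.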

9.

Invention application
END-TO-END SPEECH CONVERSION (Active)

Publication number: US20220122579A1

Publication date: 2022-04-21

Application number: US17310732

Application date: 2019-11-26

Applicant: Google LLC

Inventor: Fadi Biadsy , Ron J. Weiss , Aleksandar Kracun , Pedro J. Moreno Mengibar

IPC: G10L13/02 , G10L21/10 , G10L25/30 , G06N3/08 , H04L51/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for end-to-end speech conversion are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance of one or more first terms spoken by a user. The actions further include providing the first audio data as an input to a model that is configured to receive first given audio data in a first voice and output second given audio data in a synthesized voice without performing speech recognition on the first given audio data. The actions further include receiving second audio data of a second utterance of the one or more first terms spoken in the synthesized voice. The actions further include providing, for output, the second audio data of the second utterance of the one or more first terms spoken in the synthesized voice.

10.

Invention application
Voice Query QoS Based On Client-Computed Content Metadata (Active)

Publication number: US20220093104A1

Publication date: 2022-03-24

Application number: US17310175

Application date: 2019-02-06

Applicant: Google LLC

Inventor: Matthew Sharifi , Aleksandar Kracun

IPC: G10L15/30 , G10L15/22 , G10L15/08 , H04L67/568 , G06F16/63

Abstract: A method includes receiving an automated speech recognition (ASR) request from a user device that includes a speech input captured by the user device and content metadata associated with the speech input. The content metadata is generated by the user device. The method also includes determining a priority score for the ASR request based on the content metadata associated with the speech input and caching the ASR request in a pre-processing backlog of pending ASR requests each having a corresponding priority score. The pending ASR requests in the pre-processing backlog are ranked in order of the priority scores. The method also includes providing, from the pre-processing backlog, one or more of the pending ASR requests to a backend-side ASR module, wherein pending ASR requests associated with higher priority scores are processed before pending ASR requests associated with lower priority scores.
