-
Publication No.: US20230101572A1
Publication date: 2023-03-30
Application No.: US18074691
Application date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Niranjan Subrahmanya, Aishanee Shah
IPC: G10L15/197, G10L15/06, G10L15/22
Abstract: Techniques are described herein for improving performance of machine learning model(s) and thresholds utilized in determining whether automated assistant function(s) are to be initiated. A method includes: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a secondary threshold that is less indicative of the one or more hotwords being present in the audio data than is a primary threshold; in response to determining that the predicted output satisfies the secondary threshold, prompting the user to indicate whether or not the spoken utterance includes a hotword; receiving, from the user, a response to the prompting; and adjusting the primary threshold based on the response.
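The two-threshold flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the numeric threshold values, the fixed adjustment step, the direction of adjustment, and the names `HotwordThresholds`, `handle_prediction`, and `ask_user` are all assumptions made for illustration. The abstract states only that the primary threshold is adjusted based on the user's response.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HotwordThresholds:
    # All values are illustrative assumptions, not taken from the patent.
    primary: float = 0.8    # at or above this, the hotword is accepted outright
    secondary: float = 0.6  # lower bar: at or above this, the user is prompted
    step: float = 0.05      # hypothetical size of each threshold adjustment


def handle_prediction(prob: float,
                      thresholds: HotwordThresholds,
                      ask_user: Callable[[], bool]) -> bool:
    """Return True if assistant functions should be initiated.

    `ask_user` is a hypothetical callback that prompts the user and returns
    True when the user confirms the utterance contained a hotword.
    """
    if prob >= thresholds.primary:
        return True
    if prob >= thresholds.secondary:
        said_hotword = ask_user()
        if said_hotword:
            # Near miss on a real hotword: relax the primary threshold,
            # but never below the secondary threshold.
            thresholds.primary = max(thresholds.secondary,
                                     thresholds.primary - thresholds.step)
        else:
            # Near trigger on a non-hotword: tighten the primary threshold.
            thresholds.primary = min(1.0, thresholds.primary + thresholds.step)
        return said_hotword
    return False
```

In this sketch a confirmed near miss lowers the primary threshold and a rejected one raises it, which is one plausible reading of "adjusting the primary threshold based on the response".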
-
Publication No.: US20240021207A1
Publication date: 2024-01-18
Application No.: US18373244
Application date: 2023-09-26
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Matthew Sharifi
Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
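The decision logic in this abstract can be sketched as follows. This is a hedged illustration only: the hotword model, speech recognizer, and watermark detector are passed in as hypothetical callables (`hotword_prob`, `transcribe`, `has_watermark`), the threshold value is assumed, and the "speech transcription feature" is simplified to a normalized transcript string.

```python
from typing import Callable, Set


def should_suppress_query(audio: bytes,
                          hotword_prob: Callable[[bytes], float],
                          transcribe: Callable[[bytes], str],
                          has_watermark: Callable[[bytes], bool],
                          stored_features: Set[str],
                          threshold: float = 0.8) -> bool:
    """Return True when the query in `audio` should be suppressed.

    Both factors must hold: a watermark is detected AND the transcription
    feature matches one of the stored features.
    """
    # Step 1: the hotword must be detected, otherwise there is no query to suppress.
    if hotword_prob(audio) < threshold:
        return False
    # Step 2: derive a transcription feature (here: a lowercased, trimmed transcript).
    feature = transcribe(audio).strip().lower()
    # Step 3: a watermark alone is not enough; the transcript must also match.
    if not has_watermark(audio):
        return False
    return feature in stored_features
```

Requiring both the watermark and a transcript match is what makes the check multi-factor: a detector false positive on the watermark does not by itself suppress a genuine user query.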
-
Publication No.: US20210364307A1
Publication date: 2021-11-25
Application No.: US17252260
Application date: 2019-12-17
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Matthew Sharifi
Abstract: A dataset descriptive of multiple locations and one or more maneuvers attempted by vehicles at these locations is received. A machine-learning model is trained using this dataset, so that the machine-learning model is configured to generate metrics of difficulty for the set of maneuvers. A query data including indications of a location and a maneuver to be executed by a vehicle at the location is received. The query data is applied to the machine-learning model to generate a metric of difficulty for the maneuver, and a navigation instruction for the maneuver is provided via a user interface, such that at least one parameter of the navigation instruction is selected based on the generated metric of difficulty.
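As a rough illustration of the train-then-query pattern in this abstract, the sketch below replaces the machine-learning model with a simple empirical failure rate per (location, maneuver) pair. That substitution, the assumption that each record carries a success flag, and every name and constant (`train_difficulty`, `announce_distance_m`, the distances in meters) are hypothetical and not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (location, maneuver)


def train_difficulty(records: Iterable[Tuple[str, str, bool]]) -> Dict[Key, float]:
    """Stand-in for model training: difficulty = observed failure rate.

    Each record is (location, maneuver, succeeded); the success flag is an
    assumed field used only for this illustration.
    """
    attempts: Dict[Key, int] = defaultdict(int)
    failures: Dict[Key, int] = defaultdict(int)
    for location, maneuver, succeeded in records:
        key = (location, maneuver)
        attempts[key] += 1
        if not succeeded:
            failures[key] += 1
    return {key: failures[key] / attempts[key] for key in attempts}


def announce_distance_m(model: Dict[Key, float], location: str, maneuver: str,
                        base_m: float = 100.0, extra_m: float = 200.0,
                        default_difficulty: float = 0.5) -> float:
    """Pick one instruction parameter (how early to announce) from the difficulty."""
    difficulty = model.get((location, maneuver), default_difficulty)
    # Harder maneuvers are announced earlier, i.e. at a greater distance.
    return base_m + extra_m * difficulty
```

Here the announcement distance is the "parameter of the navigation instruction" selected from the difficulty metric; timing, level of detail, or repetition could be chosen the same way.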
-
Publication No.: US12027160B2
Publication date: 2024-07-02
Application No.: US18074691
Application date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Niranjan Subrahmanya, Aishanee Shah
IPC: G10L15/22, G10L15/06, G10L15/197, G10L15/08
CPC classification number: G10L15/197, G10L15/063, G10L15/22, G10L2015/088, G10L2015/223
Abstract: Techniques are described herein for improving performance of machine learning model(s) and thresholds utilized in determining whether automated assistant function(s) are to be initiated. A method includes: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a secondary threshold that is less indicative of the one or more hotwords being present in the audio data than is a primary threshold; in response to determining that the predicted output satisfies the secondary threshold, prompting the user to indicate whether or not the spoken utterance includes a hotword; receiving, from the user, a response to the prompting; and adjusting the primary threshold based on the response.
-
Publication No.: US20220148601A1
Publication date: 2022-05-12
Application No.: US17114118
Application date: 2020-12-07
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Matthew Sharifi
Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
-
Publication No.: US12254888B2
Publication date: 2025-03-18
Application No.: US18373244
Application date: 2023-09-26
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Matthew Sharifi
Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
-
Publication No.: US11521604B2
Publication date: 2022-12-06
Application No.: US17011612
Application date: 2020-09-03
Applicant: Google LLC
Inventor: Aleks Kracun, Niranjan Subrahmanya, Aishanee Shah
IPC: G10L15/22, G10L15/197, G10L15/06, G10L15/08
Abstract: Techniques are described herein for improving performance of machine learning model(s) and thresholds utilized in determining whether automated assistant function(s) are to be initiated. A method includes: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a secondary threshold that is less indicative of the one or more hotwords being present in the audio data than is a primary threshold; in response to determining that the predicted output satisfies the secondary threshold, prompting the user to indicate whether or not the spoken utterance includes a hotword; receiving, from the user, a response to the prompting; and adjusting the primary threshold based on the response.
-
Publication No.: US20240355324A1
Publication date: 2024-10-24
Application No.: US18761117
Application date: 2024-07-01
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Niranjan Subrahmanya, Aishanee Shah
IPC: G10L15/197, G10L15/06, G10L15/08, G10L15/22
CPC classification number: G10L15/197, G10L15/063, G10L15/22, G10L2015/088, G10L2015/223
Abstract: Techniques are described herein for improving performance of machine learning model(s) and thresholds utilized in determining whether automated assistant function(s) are to be initiated. A method includes: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a secondary threshold that is less indicative of the one or more hotwords being present in the audio data than is a primary threshold; in response to determining that the predicted output satisfies the secondary threshold, prompting the user to indicate whether or not the spoken utterance includes a hotword; receiving, from the user, a response to the prompting; and adjusting the primary threshold based on the response.
-
Publication No.: US11776549B2
Publication date: 2023-10-03
Application No.: US17114118
Application date: 2020-12-07
Applicant: GOOGLE LLC
Inventor: Aleks Kracun, Matthew Sharifi
Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
-
Publication No.: US20220068268A1
Publication date: 2022-03-03
Application No.: US17011612
Application date: 2020-09-03
Applicant: Google LLC
Inventor: Aleks Kracun, Niranjan Subrahmanya, Aishanee Shah
IPC: G10L15/197, G10L15/06, G10L15/22
Abstract: Techniques are described herein for improving performance of machine learning model(s) and thresholds utilized in determining whether automated assistant function(s) are to be initiated. A method includes: receiving, via one or more microphones of a client device, audio data that captures a spoken utterance of a user; processing the audio data using a machine learning model to generate a predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a secondary threshold that is less indicative of the one or more hotwords being present in the audio data than is a primary threshold; in response to determining that the predicted output satisfies the secondary threshold, prompting the user to indicate whether or not the spoken utterance includes a hotword; receiving, from the user, a response to the prompting; and adjusting the primary threshold based on the response.