Patent search ap:("GOOGLE LLC") AND inv:"Dirk Padfield"

1.

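Granted invention patent
False suggestion detection for user-provided content (legal status: in force)

Publication (announcement) number: US12254874B2

Publication (announcement) date: 2025-03-18

Application number: US17676170

Application date: 2022-02-20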

Applicant: GOOGLE LLC

Inventor: Dirk Padfield, Noah Murad, Edward Lo, Bryan Huh

IPC: G10L15/187 , G06F40/166 , G10L15/02 , G10L15/06 , G10L15/22

Abstract: An automated speech recognition (ASR) transcript of at least a portion of a media content is obtained from an ASR tool. Suggested words are received for corrected words of the ASR transcript of the media content. Features are obtained using at least the suggested words or the corrected words. The features include features relating to sound similarities between the suggested words and the corrected words. The features are input into a machine learning (ML) model to obtain a determination regarding a validity of the suggested words. Responsive to the suggested words constituting a valid suggestion, the suggested words are incorporated into the ASR transcript. At least a portion of the ASR transcript is transmitted to a user device in conjunction with at least a portion of the media content.
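The flow described in this abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `phonetic_key` is a crude Soundex-style consonant code standing in for whatever sound-similarity features are actually used, and `is_valid_suggestion` is a hand-weighted threshold standing in for the trained ML model.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher

# Soundex-style consonant classes (an assumption for illustration only).
_CLASSES = {}
for letters, code in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                      ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CLASSES[ch] = code


def phonetic_key(text: str) -> str:
    """Map text to a crude sound code: vowels, h, w, y, spaces are dropped."""
    out = []
    for ch in text.lower():
        code = _CLASSES.get(ch)
        if code is None:
            continue
        if not out or out[-1] != code:  # collapse repeated codes
            out.append(code)
    return "".join(out)


@dataclass
class SuggestionFeatures:
    sound_similarity: float
    spelling_similarity: float
    length_ratio: float


def extract_features(corrected: str, suggested: str) -> SuggestionFeatures:
    """Compute features comparing the corrected words with the suggestion."""
    sound = SequenceMatcher(
        None, phonetic_key(corrected), phonetic_key(suggested)).ratio()
    spelling = SequenceMatcher(
        None, corrected.lower(), suggested.lower()).ratio()
    longer = max(len(corrected), len(suggested), 1)
    length_ratio = min(len(corrected), len(suggested)) / longer
    return SuggestionFeatures(sound, spelling, length_ratio)


def is_valid_suggestion(f: SuggestionFeatures, threshold: float = 0.6) -> bool:
    """Stand-in for the ML model: a fixed weighted score, not a trained one."""
    score = (0.6 * f.sound_similarity
             + 0.3 * f.spelling_similarity
             + 0.1 * f.length_ratio)
    return score >= threshold


def apply_suggestion(transcript: list[str], start: int, end: int,
                     suggested: list[str]) -> list[str]:
    """Replace transcript[start:end] only if the suggestion is judged valid."""
    corrected = " ".join(transcript[start:end])
    features = extract_features(corrected, " ".join(suggested))
    if is_valid_suggestion(features):
        return transcript[:start] + suggested + transcript[end:]
    return transcript
```

In this sketch, a suggestion that sounds like the words it replaces (for example "write" for "right") scores well on the sound feature, while an unrelated phrase does not; the weights and threshold are arbitrary choices for the example.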

2.

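Published invention application
False Suggestion Detection for User-Provided Content (legal status: under examination, published)

Publication (announcement) number: US20230267926A1

Publication (announcement) date: 2023-08-24

Application number: US17676170

Application date: 2022-02-20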

Applicant: GOOGLE LLC

Inventor: Dirk Padfield, Noah Murad, Edward Lo, Bryan Huh

IPC: G10L15/187 , G06F40/166 , G10L15/22 , G10L15/02 , G10L15/06

CPC classification number: G10L15/187 , G06F40/166 , G10L15/22 , G10L15/02 , G10L15/063 , G10L2015/025

Abstract: An automated speech recognition (ASR) transcript of at least a portion of a media content is obtained from an ASR tool. Suggested words are received for corrected words of the ASR transcript of the media content. Features are obtained using at least the suggested words or the corrected words. The features include features relating to sound similarities between the suggested words and the corrected words. The features are input into a machine learning (ML) model to obtain a determination regarding a validity of the suggested words. Responsive to the suggested words constituting a valid suggestion, the suggested words are incorporated into the ASR transcript. At least a portion of the ASR transcript is transmitted to a user device in conjunction with at least a portion of the media content.

3.

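Granted invention patent
Adaptive diarization model and user interface (legal status: in force)

Publication (announcement) number: US11710496B2

Publication (announcement) date: 2023-07-25

Application number: US17596861

Application date: 2019-07-01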

Applicant: Google LLC

Inventor: Aaron Donsbach, Dirk Padfield

IPC: G10L15/26 , G10L15/08 , G10L21/0308 , G06F3/0481 , G06F3/16 , G10L17/06 , G10L17/24 , G10L21/028

CPC classification number: G10L21/0308 , G06F3/0481 , G06F3/167 , G10L17/06 , G10L17/24 , G10L21/028

Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.
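The enroll, identify, and update cycle in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: each utterance is assumed to be already reduced to a fixed-length feature vector, and the hypothetical `CentroidDiarizer` uses a nearest-centroid rule in place of whatever diarization model the patent actually covers.

```python
from __future__ import annotations

import math


class CentroidDiarizer:
    """Toy speaker model: one running-mean centroid per speaker (assumption)."""

    def __init__(self) -> None:
        self._sums: dict[str, list[float]] = {}
        self._counts: dict[str, int] = {}

    def enroll(self, speaker: str, embedding: list[float]) -> None:
        """Add an utterance whose speaker is known from identity data."""
        if speaker not in self._sums:
            self._sums[speaker] = [0.0] * len(embedding)
            self._counts[speaker] = 0
        self._sums[speaker] = [
            s + x for s, x in zip(self._sums[speaker], embedding)]
        self._counts[speaker] += 1

    def centroid(self, speaker: str) -> list[float]:
        """Mean of all embeddings attributed to this speaker so far."""
        n = self._counts[speaker]
        return [s / n for s in self._sums[speaker]]

    def identify(self, embedding: list[float]) -> str:
        """Return the speaker whose centroid is closest (Euclidean)."""
        return min(
            self._sums,
            key=lambda spk: math.dist(self.centroid(spk), embedding))

    def identify_and_update(self, embedding: list[float]) -> str:
        """Label an utterance without identity data, then adapt the model."""
        speaker = self.identify(embedding)
        self.enroll(speaker, embedding)
        return speaker
```

A caller would enroll the first two utterances with their known speakers, then pass later utterances to `identify_and_update`, so each prediction also refines the centroid of the predicted speaker.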

4.

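Invention application
Adaptive Diarization Model and User Interface (legal status: in force)

Publication (announcement) number: US20220310109A1

Publication (announcement) date: 2022-09-29

Application number: US17596861

Application date: 2019-07-01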

Applicant: Google LLC

Inventor: Aaron Donsbach, Dirk Padfield

IPC: G10L21/0308 , G10L21/028 , G10L17/06 , G10L17/24 , G06F3/16 , G06F3/0481

Abstract: A computing device receives a first audio waveform representing a first utterance and a second utterance. The computing device receives identity data indicating that the first utterance corresponds to a first speaker and the second utterance corresponds to a second speaker. The computing device determines, based on the first utterance, the second utterance, and the identity data, a diarization model configured to distinguish between utterances by the first speaker and utterances by the second speaker. The computing device receives, exclusively of receiving further identity data indicating a source speaker of a third utterance, a second audio waveform representing the third utterance. The computing device determines, by way of the diarization model and independently of the further identity data of the first type, the source speaker of the third utterance. The computing device updates the diarization model based on the third utterance and the determined source speaker.
