Patent search ap:("Google LLC") AND inv:"Charles Caleb Peyser"

1.

Invention application
End-To-End Automated Speech Recognition on Numeric Sequences (under examination, published)

Publication (announcement) No.: US20200349922A1

Publication (announcement) date: 2020-11-05

Application No.: US16830996

Application date: 2020-03-26

Applicant: Google LLC

Inventor: Charles Caleb Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu

IPC: G10L15/06, G10L15/16, G10L15/22, G10L15/197, G10L13/04, G06N3/08

Abstract: A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain. The method also includes providing the final transcription representing the numeric sequence of the utterance in the written domain for output.

2.

Invention publication
PROPER NOUN RECOGNITION IN END-TO-END SPEECH RECOGNITION (under examination, published)

Publication (announcement) No.: US20230377564A1

Publication (announcement) date: 2023-11-23

Application No.: US18362273

Application date: 2023-07-31

Applicant: Google LLC

Inventor: Charles Caleb Peyser, Tara N. Sainath, Golan Pundak

IPC: G10L15/06, G06N3/049, G10L15/16, G10L15/18, G10L15/187

CPC classification number: G10L15/063, G06N3/049, G10L15/16, G10L15/1815, G10L15/187

Abstract: A method for training a speech recognition model with a minimum word error rate loss function includes receiving a training example comprising a proper noun and generating a plurality of hypotheses corresponding to the training example. Each hypothesis of the plurality of hypotheses represents the proper noun and includes a corresponding probability that indicates a likelihood that the hypothesis represents the proper noun. The method also includes determining that the corresponding probability associated with one of the plurality of hypotheses satisfies penalty criteria. The penalty criteria indicate that the corresponding probability satisfies a probability threshold and that the associated hypothesis incorrectly represents the proper noun. The method also includes applying a penalty to the minimum word error rate loss function.

3.

Invention publication
Rare Word Recognition with LM-aware MWER Training (under examination, published)

Publication (announcement) No.: US20230298570A1

Publication (announcement) date: 2023-09-21

Application No.: US18187222

Application date: 2023-03-21

Applicant: Google LLC

Inventor: Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prakash Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Charles Caleb Peyser, Trevor Strohman, Yangzhang He, David Rybach

IPC: G10L15/06, G10L15/19, G10L15/22, G10L15/16, G10L15/02

CPC classification number: G10L15/063, G10L15/19, G10L15/22, G10L15/16, G10L15/02

Abstract: A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum additive error rate training in the presence of the external language model.

4.

Invention application
Proper Noun Recognition in End-to-End Speech Recognition (in force)

Publication (announcement) No.: US20210233512A1

Publication (announcement) date: 2021-07-29

Application No.: US17150491

Application date: 2021-01-15

Applicant: Google LLC

Inventor: Charles Caleb Peyser, Tara N. Sainath, Golan Pundak

IPC: G10L15/06, G06N3/04, G10L15/16, G10L15/18, G10L15/187

Abstract: A method for training a speech recognition model with a minimum word error rate loss function includes receiving a training example comprising a proper noun and generating a plurality of hypotheses corresponding to the training example. Each hypothesis of the plurality of hypotheses represents the proper noun and includes a corresponding probability that indicates a likelihood that the hypothesis represents the proper noun. The method also includes determining that the corresponding probability associated with one of the plurality of hypotheses satisfies penalty criteria. The penalty criteria indicate that the corresponding probability satisfies a probability threshold and that the associated hypothesis incorrectly represents the proper noun. The method also includes applying a penalty to the minimum word error rate loss function.

5.

Invention publication
Joint Segmenting and Automatic Speech Recognition (under examination, published)

Publication (announcement) No.: US20230343332A1

Publication (announcement) date: 2023-10-26

Application No.: US18304064

Application date: 2023-04-20

Applicant: Google LLC

Inventor: Ronny Huang, Shuo-yiin Chang, David Rybach, Rohit Prakash Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Charles Caleb Peyser, Zhiyun Lu

IPC: G10L15/04, G10L25/93, G10L15/197, G10L15/06, G10L15/22, G10L15/02

CPC classification number: G10L15/197, G10L15/02, G10L15/04, G10L15/063, G10L15/22, G10L25/93, G10L2015/025, G10L2025/932

Abstract: A joint segmenting and ASR model includes an encoder and a decoder. The encoder is configured to receive a sequence of acoustic frames characterizing one or more utterances and to generate, at each output step, a higher order feature representation for a corresponding acoustic frame. The decoder is configured to receive the higher order feature representation and to generate, at each output step, a probability distribution over possible speech recognition hypotheses and an indication of whether the corresponding output step corresponds to an end of speech segment. The joint segmenting and ASR model is trained on a set of training samples, each training sample including audio data characterizing a spoken utterance and a corresponding transcription of the spoken utterance, the corresponding transcription having an end of speech segment ground truth token inserted into the corresponding transcription automatically based on a set of heuristic-based rules and exceptions applied to the training sample.

6.

Invention grant
Proper noun recognition in end-to-end speech recognition (in force)

Publication (announcement) No.: US11749259B2

Publication (announcement) date: 2023-09-05

Application No.: US17150491

Application date: 2021-01-15

Applicant: Google LLC

Inventor: Charles Caleb Peyser, Tara N. Sainath, Golan Pundak

IPC: G10L15/06, G10L15/16, G10L15/18, G10L15/187, G06N3/049

CPC classification number: G10L15/063, G06N3/049, G10L15/16, G10L15/187, G10L15/1815

Abstract: A method for training a speech recognition model with a minimum word error rate loss function includes receiving a training example comprising a proper noun and generating a plurality of hypotheses corresponding to the training example. Each hypothesis of the plurality of hypotheses represents the proper noun and includes a corresponding probability that indicates a likelihood that the hypothesis represents the proper noun. The method also includes determining that the corresponding probability associated with one of the plurality of hypotheses satisfies penalty criteria. The penalty criteria indicate that the corresponding probability satisfies a probability threshold and that the associated hypothesis incorrectly represents the proper noun. The method also includes applying a penalty to the minimum word error rate loss function.

7.

Invention grant
End-to-end automated speech recognition on numeric sequences (in force)

Publication (announcement) No.: US11367432B2

Publication (announcement) date: 2022-06-21

Application No.: US16830996

Application date: 2020-03-26

Applicant: Google LLC

Inventor: Charles Caleb Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu

IPC: G10L15/22, G10L15/26, G10L15/06, G06N3/08, G10L13/00, G10L15/16, G10L15/197

Abstract: A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain. The method also includes providing the final transcription representing the numeric sequence of the utterance in the written domain for output.
