-
Publication No.: US12182498B1
Publication Date: 2024-12-31
Application No.: US17810302
Application Date: 2022-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Monica Lakshmi Sunkara, Deepthi Devaiah Devanira, Chaitanya Shivade, Sravan Babu Bodapati, Katrin Kirchhoff, Srikanth Ronanki
IPC: G06F40/166, G06F21/62, G06F40/279, G10L15/16, G10L15/22
Abstract: Portions of text data generated from inverse text normalization may be redacted. Text data for redaction may be obtained. One or more inverse text normalization models may be applied to the text data to generate normalized text data. A machine learning model, trained to recognize text for redaction, may be applied to identify portions of the normalized text data for redaction. The identified portions may be redacted and the redacted normalized text provided to a destination.
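Illustration (not part of the patent record): the flow in this abstract, normalize first and then redact, can be sketched as below. Everything in the sketch is an assumption made for illustration: the digit-word lookup stands in for the inverse text normalization models, and the regular expression stands in for the trained machine learning model that recognizes text for redaction. The function names are hypothetical.

```python
import re

# Hypothetical toy inverse text normalization (ITN) table: spoken digit words -> digits.
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}


def inverse_text_normalize(text):
    """Convert spoken digit words to digits and merge runs of adjacent digits."""
    out = []
    for tok in text.split():
        digit = DIGITS.get(tok.lower())
        if digit is not None and out and out[-1].isdigit():
            out[-1] += digit          # extend the current digit run
        elif digit is not None:
            out.append(digit)         # start a new digit run
        else:
            out.append(tok)           # leave every other token unchanged
    return " ".join(out)


def find_spans_to_redact(text, min_digits=4):
    """Stand-in for the trained recognizer: flag digit runs of min_digits or more."""
    return [m.span() for m in re.finditer(r"\d{%d,}" % min_digits, text)]


def redact(text, spans, mask="[REDACTED]"):
    """Replace each (start, end) character span with the mask."""
    pieces, cursor = [], 0
    for start, end in sorted(spans):
        pieces.append(text[cursor:start])
        pieces.append(mask)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def redact_transcript(text):
    """Normalize first, then identify and redact spans in the normalized text."""
    normalized = inverse_text_normalize(text)
    return redact(normalized, find_spans_to_redact(normalized))


print(redact_transcript("my pin is one two three four thanks"))
```

Running the recognizer on the normalized text rather than on the raw transcript is the point of the ordering: the digit run only exists as a single span after normalization.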
-
Publication No.: US12250180B1
Publication Date: 2025-03-11
Application No.: US17393124
Application Date: 2021-08-03
Applicant: Amazon Technologies, Inc.
Inventor: Sravan Babu Bodapati, Ashish Vishwanath Shenoy, Monica Lakshmi Sunkara, Katrin Kirchhoff, Anubhav Mishra, Harshal Pimpalkhute, John Baker, Ganesh Kumar Gella
IPC: H04L51/02, G10L15/197, G10L15/22
Abstract: Techniques for at least the generation of a chatbot built from a custom vocabulary and to use runtime hints during inference are described. In some examples, the generation of the chatbot includes receiving a request to build a chatbot using a bot definition and a custom vocabulary, wherein the chatbot is to use runtime hints during usage; building the chatbot from the bot definition and custom vocabulary by at least: generating automatic speech recognition (ASR) artifacts to be used in decoding audio input into the chatbot into text for at least one other component of the chatbot to use in determining a next act to be performed, the ASR artifacts including artifacts that use the custom vocabulary and artifacts that do not use the custom vocabulary, and storing the ASR artifacts.
-
Publication No.: US20250086380A1
Publication Date: 2025-03-13
Application No.: US18957409
Application Date: 2024-11-22
Applicant: Amazon Technologies, Inc.
Inventor: Monica Lakshmi Sunkara, Deepthi Devaiah Devanira, Chaitanya Shivade, Sravan Babu Bodapati, Katrin Kirchhoff, Srikanth Ronanki
IPC: G06F40/166, G06F21/62, G06F40/279, G10L15/16, G10L15/22
Abstract: Portions of text data generated from inverse text normalization may be redacted. Text data for redaction may be obtained. One or more inverse text normalization models may be applied to the text data to generate normalized text data. A machine learning model, trained to recognize text for redaction, may be applied to identify portions of the normalized text data for redaction. The identified portions may be redacted and the redacted normalized text provided to a destination.
-
Publication No.: US12198681B1
Publication Date: 2025-01-14
Application No.: US17937297
Application Date: 2022-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Monica Lakshmi Sunkara, Srikanth Ronanki, Sravan Babu Bodapati, Jeffrey John Farris, Katrin Kirchhoff, Vivek Govindan, Yide Zou, Mohit Narendra Gupta, Silviu Mihai Burz
Abstract: Techniques for personalized batch and streaming speech-to-text transcription of audio reduce the error rate of automatic speech recognition (ASR) systems in transcribing rare and out-of-vocabulary words. The techniques achieve personalization of connectionist temporal classification (CTC) models by using adaptive boosting to perform biasing at the level of sub-words. In addition to boosting, the techniques encompass a phone alignment network to bias sub-word predictions towards rare long-tail words and out-of-vocabulary words. A technical benefit of the techniques is that the accuracy of speech-to-text transcription of rare and out-of-vocabulary words in a custom vocabulary by an automatic speech recognition (ASR) system can be improved without having to train the ASR system on the custom vocabulary. Instead, the techniques allow the same ASR system trained on a base vocabulary to realize the accuracy improvements for different custom vocabularies spanning different domains.
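Illustration (not part of the patent record): the general idea of biasing a CTC model at the sub-word level without retraining can be sketched as a score adjustment applied at decoding time. The sketch below is a deliberate simplification and an assumption throughout: it adds a fixed bonus to selected sub-word ids and uses greedy decoding, whereas the abstract describes adaptive boosting and a phone alignment network whose details are not given here. The vocabulary, scores, and function names are hypothetical.

```python
BLANK = 0  # index of the CTC blank symbol in this toy vocabulary


def boost_log_probs(frames, boosted_ids, bonus):
    """Add a fixed bonus to the scores of boosted sub-word ids in every frame."""
    return [
        [score + bonus if idx in boosted_ids else score
         for idx, score in enumerate(frame)]
        for frame in frames
    ]


def ctc_greedy_decode(frames):
    """Pick the best id per frame, collapse repeats, then drop blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frames]
    decoded, prev = [], None
    for idx in best:
        if idx != BLANK and idx != prev:
            decoded.append(idx)
        prev = idx
    return decoded


# Hypothetical sub-word vocabulary and per-frame log-scores from a base model.
vocab = ["<blank>", "ka", "ca", "trin"]
frames = [
    [-3.0, -1.2, -1.0, -4.0],  # base model slightly prefers "ca" over "ka"
    [-0.1, -3.0, -3.0, -3.0],  # blank frame
    [-3.0, -4.0, -4.0, -0.2],  # "trin"
]

# Sub-word ids of a custom-vocabulary word, here assumed to be "ka" + "trin".
custom_ids = {1, 3}

baseline = ctc_greedy_decode(frames)
biased = ctc_greedy_decode(boost_log_probs(frames, custom_ids, bonus=0.5))
print([vocab[i] for i in baseline], [vocab[i] for i in biased])
```

The base model scores are never modified, only the scores seen by the decoder, which is why the same model can serve different custom vocabularies.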
-
Publication No.: US11580965B1
Publication Date: 2023-02-14
Application No.: US16938783
Application Date: 2020-07-24
Applicant: Amazon Technologies, Inc.
Inventor: Monica Lakshmi Sunkara, Srikanth Ronanki, Dhanush Bekal Kannangola, Sravan Babu Bodapati, Katrin Kirchhoff
Abstract: Techniques for predicting punctuation and casing using multimodal fusion are described. An exemplary method includes processing generated text by: tokenizing the generated text into sub-words, and generating a sequence of lexical features for the sub-words using a pre-trained lexical encoder; processing audio of the audio by: generating a sequence of frame level acoustic embeddings using a pre-trained acoustic encoder on the audio, and generating task specific embeddings from the frame level acoustic embeddings; performing multimodal fusion of the sub-word level acoustic embeddings and the sequence of lexical features by: aligning the task specific embeddings to the sequence of lexical features, and combining the sequence of lexical features and aligned acoustic sequence; predicting punctuation and casing from the combined sequence of lexical features and aligned acoustic sequence; concatenating the sub-words of the text, and applying the predicted punctuation and casing; and outputting text having the predicted punctuation and casing.
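Illustration (not part of the patent record): the fusion step in this abstract, aligning acoustic embeddings to the sub-word sequence and combining them with lexical features, can be sketched as below. The abstract does not say how the alignment is obtained or how the sequences are combined, so the sketch assumes that a frame span per sub-word is already known, mean-pools the frames in each span, and concatenates the result with the lexical feature vector. All names and values are hypothetical.

```python
def mean_pool(frames):
    """Average a non-empty list of equal-length frame vectors."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def fuse(lexical_feats, acoustic_frames, spans):
    """Align frames to sub-words via (start, end) spans, then concatenate.

    lexical_feats:   one feature vector per sub-word
    acoustic_frames: one embedding vector per audio frame
    spans:           assumed frame range [start, end) for each sub-word
    """
    if len(lexical_feats) != len(spans):
        raise ValueError("need exactly one frame span per sub-word")
    fused = []
    for lex, (start, end) in zip(lexical_feats, spans):
        if start >= end:
            raise ValueError("each sub-word needs at least one frame")
        pooled = mean_pool(acoustic_frames[start:end])  # sub-word level acoustic vector
        fused.append(list(lex) + pooled)                # combined feature vector
    return fused


# Two sub-words with 2-dim lexical features, three frames of 1-dim acoustic embeddings.
lexical = [[1.0, 0.0], [0.0, 1.0]]
acoustic = [[2.0], [4.0], [6.0]]
fused = fuse(lexical, acoustic, spans=[(0, 2), (2, 3)])
print(fused)
```

A classifier for punctuation and casing would then run over one fused vector per sub-word; that stage is omitted here.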
-