-
Publication No.: US11710479B1
Publication date: 2023-07-25
Application No.: US17218813
Filing date: 2021-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Ashish Vishwanath Shenoy, Sravan Babu Bodapati, Katrin Kirchhoff
IPC: G10L15/183, G10L15/06, G10L15/18, H04L51/02
CPC classification number: G10L15/183, G10L15/063, G10L15/1815, H04L51/02
Abstract: Techniques for implementing a chatbot that utilizes context embeddings are described. An exemplary method includes determining a next turn by: applying a language model to the utterance to determine a probability of a sequence of words; generating a context embedding for the utterance based at least on one or more of: a dialog act as defined by a chatbot definition of the chatbot, a topic vector identifying a domain of the chatbot, a previous chatbot response, and one or more slot options; performing neural language model rescoring using the determined probability of a sequence of words as a word embedding and the generated context embedding to predict a hypothesis; determining at least a name of a slot and type to be fulfilled based at least in part on the hypothesis and the chatbot definition; and determining a next turn based at least in part on the chatbot definition, any previous state, and the name of the slot and type to be fulfilled.
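Illustration (not from the patent): the rescoring step can be pictured as n-best rescoring by score interpolation. The sketch below is a toy under stated assumptions. The function names `rescore` and `slot_option_score`, the interpolation weight, and the keyword scorer are all hypothetical; the keyword scorer merely stands in for a neural language model conditioned on a context embedding.

```python
def slot_option_score(text, slot_options):
    """Toy context scorer (assumption): count words that match a slot option."""
    return sum(1 for word in text.split() if word in slot_options)


def rescore(nbest, context_score, weight=0.5):
    """Pick the best hypothesis from an n-best list.

    nbest: list of (text, first_pass_log_prob) pairs.
    context_score: callable mapping a hypothesis text to a context score.
    weight: hypothetical interpolation weight for the context score.
    """
    return max(nbest, key=lambda h: h[1] + weight * context_score(h[0]))[0]


slot_options = {"flight", "hotel"}
nbest = [("book a fright", -1.0), ("book a flight", -1.2)]
best = rescore(nbest, lambda t: slot_option_score(t, slot_options))
```

Here the first-pass scores alone would favor "book a fright"; adding the context term lets the hypothesis that matches a slot option win.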
-
Publication No.: US11531846B1
Publication date: 2022-12-20
Application No.: US16587471
Filing date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Sravan Babu Bodapati, Rishita Rajal Anubhai, Pu Paul Zhao, Katrin Kirchhoff
Abstract: Techniques for extending sensitive data tagging without reannotating training data are described. A method for extending sensitive data tagging without reannotating training data may include hosting a plurality of models at a model endpoint in a machine learning service, each model trained to identify a different sensitive data type in a transcript of content, adding a new model to the model endpoint, the new model trained to identify a new sensitive data entity in the transcript of content, identifying sensitive entities in the transcript by each of the plurality of models and the new model, merging inference responses generated by each of the plurality of models and the new model using at least one inference policy, and returning a merged inference response identifying a plurality of sensitive entities in the transcript.
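Illustration (not from the patent): merging inference responses under an inference policy might look like the sketch below. It assumes that each model returns character spans with a label and a confidence score, and that the policy is "when spans overlap, keep the higher-scoring one". The `Entity` type, the `merge_responses` function, and the policy itself are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int    # inclusive character offset in the transcript
    end: int      # exclusive character offset
    label: str    # sensitive data type, e.g. "SSN"
    score: float  # model confidence


def overlaps(a, b):
    """True when the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def merge_responses(responses):
    """Merge per-model entity lists; on overlap keep the higher-scoring span."""
    candidates = sorted(
        (entity for response in responses for entity in response),
        key=lambda e: -e.score,
    )
    kept = []
    for entity in candidates:
        if not any(overlaps(entity, other) for other in kept):
            kept.append(entity)
    # Return entities in transcript order.
    return sorted(kept, key=lambda e: e.start)


existing_model = [Entity(0, 5, "NAME", 0.9)]
new_model = [Entity(3, 8, "SSN", 0.6), Entity(10, 14, "PHONE", 0.8)]
merged = merge_responses([existing_model, new_model])
```

Because the policy runs only at merge time, a new model can be added at the endpoint without retraining or reannotating data for the existing ones, which is the property the abstract emphasizes.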
-
Publication No.: US20250111850A1
Publication date: 2025-04-03
Application No.: US18977703
Filing date: 2024-12-11
Applicant: Amazon Technologies, Inc.
Inventor: John Baker, Anubhav Mishra, Bangrui Liu, Christopher Michael Hittner, Sravan Babu Bodapati, Harshal Pimpalkhute, Katrin Kirchhoff, Anuj Gautam Surana, Yilai Su, Brandon Louis Mendez, Chengshun Zhang
IPC: G10L15/22, G10L13/027, G10L15/08
Abstract: A set of alternative vocal input styles for specifying a parameter of a dialog-driven application is determined. During execution of the application, an audio prompt requesting input in one of the styles is presented. A value of the parameter is determined by applying a collection of analysis tools to vocal input obtained after the prompt is presented. A task of the application is initiated using the value.
-
Publication No.: US12250180B1
Publication date: 2025-03-11
Application No.: US17393124
Filing date: 2021-08-03
Applicant: Amazon Technologies, Inc.
Inventor: Sravan Babu Bodapati, Ashish Vishwanath Shenoy, Monica Lakshmi Sunkara, Katrin Kirchhoff, Anubhav Mishra, Harshal Pimpalkhute, John Baker, Ganesh Kumar Gella
IPC: H04L51/02, G10L15/197, G10L15/22
Abstract: Techniques are described for at least generating a chatbot built from a custom vocabulary and using runtime hints during inference. In some examples, the generation of the chatbot includes receiving a request to build a chatbot using a bot definition and a custom vocabulary, wherein the chatbot is to use runtime hints during usage; building the chatbot from the bot definition and custom vocabulary by at least: generating automatic speech recognition (ASR) artifacts to be used in decoding audio input into the chatbot into text for at least one other component of the chatbot to use in determining a next act to be performed, the ASR artifacts including artifacts that use the custom vocabulary and artifacts that do not use the custom vocabulary, and storing the ASR artifacts.
-
Publication No.: US20250005298A1
Publication date: 2025-01-02
Application No.: US18344742
Filing date: 2023-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Saket Dingliwal, Karthik Gopalakrishnan, Sravan Babu Bodapati, Sarthak Handa, Katrin Kirchhoff
Abstract: Pairs of text collections are obtained. An individual pair comprises (a) a source text collection which includes a first group of text sequences and (b) an annotated analysis result of the source text collection, comprising a second group of text sequences and a set of evidence mappings generated by an evidence mapping model. An evidence mapping indicates, for a particular text sequence of the second group, another text sequence of the first group which provides evidence for the particular text sequence. A quality metric of the model is obtained using an automated evaluation methodology in which a question is generated from the particular text sequence, and an analysis of a pair of answers (including an answer generated using an evidence mapping) to the question is performed. The quality metric is provided via a programmatic interface.
-
Publication No.: US20240428002A1
Publication date: 2024-12-26
Application No.: US18339749
Filing date: 2023-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Aparna Elangovan, Lei Xu, Devang Kulshreshtha, Sravan Babu Bodapati, Katrin Kirchhoff, Sarthak Handa
Abstract: A medical audio summarization service receives a medical conversation and an indication of a user preferred summarization style selected from a plurality of available summarization styles to generate a medical summary that conforms to the user preferred summarization style. A transcript is generated via a medical audio transcription service, and the transcript is used by a natural language processing engine (including a large language model) to generate the medical summary. The large language model is trained to be used to generate medical summaries that conform to respective ones of a plurality of user preferred summarization styles. The large language model is trained using training data comprising previously generated summaries and summary interaction metadata generated from user edits and/or feedback.
-
Publication No.: US11580968B1
Publication date: 2023-02-14
Application No.: US16455165
Filing date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Arshit Gupta, Peng Zhang, Rashmi Gangadharaiah, Garima Lalwani, Roger Scott Jenke, Hassan Sawaf, Mona Diab, Katrin Kirchhoff, Adel A. Youssef, Kalpesh N. Sutaria
Abstract: Techniques are described for a contextual natural language understanding (cNLU) framework that is able to incorporate contextual signals of variable history length to perform joint intent classification (IC) and slot labeling (SL) tasks. A user utterance provided by a user within a multi-turn chat dialog between the user and a conversational agent is received. The user utterance and contextual information associated with one or more previous turns of the multi-turn chat dialog is provided to a machine learning (ML) model. An intent classification and one or more slot labels for the user utterance are then obtained from the ML model. The cNLU framework described herein thus uses, in addition to a current utterance itself, various contextual signals as input to a model to generate IC and SL predictions for each utterance of a multi-turn chat dialog.
-
Publication No.: US11562735B1
Publication date: 2023-01-24
Application No.: US16836130
Filing date: 2020-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Arshit Gupta, Julian E. S. Salazar, Peng Zhang, Katrin Kirchhoff, Yi Zhang
IPC: G10L15/18, G10L15/197, G10L15/26
Abstract: A spoken language understanding (SLU) system may include an automatic speech recognizer (ASR), an audio feature extractor, an optional synchronizer and a language understanding module. The ASR may produce a first set of input data representing transcripts of utterances. The audio feature extractor may produce a second set of input data representing audio features of the utterances, in particular, non-transcript specific characteristics of the speaker in one or more portions of the utterances. The two sets of input data may be provided for the language understanding module to predict intents and slot labels for the utterances. The SLU system may use the optional synchronizer to align the two sets of input data before providing them to the language understanding module.
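Illustration (not from the patent): one simple way a synchronizer could align the two inputs is to average the frame-level audio features that fall inside each word's time span. The sketch assumes that the ASR supplies word timestamps and that features are extracted at a fixed hop; the function name `align_features`, the hop parameter, and the mean pooling are hypothetical choices, not the patented method.

```python
def align_features(words, frames, hop_seconds=0.01):
    """Pool frame-level features into one vector per transcript word.

    words: list of (token, start_sec, end_sec) from the recognizer.
    frames: non-empty list of equal-length feature vectors, one per hop.
    Returns a list of (token, mean_feature_vector) pairs.
    """
    dim = len(frames[0])
    aligned = []
    for token, start, end in words:
        lo = int(round(start / hop_seconds))
        hi = max(lo + 1, int(round(end / hop_seconds)))  # at least one frame
        span = frames[lo:hi]
        if span:
            mean = [sum(f[i] for f in span) / len(span) for i in range(dim)]
        else:
            mean = [0.0] * dim  # word lies beyond the extracted frames
        aligned.append((token, mean))
    return aligned


frames = [[1.0], [3.0], [5.0], [7.0]]
words = [("hi", 0.0, 2.0), ("there", 2.0, 4.0)]
aligned = align_features(words, frames, hop_seconds=1.0)
```

After this step each token carries one audio feature vector, so a language understanding module can consume the transcript and the audio features position by position.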
-
Publication No.: US11551695B1
Publication date: 2023-01-10
Application No.: US15931455
Filing date: 2020-05-13
Applicant: Amazon Technologies, Inc.
Inventor: Vivek Govindan, Varun Sembium Varadarajan, Christian Egon Berkhoff Dossow, Himalay Mohanlal Joriwal, Sai Madhuri Bhavirisetty, Abhinav Kumar, Orestis Lykouropoulos, Akshay Nalwaya, Rahul Gupta, Sravan Babu Bodapati, Liangwei Guo, Julian E. S. Salazar, Yibin Wang, K P N V D S Siva Rama, Calvin Xuan Li, Mohit Narendra Gupta, Asem Rustum, Katrin Kirchhoff, Pu Zhao
Abstract: A transcription service may receive a request from a developer to build a custom speech-to-text model for a specific domain of speech. The custom speech-to-text model for the specific domain may replace a general speech-to-text model or add to a set of one or more speech-to-text models available for transcribing speech. The transcription service may receive a training data set and instructions representing tasks. The transcription service may determine respective schedules for executing the instructions based at least in part on dependencies between the tasks. The transcription service may execute the instructions according to the respective schedules to train a speech-to-text model for a specific domain using the training data set. The transcription service may deploy the trained speech-to-text model as part of a network-accessible service for an end user to convert audio in the specific domain into text.
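Illustration (not from the patent): scheduling instructions from task dependencies can be sketched as a layered topological sort, in which every task in a wave has all of its prerequisites finished in earlier waves. The function name `schedule` and the task names are hypothetical, and the abstract does not say that the service uses this particular algorithm.

```python
def schedule(dependencies):
    """Group tasks into waves that respect their dependencies.

    dependencies: dict mapping each task to the set of tasks it depends on.
    Returns a list of waves; tasks inside one wave could run in parallel.
    """
    # Copy the input and register tasks that appear only as prerequisites.
    pending = {task: set(deps) for task, deps in dependencies.items()}
    for deps in dependencies.values():
        for dep in deps:
            pending.setdefault(dep, set())

    waves = []
    while pending:
        ready = sorted(task for task, deps in pending.items() if not deps)
        if not ready:
            raise ValueError("circular dependency among tasks")
        waves.append(ready)
        for task in ready:
            del pending[task]
        for deps in pending.values():
            deps.difference_update(ready)
    return waves


deps = {
    "train_acoustic": {"prepare_data"},
    "train_lm": {"prepare_data"},
    "evaluate": {"train_acoustic", "train_lm"},
}
waves = schedule(deps)
```

In this example the two training tasks land in the same wave because neither depends on the other, while evaluation waits for both.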