-
Publication No.: US20250005063A1
Publication Date: 2025-01-02
Application No.: US18344739
Filing Date: 2023-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Devang Kulshreshtha , Saket Dingliwal , Sravan Babu Bodapati , Katrin Kirchhoff , Sarthak Handa
IPC: G06F16/34 , G06F40/169 , G06F40/40
Abstract: Pairs of text collections are obtained. An individual pair comprises (a) a source text collection which includes a first group of text sequences and (b) an annotated analysis result of the source text collection, comprising a second group of text sequences and a set of evidence mappings generated by an evidence mapping model. An evidence mapping indicates, for a particular text sequence of the second group, another text sequence of the first group which provides evidence for the particular text sequence. A quality metric of the model is obtained using an automated evaluation methodology in which a question is generated from the particular text sequence, and an analysis of a pair of answers (including an answer generated using an evidence mapping) to the question is performed. The quality metric is provided via a programmatic interface.
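Illustrative sketch (not part of the patent record): the question-based evaluation described in the abstract might be approximated as follows. The function name `evidence_quality`, the `make_question` and `answer` callables, and the exact-match comparison are all hypothetical. The abstract does not say where the second answer of the pair comes from; this sketch assumes it is answered from the summary sentence itself.

```python
from typing import Callable, Dict, List


def evidence_quality(
    source: List[str],
    summary: List[str],
    mapping: Dict[int, int],
    make_question: Callable[[str], str],
    answer: Callable[[str, str], str],
) -> float:
    """Fraction of evidence mappings whose two answers agree.

    mapping[i] = j means source[j] is claimed as evidence for summary[i].
    make_question and answer are placeholders for models (assumptions).
    """
    if not mapping:
        return 0.0
    agree = 0
    for s_idx, e_idx in mapping.items():
        question = make_question(summary[s_idx])
        # Assumption: the reference answer comes from the summary sentence.
        ref_answer = answer(question, summary[s_idx])
        # Answer produced using only the mapped evidence sentence.
        ev_answer = answer(question, source[e_idx])
        if ref_answer.strip().lower() == ev_answer.strip().lower():
            agree += 1
    return agree / len(mapping)
```

A real system would presumably use learned question-generation and question-answering models and a softer answer comparison than exact string match.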
-
Publication No.: US20250029603A1
Publication Date: 2025-01-23
Application No.: US18356116
Filing Date: 2023-07-20
Applicant: Amazon Technologies, Inc.
Inventor: Karthik Gopalakrishnan , Sravan Babu Bodapati , Katrin Kirchhoff , Sarthak Handa
IPC: G10L15/183 , G10L15/06
Abstract: Domain specialty instructions may be generated for performing text analysis tasks. An input text may be received for performing a text analysis task. A domain specialty may be identified for the input text. Specialty domain identifiers may be inserted as part of generating instructions to perform the text analysis task using a pre-trained large language model fine-tuned to a domain that includes multiple domain specialties. The pre-trained large language model may perform the text analysis task on the input text using the generated instructions. A result of the text analysis task performed on the input text may be provided.
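Illustrative sketch (not part of the patent record): one way to picture inserting a specialty identifier into generated instructions. The keyword table, the `[SPECIALTY: ...]` tag format, and both function names are hypothetical. The abstract does not specify how the specialty is identified, so a toy keyword-overlap heuristic stands in here.

```python
# Hypothetical specialty vocabulary; a real system would likely use a classifier.
SPECIALTY_KEYWORDS = {
    "cardiology": {"heart", "ecg", "arrhythmia"},
    "orthopedics": {"fracture", "knee", "bone"},
}


def identify_specialty(text: str) -> str:
    """Pick the specialty whose keywords overlap the input most (toy heuristic)."""
    words = set(text.lower().split())
    scores = {name: len(words & kws) for name, kws in SPECIALTY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to a generic label when no keyword matches.
    return best if scores[best] > 0 else "general"


def build_instructions(task: str, text: str) -> str:
    """Prefix the instructions with a specialty identifier (assumed tag format)."""
    specialty = identify_specialty(text)
    return f"[SPECIALTY: {specialty}]\nTask: {task}\nInput: {text}"
```

The resulting string would then be passed to the fine-tuned model; that call is omitted because the abstract names no specific model interface.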
-
Publication No.: US20240331821A1
Publication Date: 2024-10-03
Application No.: US18194350
Filing Date: 2023-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Vijit Gupta , Matthew Chih-Hui Chiou , Amiya Kishor Chakraborty , Anuroop Arora , Varun Sembium Varadarajan , Sarthak Handa , Amit Vithal Sawant , Glen Herschel Carpenter , Jesse Deng , Mohit Narendra Gupta , Rohil Bhattarai , Samuel Benjamin Schiff , Shane Michael McGookey , Tianze Zhang
Abstract: Systems and methods for performing medical audio summarizing for medical conversations are disclosed. An audio file and metadata for a medical conversation are provided to a medical audio summarization system. A transcription machine learning model is used by the medical audio summarization system to generate a transcript, and a natural language processing service of the medical audio summarization system is used to generate a summary of the transcript. The natural language processing service may include at least four machine learning models that identify medical entities in the transcript, identify speaker roles in the transcript, determine sections of the transcript corresponding to the summary, and extract or abstract phrases for the summary. The identified medical entities and speaker roles, determined sections, and extracted or abstracted phrases may then be used to generate the summary.
-
Publication No.: US20250029612A1
Publication Date: 2025-01-23
Application No.: US18356117
Filing Date: 2023-07-20
Applicant: Amazon Technologies, Inc.
Inventor: Lei Xu , Aparna Elangovan , Rohit Paturi , Sundararajan Srinivasan , Sravan Babu Bodapati , Katrin Kirchhoff , Sarthak Handa
Abstract: Transcript generation as part of automatic speech recognition may be guided using section types. Audio data is received for transcription. An initial transcript of the audio data may be generated and evaluated to determine a section type for the audio data. The section type may then be used to focus generation of a second version of the transcript on one speaker over another speaker.
-
Publication No.: US20250005298A1
Publication Date: 2025-01-02
Application No.: US18344742
Filing Date: 2023-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Saket Dingliwal , Karthik Gopalakrishnan , Sravan Babu Bodapati , Sarthak Handa , Katrin Kirchhoff
Abstract: Pairs of text collections are obtained. An individual pair comprises (a) a source text collection which includes a first group of text sequences and (b) an annotated analysis result of the source text collection, comprising a second group of text sequences and a set of evidence mappings generated by an evidence mapping model. An evidence mapping indicates, for a particular text sequence of the second group, another text sequence of the first group which provides evidence for the particular text sequence. A quality metric of the model is obtained using an automated evaluation methodology in which a question is generated from the particular text sequence, and an analysis of a pair of answers (including an answer generated using an evidence mapping) to the question is performed. The quality metric is provided via a programmatic interface.
-
Publication No.: US20240428002A1
Publication Date: 2024-12-26
Application No.: US18339749
Filing Date: 2023-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Aparna Elangovan , Lei Xu , Devang Kulshreshtha , Sravan Babu Bodapati , Katrin Kirchhoff , Sarthak Handa
Abstract: A medical audio summarization service receives a medical conversation and an indication of a user preferred summarization style selected from a plurality of available summarization styles to generate a medical summary that conforms to the user preferred summarization style. A transcript is generated via a medical audio transcription service, and the transcript is used by a natural language processing engine (including a large language model) to generate the medical summary. The large language model is trained to be used to generate medical summaries that conform to respective ones of a plurality of user preferred summarization styles. The large language model is trained using training data comprising previously generated summaries and summary interaction metadata generated from user edits and/or feedback.