DOMAIN SPECIALTY INSTRUCTION GENERATION FOR TEXT ANALYSIS TASKS

    Publication No.: US20250029603A1

    Publication date: 2025-01-23

    Application No.: US18356116

    Filing date: 2023-07-20

    Abstract: Domain specialty instructions may be generated for performing text analysis tasks. An input text may be received for performing a text analysis task. A domain specialty may be identified for the input text. Specialty domain identifiers may be inserted as part of generating instructions to perform the text analysis task using a pre-trained large language model fine-tuned to a domain that includes multiple domain specialties. The pre-trained large language model may perform the text analysis task on the input text using the generated instructions. A result of the text analysis task performed on the input text may be provided.
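The flow described in this abstract (identify a specialty, then insert a specialty identifier into the generated instructions) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the abstract does not say how the specialty is identified or what the identifier looks like, so the keyword table `SPECIALTY_KEYWORDS`, the medical example specialties, the `[SPECIALTY: ...]` tag format, and the function names are all hypothetical. The call to the fine-tuned model is left out.

```python
import re

# Hypothetical keyword table; the abstract does not specify how a
# domain specialty is identified for the input text.
SPECIALTY_KEYWORDS = {
    "cardiology": {"ecg", "arrhythmia", "stent"},
    "oncology": {"tumor", "chemotherapy", "biopsy"},
}


def identify_specialty(text: str, default: str = "general") -> str:
    """Pick the specialty whose keywords overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_hits = default, 0
    for specialty, keywords in SPECIALTY_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = specialty, hits
    return best


def build_instruction(task: str, text: str) -> str:
    """Insert a specialty identifier (assumed tag format) into the instructions."""
    specialty = identify_specialty(text)
    return f"[SPECIALTY: {specialty}]\nTask: {task}\nInput: {text}"


prompt = build_instruction(
    "summarize", "Patient had an abnormal ECG and a stent was placed."
)
```

The resulting `prompt` string would then be passed to whatever model performs the text analysis task; a trained classifier could replace the keyword lookup without changing `build_instruction`.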

    Multimodal based punctuation and/or casing prediction

    Publication No.: US11580965B1

    Publication date: 2023-02-14

    Application No.: US16938783

    Filing date: 2020-07-24

    Abstract: Techniques for predicting punctuation and casing using multimodal fusion are described. An exemplary method includes processing generated text by: tokenizing the generated text into sub-words, and generating a sequence of lexical features for the sub-words using a pre-trained lexical encoder; processing the audio by: generating a sequence of frame level acoustic embeddings using a pre-trained acoustic encoder on the audio, and generating task specific embeddings from the frame level acoustic embeddings; performing multimodal fusion of the sub-word level acoustic embeddings and the sequence of lexical features by: aligning the task specific embeddings to the sequence of lexical features, and combining the sequence of lexical features and aligned acoustic sequence; predicting punctuation and casing from the combined sequence of lexical features and aligned acoustic sequence; concatenating the sub-words of the text, and applying the predicted punctuation and casing; and outputting text having the predicted punctuation and casing.
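The alignment and combination steps of the fusion stage can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how acoustic embeddings are aligned to sub-words, so mean-pooling frames over per-sub-word frame spans is a swapped-in technique, and the `spans` input (as if supplied by some external aligner), the function names, and concatenation as the combining operation are all hypothetical. Plain Python lists stand in for encoder outputs.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def align_frames(
    frames: Sequence[Vector], spans: Sequence[Tuple[int, int]]
) -> List[Vector]:
    """Mean-pool frame vectors over each sub-word's [start, end) frame span.

    Assumption: spans come from an external aligner; the abstract does
    not describe the alignment mechanism.
    """
    dim = len(frames[0])
    aligned = []
    for start, end in spans:
        chunk = frames[start:end]
        if not chunk:
            # No frames for this sub-word: fall back to a zero vector.
            aligned.append([0.0] * dim)
        else:
            aligned.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return aligned


def fuse(lexical: Sequence[Vector], aligned: Sequence[Vector]) -> List[Vector]:
    """Combine per-sub-word lexical and acoustic vectors by concatenation."""
    if len(lexical) != len(aligned):
        raise ValueError("lexical and acoustic sequences must have equal length")
    return [list(lex) + list(ac) for lex, ac in zip(lexical, aligned)]


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three acoustic frames
spans = [(0, 2), (2, 3)]                       # two sub-words
lexical = [[0.1], [0.2]]                       # toy lexical features
fused = fuse(lexical, align_frames(frames, spans))
```

Each row of `fused` is one sub-word's combined feature vector, which a downstream classifier could use to predict punctuation and casing labels.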

    AUTOMATED EVALUATION OF EVIDENCE MAPPING MODELS

    Publication No.: US20250005063A1

    Publication date: 2025-01-02

    Application No.: US18344739

    Filing date: 2023-06-29

    Abstract: Pairs of text collections are obtained. An individual pair comprises (a) a source text collection which includes a first group of text sequences and (b) an annotated analysis result of the source text collection, comprising a second group of text sequences and a set of evidence mappings generated by an evidence mapping model. An evidence mapping indicates, for a particular text sequence of the second group, another text sequence of the first group which provides evidence for the particular text sequence. A quality metric of the model is obtained using an automated evaluation methodology in which a question is generated from the particular text sequence, and an analysis of a pair of answers (including an answer generated using an evidence mapping) to the question is performed. The quality metric is provided via a programmatic interface.
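One way to read the evaluation loop in this abstract is sketched below. It rests on several assumptions not stated in the source: the second answer of the pair is taken to be an answer derived from the analysis-result sequence itself, the metric is taken to be the fraction of mappings whose two answers agree, and the callables `make_question`, `answer`, and `agree` are hypothetical stand-ins for whatever models perform question generation, answering, and comparison.

```python
from typing import Callable, Sequence, Tuple


def evidence_quality(
    mappings: Sequence[Tuple[str, str]],
    make_question: Callable[[str], str],
    answer: Callable[[str, str], str],
    agree: Callable[[str, str], bool],
) -> float:
    """Fraction of (claim, evidence) mappings whose answer pair agrees.

    Each mapping pairs a sequence from the analysis result (claim) with
    the source sequence the model says supports it (evidence).
    """
    if not mappings:
        return 0.0
    hits = 0
    for claim, evidence in mappings:
        question = make_question(claim)
        from_claim = answer(question, claim)        # assumed reference answer
        from_evidence = answer(question, evidence)  # answer via the mapping
        if agree(from_claim, from_evidence):
            hits += 1
    return hits / len(mappings)


# Toy stand-ins so the sketch runs without any model.
score = evidence_quality(
    [("dose is 5mg", "prescribed 5mg"),
     ("follow-up in May", "follow-up in June")],
    make_question=lambda claim: f"What does this state: {claim}?",
    answer=lambda question, context: context.split()[-1],
    agree=lambda a, b: a == b,
)
```

With the toy stand-ins, the first mapping yields matching answers and the second does not, so `score` is 0.5. Passing the three steps in as callables keeps the metric independent of any particular model.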
