Extending sensitive data tagging without reannotating training data

    Publication No.: US11531846B1

    Publication date: 2022-12-20

    Application No.: US16587471

    Filing date: 2019-09-30

    Abstract: Techniques for extending sensitive data tagging without reannotating training data are described. A method for extending sensitive data tagging without reannotating training data may include hosting a plurality of models at a model endpoint in a machine learning service, each model trained to identify a different sensitive data type in a transcript of content, adding a new model to the model endpoint, the new model trained to identify a new sensitive data entity in the transcript of content, identifying sensitive entities in the transcript by each of the plurality of models and the new model, merging inference responses generated by each of the plurality of models and the new model using at least one inference policy, and returning a merged inference response identifying a plurality of sensitive entities in the transcript.
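
The merge step described in this abstract can be illustrated with a minimal sketch. It assumes one particular inference policy (on overlapping spans, keep the highest-scoring entity); the abstract does not say which policies are used. The names `Entity`, `regex_model`, `merge_highest_score`, and `tag_transcript` are hypothetical, and the regex "models" are toy stand-ins for trained models hosted at an endpoint.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Entity:
    start: int    # character offset where the entity begins
    end: int      # character offset just past the entity
    label: str    # sensitive data type, e.g. "PHONE"
    score: float  # model confidence


Model = Callable[[str], List[Entity]]


def regex_model(label: str, pattern: str, score: float) -> Model:
    """Toy stand-in for a trained model that detects one sensitive data type."""
    compiled = re.compile(pattern)

    def infer(transcript: str) -> List[Entity]:
        return [Entity(m.start(), m.end(), label, score)
                for m in compiled.finditer(transcript)]

    return infer


def overlaps(a: Entity, b: Entity) -> bool:
    return a.start < b.end and b.start < a.end


def merge_highest_score(responses: List[List[Entity]]) -> List[Entity]:
    """Assumed policy: when spans overlap, keep the highest-scoring entity."""
    candidates = sorted((e for r in responses for e in r),
                        key=lambda e: -e.score)
    kept: List[Entity] = []
    for entity in candidates:
        if not any(overlaps(entity, k) for k in kept):
            kept.append(entity)
    return sorted(kept, key=lambda e: e.start)


def tag_transcript(transcript: str, models: List[Model],
                   policy=merge_highest_score) -> List[Entity]:
    """Run every hosted model on the transcript and merge the responses."""
    return policy([model(transcript) for model in models])


# Existing models, each covering a different sensitive data type.
models = [
    regex_model("PHONE", r"\d{3}-\d{4}", 0.9),
    regex_model("EMAIL", r"[\w.]+@[\w.]+", 0.8),
]
# Adding a new entity type means appending a model; the existing
# models and their training data are left untouched.
models.append(regex_model("DIGITS", r"\d+", 0.5))

merged = tag_transcript("Call 555-0100 or mail a@b.com", models)
labels = [e.label for e in merged]
```

In this sketch the low-confidence `DIGITS` spans overlap the `PHONE` span, so the policy drops them and `labels` contains only `PHONE` and `EMAIL`.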

    LARGE LANGUAGE MODELS PROVIDING EVIDENCE MAPPINGS FOR GENERATED OUTPUT

    Publication No.: US20250005298A1

    Publication date: 2025-01-02

    Application No.: US18344742

    Filing date: 2023-06-29

    Abstract: Pairs of text collections are obtained. An individual pair comprises (a) a source text collection which includes a first group of text sequences and (b) an annotated analysis result of the source text collection, comprising a second group of text sequences and a set of evidence mappings generated by an evidence mapping model. An evidence mapping indicates, for a particular text sequence of the second group, another text sequence of the first group which provides evidence for the particular text sequence. A quality metric of the model is obtained using an automated evaluation methodology in which a question is generated from the particular text sequence, and an analysis of a pair of answers (including an answer generated using an evidence mapping) to the question is performed. The quality metric is provided via a programmatic interface.

    MEDICAL CONVERSATION SUMMARIZATION STYLE INTELLIGENCE

    Publication No.: US20240428002A1

    Publication date: 2024-12-26

    Application No.: US18339749

    Filing date: 2023-06-22

    Abstract: A medical audio summarization service receives a medical conversation and an indication of a user preferred summarization style selected from a plurality of available summarization styles to generate a medical summary that conforms to the user preferred summarization style. A transcript is generated via a medical audio transcription service, and the transcript is used by a natural language processing engine (including a large language model) to generate the medical summary. The large language model is trained to be used to generate medical summaries that conform to respective ones of a plurality of user preferred summarization styles. The large language model is trained using training data comprising previously generated summaries and summary interaction metadata generated from user edits and/or feedback.

    Multi-modal spoken language understanding systems

    Publication No.: US11562735B1

    Publication date: 2023-01-24

    Application No.: US16836130

    Filing date: 2020-03-31

    Abstract: A spoken language understanding (SLU) system may include an automatic speech recognizer (ASR), an audio feature extractor, an optional synchronizer and a language understanding module. The ASR may produce a first set of input data representing transcripts of utterances. The audio feature extractor may produce a second set of input data representing audio features of the utterances, in particular, non-transcript specific characteristics of the speaker in one or more portions of the utterances. The two sets of input data may be provided for the language understanding module to predict intents and slot labels for the utterances. The SLU system may use the optional synchronizer to align the two sets of input data before providing them to the language understanding module.
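
One way to picture the optional synchronizer is as a step that produces one audio feature vector per transcript token. The sketch below is only an illustration under stated assumptions: that the ASR reports a time span in milliseconds for each token, that audio features arrive as fixed-hop frames, and that averaging the frames inside a token's span is an acceptable alignment. None of these details come from the abstract, and `align_features` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def align_features(token_spans_ms: Sequence[Tuple[int, int]],
                   frames: Sequence[Sequence[float]],
                   hop_ms: int) -> List[List[float]]:
    """Return one averaged audio feature vector per transcript token.

    token_spans_ms: (start, end) time of each token in milliseconds (assumed).
    frames: audio feature vectors, one every hop_ms milliseconds (assumed).
    """
    dim = len(frames[0]) if frames else 0
    aligned: List[List[float]] = []
    for start_ms, end_ms in token_spans_ms:
        first = start_ms // hop_ms
        last = max(first + 1, -(-end_ms // hop_ms))  # ceiling division
        window = frames[first:last]
        if not window:
            # Token lies beyond the available audio frames.
            aligned.append([0.0] * dim)
            continue
        aligned.append([sum(col) / len(window) for col in zip(*window)])
    return aligned


# Four 1-dimensional frames at a 10 ms hop, and three token spans.
frames = [[1.0], [3.0], [5.0], [7.0]]
spans = [(0, 20), (20, 40), (40, 50)]
aligned = align_features(spans, frames, hop_ms=10)
```

Here the first token averages frames 0 and 1, the second averages frames 2 and 3, and the third falls past the end of the audio and receives a zero vector. The aligned vectors could then be paired token by token with the transcript before both are passed to the language understanding module.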
