-
Publication No.: US20240184997A1
Publication date: 2024-06-06
Application No.: US18278364
Filing date: 2022-03-08
Applicant: Microsoft Technology Licensing, LLC
Inventor: Linjun SHOU , Ming GONG , Xuanyu BAI , Xuguang WANG , Daxin JIANG
IPC: G06F40/44 , G06F40/284
CPC classification number: G06F40/44 , G06F40/284
Abstract: The present disclosure proposes a method and apparatus for multi-model joint denoising training. Multiple models may be obtained. A set of training samples may be denoised through the multiple models. The multiple models may be trained with the set of denoised training samples.
-
Publication No.: US20240184912A1
Publication date: 2024-06-06
Application No.: US18060921
Filing date: 2022-12-01
Applicant: PayPal, Inc.
Inventor: Yanfei Dong , Yuan Deng , Soujanya Poria
IPC: G06F21/62 , G06F40/284 , G06F40/295
CPC classification number: G06F21/6245 , G06F40/284 , G06F40/295
Abstract: Techniques are disclosed relating to text sanitization. Given textual data, a computer system identifies tokens predicted to constitute sensitive information. Multi-field data structures (e.g., triplets) are generated for the identified tokens that include questions, answers, and corresponding context. These data structures are supplied to a pre-trained multiple-choice question (MCQ) reading comprehension model. The model outputs, for each data structure, a probability that the question and answer for a given data structure, provided the context, is accurate. A post-processing module can then rank probabilities in this set of probabilities and select the multi-field data structure with the highest probability (in some cases, a programmable threshold must also be met). The selected multi-field data structure is then used to select category information to be used in sanitizing the textual data. In this manner, a piece of sensitive data may be replaced by a label that helps retain interpretability of the sanitized text.
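The post-processing step described in this abstract (rank the model's probabilities, keep the best candidate, optionally require a threshold, then substitute a category label) can be sketched roughly as follows. This is only an illustration: the `Candidate` class, the function names, the bracketed label format, and the 0.5 default threshold are assumptions made here, and the probabilities are taken as given rather than produced by an actual MCQ model.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical question/answer/context triplet with a model score."""
    question: str
    answer: str         # candidate category label, e.g. "PERSON"
    context: str
    probability: float  # assumed to come from an MCQ reading comprehension model


def select_category(candidates: Sequence[Candidate],
                    threshold: float = 0.5) -> Optional[str]:
    """Return the answer of the highest-probability candidate, or None
    if there are no candidates or the best score is below the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.probability)
    return best.answer if best.probability >= threshold else None


def sanitize(text: str, token: str, candidates: Sequence[Candidate],
             threshold: float = 0.5) -> str:
    """Replace the sensitive token with a bracketed category label.
    The text is returned unchanged when no candidate meets the threshold."""
    label = select_category(candidates, threshold)
    if label is None:
        return text
    return text.replace(token, f"[{label}]")
```

Replacing the token with a category label rather than deleting it is what keeps the sanitized text readable, which is the benefit the abstract claims.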
-
73.
Publication No.: US20240171609A1
Publication date: 2024-05-23
Application No.: US17991756
Filing date: 2022-11-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Mesfin Adane DEMA , Jonathan Ray ARMER , Yafet Kebede TAMENE , Michael David CYR , Eliezer Ali CABRERA MARIN
IPC: H04L9/40 , G06F40/205 , G06F40/284 , G06V30/18
CPC classification number: H04L63/1483 , G06F40/205 , G06F40/284 , G06V30/18
Abstract: Techniques are described herein that are capable of generating a content signature of a textual communication using OCR and text processing. The textual communication is rendered. Text is extracted from the rendered textual communication using OCR. Customization is removed from the text to provide a templatized version of the rendered textual communication that includes de-customized text. The de-customized text is parsed into tokens. Each token includes a respective subset of a plurality of characters. The tokens are converted into respective numbers. Each number is processed using fuzzy hash functions to provide respective hash values associated with the respective token. Representative hash values are selected for the respective fuzzy hash functions by selecting each representative hash value from the hash values that are processed using the respective fuzzy hash function. A content signature of the textual communication is generated by bitwise concatenating at least portions of the respective representative hash values.
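The token-to-signature portion of this abstract can be sketched roughly as below. The sketch skips rendering, OCR, and de-customization entirely and starts from plain text. Several choices are assumptions made here and are not taken from the filing: CRC32 as the token-to-number conversion, simple linear hash functions standing in for the fuzzy hash functions, the minimum as the representative value (a MinHash-style choice), and the low 8 bits of each representative as the concatenated portion.

```python
import zlib

# Assumed parameters for four stand-in hash functions h(x) = (a*x + b) mod PRIME.
PRIME = (1 << 61) - 1
HASH_PARAMS = [(3, 7), (11, 13), (17, 19), (23, 29)]
BITS_PER_HASH = 8  # portion of each representative hash kept in the signature


def token_to_number(token: str) -> int:
    """Convert a token to a number (CRC32 is an assumption of this sketch)."""
    return zlib.crc32(token.encode("utf-8"))


def content_signature(text: str) -> int:
    """Build a signature by bitwise concatenating the low bits of one
    representative hash value per hash function."""
    numbers = [token_to_number(t) for t in text.lower().split()]
    if not numbers:
        return 0
    mask = (1 << BITS_PER_HASH) - 1
    signature = 0
    for a, b in HASH_PARAMS:
        # Representative value: the minimum hash over all tokens (assumed).
        representative = min((a * n + b) % PRIME for n in numbers)
        signature = (signature << BITS_PER_HASH) | (representative & mask)
    return signature
```

Because each representative is a minimum over the set of tokens, reordering the tokens leaves the signature unchanged, and texts sharing most of their tokens tend to agree on many of the concatenated portions.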
-
74.
Publication No.: US20240169152A1
Publication date: 2024-05-23
Application No.: US17993048
Filing date: 2022-11-23
Applicant: Bank of America Corporation
Inventor: Ramakrishna R. Yannam , Emad Noorizadeh , Rajan Jhaveri , Jennifer Russell
IPC: G06F40/284 , G06F3/01 , G06F40/40
CPC classification number: G06F40/284 , G06F3/017 , G06F40/40
Abstract: Apparatus, methods and systems for contextual prediction processing are provided. Methods may include receiving a conversation from an entity. The conversation may include a current utterance, previous utterances and details. Methods may include using an action-topic ontology to build, using data retrieved from the current utterance, a conversation frame that corresponds to the current utterance. Methods may include merging the conversation frame with data, retrieved from the previous utterances and the details, to generate a target conversation frame. Methods may include validating the target conversation frame to prevent looping over historic data in the event that the current utterance fails to add relevant information. Methods may include generating an enhanced contextual utterance based on algorithms and the target conversation frame. The enhanced contextual utterance may be used to understand the current utterance in a context of the conversation. Methods may include returning the enhanced contextual utterance to the entity.
-
75.
Publication No.: US20240168983A1
Publication date: 2024-05-23
Application No.: US17992318
Filing date: 2022-11-22
Applicant: UST GLOBAL (SINGAPORE) PTE. LTD.
Inventor: Geethi Nair , Alla Abdella , Adnan Masood
IPC: G06F16/34 , G06F16/31 , G06F40/284
CPC classification number: G06F16/345 , G06F16/31 , G06F40/284
Abstract: A domain-agnostic answering system configured to: (a) receive a question and one or more documents; (b) generate summary representations of the one or more documents, each summary representation including a summary having one or more sentences and a score vector; (d) determine that a first summary representation of the summary representations is a winning candidate for extracting an answer to the question; (e) match the first summary representation to a first document in the one or more documents to obtain reference indexes of sentences in the first summary representation in portions of the first document; (f) determine a start logit vector and an end logit vector from the question and the matched first summary representation; and (g) generate a start span and an end span from the start logit vector, the end logit vector, and the score vector associated with the first summary representation, the start span and the end span representing the answer to the question.
-
Publication No.: US11989514B2
Publication date: 2024-05-21
Application No.: US17698018
Filing date: 2022-03-18
Applicant: Capital One Services, LLC
Inventor: Aysu Ezen Can , Zachary S. Brown , Chris Symons
IPC: G06F40/30 , G06F40/166 , G06F40/284 , G10L15/06 , G10L15/16 , G10L15/22 , H04M3/42 , H04M3/51
CPC classification number: G06F40/284 , G06F40/166 , G10L15/063 , G10L15/16 , G10L15/22 , H04M3/42221 , H04M3/5175 , G06F40/30 , H04M2201/40
Abstract: Disclosed herein are system, method, and computer program product embodiments for machine learning systems to process incoming call-center calls to provide communication summaries that capture effort levels of statements made during interactive communications. For a given call, the system receives a transcript as the input and generates a textual summary as the output. In order to improve a call summary and customize a summarization task to a call center domain, the technology disclosed herein may employ a classifier that predicts an effort level and attention score for individual utterances within a call transcript, ranks the attention scores and uses selected ones of the ranked utterances in the summary.
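The ranking-and-selection step in this abstract can be sketched as follows. The attention scores are taken as given; the classifier that would produce them, and the effort-level prediction, are not modeled. The function name, the `(utterance, score)` input shape, and the choice to keep the top `k` utterances in their original transcript order are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple


def select_summary_utterances(scored: Sequence[Tuple[str, float]],
                              k: int = 2) -> List[str]:
    """Rank utterances by attention score, keep the top k, and return
    them in the order they appeared in the transcript."""
    # Indices sorted from highest to lowest attention score.
    ranked = sorted(range(len(scored)),
                    key=lambda i: scored[i][1], reverse=True)
    # Restore transcript order for the selected utterances.
    keep = sorted(ranked[:k])
    return [scored[i][0] for i in keep]
```

Returning the selected utterances in transcript order keeps the summary chronologically readable even though selection is driven by score.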
-
Publication No.: US20240152962A1
Publication date: 2024-05-09
Application No.: US18416342
Filing date: 2024-01-18
Applicant: Lucas J. Myslinski
Inventor: Lucas J. Myslinski
IPC: G06Q30/0251 , G06F16/23 , G06F16/242 , G06F16/33 , G06F16/335 , G06F16/34 , G06F16/9032 , G06F16/9535 , G06F40/226 , G06F40/237 , G06F40/284 , G06F40/30 , G06F40/58
CPC classification number: G06Q30/0255 , G06F16/2365 , G06F16/244 , G06F16/3331 , G06F16/3334 , G06F16/335 , G06F16/345 , G06F16/90332 , G06F16/9535 , G06F40/226 , G06F40/237 , G06F40/284 , G06F40/30 , G06F40/58
Abstract: An optimized fact checking system analyzes and determines the factual accuracy of information and/or characterizes the information by comparing the information with source information. The optimized fact checking system automatically monitors information, processes the information, fact checks the information in an optimized manner and/or provides a status of the information. In some embodiments, the optimized fact checking system generates, aggregates, and/or summarizes content.
-
Publication No.: US20240143936A1
Publication date: 2024-05-02
Application No.: US17978074
Filing date: 2022-10-31
Applicant: Zoom Video Communications, Inc.
Inventor: Davide Giovanardi , Stephen Muchovej
IPC: G06F40/35 , G06F40/284
CPC classification number: G06F40/35 , G06F40/284
Abstract: Methods and systems provide for extracting next step sentences from a communication session. In one embodiment, the system defines a set of annotation guidelines for labeling training data; receives a set of labeled training data including sentences from a transcript of a communication session, a subset of the sentences being associated with a positive label; organizes the labeled training data and trains a model with the labeled training data, the training including, for each of the sentences, inputting the sentence into a language model and a classification head to output a number of class probabilities, and inputting a classification token representing the sentence into a classification head; uses a number of classifiers from the trained model to generate ensemble class scores; and uses the ensemble class scores to predict one or more next step sentences from the sentences in the transcript.
-
79.
Publication No.: US20240143922A1
Publication date: 2024-05-02
Application No.: US17986782
Filing date: 2022-11-14
Applicant: NATIONAL CHENG KUNG UNIVERSITY
Inventor: Wen-Hsiang LU , Chia-Ming TUNG , Ding-Jhe LIOU
IPC: G06F40/284 , G06F40/117 , G06F40/253 , G06F40/35 , G06N5/02
CPC classification number: G06F40/284 , G06F40/117 , G06F40/253 , G06F40/35 , G06N5/022
Abstract: A method of generating a knowledge graph, performed by a processing device, includes: obtaining a knowledge document; performing word segmentation and part-of-speech tagging on the knowledge document to generate a number of tagged words; obtaining a number of sentences from the tagged words according to a default sentence pattern, wherein each of the sentences includes a subject, an adverb, a verb and an object, and the adverb corresponds to an adverb type; for each of the sentences, using the subject as a first entity of a triple, using the object as a second entity of the triple, and using the adverb type and the verb as a relation in the triple; and forming a knowledge graph using the triple corresponding to each of the sentences.
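The triple construction described in this abstract can be illustrated with a small sketch. It assumes the sentences have already been segmented, tagged, and matched to the subject-adverb-verb-object pattern. The `Sentence` type, the `ADVERB_TYPES` lookup table, the "unknown" fallback, and the adjacency-list graph layout are all hypothetical choices made here.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

Relation = Tuple[str, str]  # (adverb type, verb)


class Sentence(NamedTuple):
    """A sentence already matched to the subject/adverb/verb/object pattern."""
    subject: str
    adverb: str
    verb: str
    obj: str


# Hypothetical mapping from adverbs to adverb types.
ADVERB_TYPES = {"often": "frequency", "rarely": "frequency", "not": "negation"}


def to_triple(s: Sentence) -> Tuple[str, Relation, str]:
    """Subject is the first entity, object is the second entity, and the
    adverb type together with the verb forms the relation."""
    adverb_type = ADVERB_TYPES.get(s.adverb, "unknown")
    return (s.subject, (adverb_type, s.verb), s.obj)


def build_graph(sentences: Iterable[Sentence]) -> Dict[str, List[Tuple[Relation, str]]]:
    """Collect one triple per sentence into an adjacency-list graph."""
    graph: Dict[str, List[Tuple[Relation, str]]] = {}
    for s in sentences:
        head, relation, tail = to_triple(s)
        graph.setdefault(head, []).append((relation, tail))
    return graph
```

For example, `Sentence("aspirin", "often", "relieves", "headache")` yields the triple `("aspirin", ("frequency", "relieves"), "headache")`.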
-
Publication No.: US20240143583A1
Publication date: 2024-05-02
Application No.: US18486108
Filing date: 2023-10-12
Applicant: Sumitomo Pharma Co., Ltd.
Inventor: Alan Jeffrey Menaged , Elliott Rain Morelli , Daniel Jaebin Park , Jonathan William Price , Daniel Benjamin Rand , Timothy Cao Tran
IPC: G06F16/242 , G06F16/2457 , G06F16/248 , G06F40/232 , G06F40/284 , G06F40/295 , H04L51/02
CPC classification number: G06F16/243 , G06F16/2457 , G06F16/248 , G06F40/232 , G06F40/284 , G06F40/295 , H04L51/02
Abstract: A system comprising a client device and a server network, the client device configured to receive a query; transmit the query; receive a populated response; and display the populated response. The server network may be configured to receive the query; tokenize the query; generate an intent-match likelihood for each of a plurality of supported question intents; classify the query based on the intent-match likelihoods; evaluate each of the intent-match likelihoods against a confidence threshold; extract, via a Named Entity Recognition model, one or more entities from the query; determine one or more entity values for each of the one or more entities; query the at least one database to incorporate intent to the one or more entities and the one or more entity values; populate a response template with the one or more entity values and one or more specifics; and store the query and the populated response.
-