Patent search ap:("Oracle International Corporation") AND inv:"Mark Edward Johnson" Page 10

91.

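Granted invention patent
Batching techniques for handling unbalanced training data for a chatbot (in force)

Publication (announcement) number: US12236321B2

Publication (announcement) date: 2025-02-25

Application number: US17217623

Application date: 2021-03-30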

Applicant: Oracle International Corporation

Inventor: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Balakota Srinivas Vinnakota, Yu-Heng Hong, Elias Luqman Jalaluddin

IPC: G06N20/00, G06F16/906, G06F18/22, G06F18/2413, G06F40/30, G10L15/06, G10L15/18, G10L15/197, G10L15/22

Abstract: The present disclosure relates to chatbot systems and, more particularly, to batching techniques for handling unbalanced training data when training a model, such that bias is removed from the trained machine learning model when performing inference. In an embodiment, a plurality of raw utterances is obtained. A bias-reducing distribution is determined, and a subset of the plurality of raw utterances is batched according to the bias-reducing distribution. The resulting unbiased training data may be input into a prediction model for training the prediction model. The trained prediction model may be obtained and utilized to predict unbiased results from new inputs received by the trained prediction model.
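
To make the batching idea concrete, here is a minimal sketch in Python. It is not the patented method: it assumes the bias-reducing distribution is simply uniform over classes, and the function name `balanced_batches` and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def balanced_batches(utterances, labels, batch_size, num_batches, seed=0):
    """Build batches whose class mix follows a uniform target distribution.

    Assumption: "bias-reducing distribution" is taken to mean uniform over
    classes. The abstract does not specify the actual distribution.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(utterances, labels):
        by_class[label].append(text)
    classes = sorted(by_class)
    batches = []
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            label = rng.choice(classes)  # uniform over classes, not over examples
            batch.append((rng.choice(by_class[label]), label))
        batches.append(batch)
    return batches
```

Sampling the class first and the example second means a class with 10 utterances appears about as often as a class with 90, which is one simple way to keep the majority class from dominating training.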

92.

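Granted invention patent
Distance-based logit value for natural language processing (in force)

Publication (announcement) number: US12210842B2

Publication (announcement) date: 2025-01-28

Application number: US18545621

Application date: 2023-12-19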

Applicant: Oracle International Corporation

Inventor: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

IPC: G10L15/16, G06F40/35, G06N20/00, H04L51/02, G06F40/205, G06F40/253

Abstract: Techniques are disclosed for using logit values to classify utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.
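
The abstract does not define the modified logit function or the enhanced activation function. As an illustration only, the sketch below assumes each binary classifier scores an utterance embedding by a bias minus its Euclidean distance to a class centroid, and it substitutes a plain sigmoid for the activation. The names `distance_logits`, `classify`, and the `"unresolved"` fallback label are hypothetical.

```python
import math

def distance_logits(embedding, centroids, bias=1.0):
    """One logit per class: bias minus Euclidean distance to the class centroid.

    Assumption: this is one possible distance-based logit, not the patent's.
    """
    return {name: bias - math.dist(embedding, c) for name, c in centroids.items()}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(embedding, centroids, threshold=0.5):
    """Score each class independently and pick the best one above the threshold."""
    scores = {k: sigmoid(v) for k, v in distance_logits(embedding, centroids).items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else "unresolved"), scores
```

Because the logit falls as distance grows, an utterance far from every centroid scores low for all classes, so it can be rejected instead of being forced into the nearest class.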

93.

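Granted invention patent
Keyword data augmentation tool for natural language processing (in force)

Publication (announcement) number: US12153881B2

Publication (announcement) date: 2024-11-26

Application number: US17452742

Application date: 2021-10-28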

Applicant: Oracle International Corporation

Inventor: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov

IPC: G06F40/279, G06F40/35, G06N20/00, H04L51/02, G06F40/205, G06F40/284, G06F40/289

Abstract: Techniques are disclosed for keyword data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: identifying keywords within utterances of the training set of utterances, generating a set of OOD examples with the identified keywords, filtering out OOD examples from the set of OOD examples that have a context substantially similar to the context of the utterances of the training set of utterances, and incorporating the set of OOD examples without the filtered OOD examples into the training set of utterances to generate an augmented training set of utterances. Thereafter, the machine-learning model is trained using the augmented training set of utterances.
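
The four augmentation steps in this abstract can be sketched roughly as follows. This is a toy under stated assumptions, not the patented tool: keywords are taken to be non-stopword tokens, the OOD candidates come from a caller-supplied pool, and "substantially similar context" is approximated by Jaccard overlap of token sets. All function names, the stopword list, and the `max_sim` threshold are hypothetical.

```python
STOPWORDS = {"a", "an", "the", "my", "i", "to", "is", "what",
             "of", "in", "on", "for", "me", "please"}

def tokens(text):
    return [w.strip(".,?!").lower() for w in text.split()]

def keywords(utterances):
    """Step 1: treat every non-stopword token as a keyword (an assumption)."""
    return {w for u in utterances for w in tokens(u) if w and w not in STOPWORDS}

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def augment_with_ood(train, candidates, max_sim=0.5):
    texts = [t for t, _ in train]
    kws = keywords(texts)
    # Step 2: keep candidate sentences that contain at least one keyword.
    ood = [c for c in candidates if kws & set(tokens(c))]
    # Step 3: drop candidates whose wording is too close to a training utterance.
    kept = [c for c in ood
            if all(jaccard(tokens(c), tokens(t)) < max_sim for t in texts)]
    # Step 4: add the surviving candidates to the training set with an OOD label.
    return train + [(c, "OOD") for c in kept]
```

The point of the filter is that an OOD example should share a keyword with in-domain data but not its meaning; a near-duplicate of a training utterance labeled OOD would teach the model the wrong boundary.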

94.

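Granted invention patent
Multi-factor modelling for natural language processing (in force)

Publication (announcement) number: US12099816B2

Publication (announcement) date: 2024-09-24

Application number: US17578170

Application date: 2022-01-18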

Applicant: Oracle International Corporation

Inventor: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Ying Xu

IPC: G06F40/56, G06F40/295, G06F40/35, H04L51/02

CPC classification number: G06F40/56, G06F40/295, G06F40/35, H04L51/02

Abstract: Techniques are disclosed for multi-factor modelling for training and utilizing chatbot systems for natural language processing. In an embodiment, a method includes receiving a set of utterance data corresponding to a natural language-based query; determining one or more intents for the chatbot, each corresponding to a possible context for the natural language-based query and associated with a skill for the chatbot; generating one or more intent classification datasets, each intent classification dataset associated with a probability that the natural language-based query corresponds to an intent of the one or more intents; generating one or more transformed datasets, each corresponding to a skill of one or more skills; determining a first skill of the one or more skills based on the one or more transformed datasets; and processing, based on the determined first skill, the set of utterance data to resolve the natural language-based query.

95.

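Published invention application
AUTOMATING LARGE-SCALE DATA COLLECTION (under examination, published)

Publication (announcement) number: US20240169161A1

Publication (announcement) date: 2024-05-23

Application number: US18452803

Application date: 2023-08-21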

Applicant: Oracle International Corporation

Inventor: Paria Jamshid Lou, Gioacchino Tangari, Jason Black, Bhagya Gayathri Hettige, Xu Zhong, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson

IPC: G06F40/40, G06F40/284, G06F40/289, G10L15/06

CPC classification number: G06F40/40, G06F40/284, G06F40/289, G10L15/063

Abstract: Techniques are provided for obtaining collections of sentences in different languages that are usable for training models in various applications of artificial intelligence. A method is provided that obtains, from a text corpus, webpages in a plurality of languages, each of the webpages corresponding to a URL; obtains annotations for each of the webpages based on its URL, to obtain annotated data entries corresponding to the webpages, each of the annotated data entries including a classification label corresponding to a sub-topic of one of a plurality of topics, where each of the plurality of topics includes a corresponding plurality of sub-topics; filters the annotated data entries to obtain topic-specific content in a target language based on the classification labels, the topic-specific content corresponding to one or more sub-topics; performs post-processing on the topic-specific content to obtain result data; and outputs the result data for the topic.

96.

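Published invention application
USING A GENERATIVE ADVERSARIAL NETWORK TO TRAIN A SEMANTIC PARSER OF A DIALOG SYSTEM (under examination, published)

Publication (announcement) number: US20240144923A1

Publication (announcement) date: 2024-05-02

Application number: US18410229

Application date: 2024-01-11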

Applicant: Oracle International Corporation

Inventor: Thanh Long Duong, Mark Edward Johnson

IPC: G10L15/18, G06F40/226, G10L15/16, G10L15/22

CPC classification number: G10L15/1815, G06F40/226, G10L15/16, G10L15/22, G10L15/26

Abstract: Disclosed herein are techniques for using a generative adversarial network (GAN) to train a semantic parser of a dialog system. A method described herein involves accessing seed data that includes seed tuples. Each seed tuple includes a respective seed utterance and a respective seed logical form corresponding to the respective seed utterance. The method further includes training a semantic parser and a discriminator in a GAN. The semantic parser learns to map utterances to logical forms based on output from the discriminator, and the discriminator learns to recognize authentic logical forms based on output from the semantic parser. The semantic parser may then be integrated into a dialog system.

97.

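Published invention application
OUTPUT INTERPRETATION FOR A MEANING REPRESENTATION LANGUAGE SYSTEM (under examination, published)

Publication (announcement) number: US20240134850A1

Publication (announcement) date: 2024-04-25

Application number: US18321144

Application date: 2023-05-21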

Applicant: Oracle International Corporation

Inventor: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent

IPC: G06F16/2452, G06F40/211, G06F40/30

CPC classification number: G06F16/24522, G06F40/211, G06F40/30

Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Representation Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
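
A toy version of the translate-and-combine step might look like the sketch below. The operation names, the template wording in `GRAMMAR`, and the `(operation, attribute)` statement shape are all assumptions made for illustration; the abstract does not describe the actual MRL syntax or grammar rules.

```python
# Hypothetical grammar: one natural language template per logical form operation.
GRAMMAR = {
    "select": "show {attr}",
    "from": "from {attr}",
    "filter": "where {attr}",
    "order": "sorted by {attr}",
}

def interpret(statements):
    """Translate each (operation, attribute) statement, then join into one sentence."""
    parts = [GRAMMAR[op].format(attr=attr) for op, attr in statements]
    sentence = " ".join(parts)
    return sentence[0].upper() + sentence[1:] + "."
```

For example, the statements `("select", "employee names")`, `("from", "the employees table")`, and `("filter", "salary is above 5000")` would be rendered as a single sentence that a user can read back to confirm the query matches what they asked for.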

98.

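Published invention application
CONTINUOUS HYPER-PARAMETER TUNING WITH AUTOMATIC DOMAIN WEIGHT ADJUSTMENT BASED ON PERIODIC PERFORMANCE CHECKPOINTS (under examination, published)

Publication (announcement) number: US20240086767A1

Publication (announcement) date: 2024-03-14

Application number: US18295018

Application date: 2023-04-03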

Applicant: Oracle International Corporation

Inventor: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Umanga Bista

IPC: G06N20/00

CPC classification number: G06N20/00

Abstract: Techniques are disclosed herein for continuous hyperparameter tuning with automatic domain weight adjustment based on periodic performance checkpoints. In one aspect, a method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that is defined at least in part on a plurality of domains of a search space that is associated with the machine learning algorithm. For each trial of a hyperparameter tuning process, the method includes running the machine learning algorithm in different domains using the set of hyperparameter values, periodically checking a performance of the machine learning algorithm in the different domains based on the hyperparameter objective function, and continuing hyperparameter tuning with a new set of hyperparameter values after automatically adjusting the domain weights according to a regression status of the different domains. Once the machine learning algorithm has reached convergence, at least one machine learning model is output.
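
One way to picture the domain weight adjustment is the sketch below. It assumes the objective is a weighted sum of per-domain scores and that a domain counts as regressed when its score at the current checkpoint is lower than at the previous one; the `boost` factor and both function names are hypothetical and are not taken from the application.

```python
def weighted_objective(scores, weights):
    """Combine per-domain scores into a single tuning objective."""
    return sum(weights[d] * scores[d] for d in scores)

def adjust_weights(weights, previous, current, boost=1.5):
    """Raise the weight of any domain whose score regressed, then renormalize."""
    raw = {d: w * (boost if current[d] < previous[d] else 1.0)
           for d, w in weights.items()}
    total = sum(raw.values())
    return {d: v / total for d, v in raw.items()}
```

With equal starting weights, a domain that drops between checkpoints ends up with a larger share of the objective, so the next set of hyperparameter values is pushed toward recovering it.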

99.

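Granted invention patent
Reduced training intent recognition techniques (in force)

Publication (announcement) number: US11914962B2

Publication (announcement) date: 2024-02-27

Application number: US16942535

Application date: 2020-07-29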

Applicant: Oracle International Corporation

Inventor: Mark Edward Johnson

IPC: G06F40/30, G06N3/04, G10L15/06, G10L15/22, G06F18/214, G06F18/2413

CPC classification number: G06F40/30, G06F18/214, G06F18/24147, G06N3/04, G10L15/063, G10L15/22, G10L2015/226

Abstract: The present disclosure relates generally to determining intent based upon speech input using a dialog system. More particularly, techniques are described that use matching-based machine learning to identify an intent corresponding to speech input in a dialog system. These procedures do not require training when intents are added to or removed from the set of possible intents.
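
The practical appeal of a matching-based approach is that the intent set is just data. The sketch below illustrates that property with a nearest-example matcher over bag-of-words cosine similarity; the similarity measure and the class `MatchingIntentRecognizer` are stand-ins chosen for brevity, not the matching model described in the patent.

```python
import math
from collections import Counter

def vec(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class MatchingIntentRecognizer:
    """Predict the intent of the stored example most similar to the input."""

    def __init__(self):
        self.examples = []  # list of (word-count vector, intent name)

    def add_intent(self, intent, utterances):
        # Adding an intent only stores examples; nothing is retrained.
        self.examples += [(vec(u), intent) for u in utterances]

    def remove_intent(self, intent):
        self.examples = [(v, i) for v, i in self.examples if i != intent]

    def predict(self, utterance):
        query = vec(utterance)
        return max(self.examples, key=lambda e: cosine(query, e[0]))[1]
```

Calling `add_intent` or `remove_intent` changes what `predict` can return immediately, which is the behaviour the abstract highlights.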

100.

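Published invention application
CALIBRATING CONFIDENCE SCORES OF A MACHINE LEARNING MODEL TRAINED AS A NATURAL LANGUAGE INTERFACE (under examination, published)

Publication (announcement) number: US20240062021A1

Publication (announcement) date: 2024-02-22

Application number: US18107624

Application date: 2023-02-09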

Applicant: Oracle International Corporation

Inventor: Gioacchino Tangari, Cong Duy Vu Hoang, Mark Edward Johnson, Poorya Zaremoodi, Nitika Mathur, Aashna Devang Kanuga, Thanh Long Duong

IPC: G06F40/58, G06F40/253

CPC classification number: G06F40/58, G06F40/253

Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
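
As a rough illustration of the calibration step, the sketch below fits a small logistic regression that maps raw beam scores to the probability that the output logical form is correct, and uses that probability as the confidence score. The choice of logistic regression, the training loop, and the names `fit_calibrator` and `confidence` are assumptions; the abstract says only that a calibration model classifies the output as correct or incorrect.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_calibrator(beam_scores, correct, lr=0.5, epochs=500):
    """Fit logistic regression by stochastic gradient descent.

    beam_scores: one list of raw beam scores per example (same length each).
    correct: 1 if the decoded logical form was correct, else 0.
    """
    n = len(beam_scores[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(beam_scores, correct):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def confidence(x, model):
    """Calibrated confidence: predicted probability that the output is correct."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

The calibrator is trained on held-out examples where correctness is known, so its output reflects how often similar beam scores actually led to correct logical forms, which the raw scores alone may not.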
