Patent search ap:("Oracle International Corporation") AND inv:"Cong Duy Vu Hoang" Page 2

11.

Granted invention patent
Enhanced logits for natural language processing

Legal status: Active

Publication (announcement) number: US11972220B2

Publication (announcement) date: 2024-04-30

Application number: US17456687

Application date: 2021-11-29

Applicant: Oracle International Corporation

Inventor: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

IPC: G06F40/35, G06F40/205, G06F40/253, G06N3/08, H04L51/02

CPC classification number: G06F40/35, G06N3/08, H04L51/02, G06F40/205, G06F40/253

Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for an unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.

12.

Published invention application
TECHNIQUES FOR AUGMENTING TRAINING DATA FOR AGGREGATION AND SORTING DATABASE OPERATIONS IN A NATURAL LANGUAGE TO DATABASE QUERY SYSTEM

Legal status: Under examination (published)

Publication (announcement) number: US20240061833A1

Publication (announcement) date: 2024-02-22

Application number: US18218385

Application date: 2023-07-05

Applicant: Oracle International Corporation

Inventor: Gioacchino Tangari, Nitika Mathur, Philip Arthur, Cong Duy Vu Hoang, Aashna Devang Kanuga, Steve Wai-Chun Siu, Syed Najam Abbas Zaidi, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson

IPC: G06F16/2452, G06F16/242, G06F40/247, G06F40/284

CPC classification number: G06F16/24522, G06F16/243, G06F40/247, G06F40/284

Abstract: Techniques are disclosed for augmenting training data for training a machine learning model to generate database queries. Training data comprising a first training example comprising a first natural language utterance, a logical form for the first natural language utterance, and associated first metadata is obtained. From the first training example, a template utterance is generated. A second natural language utterance is generated by filling slots in the template utterance based on a database schema and database values. Updated metadata is produced based on the first metadata and the second natural language utterance. A second training example is generated, comprising the second natural language utterance, the logical form for the first natural language utterance, and the updated metadata. The training data is augmented by adding the second training example. A machine learning model is trained to generate a database query comprising the database operation using the augmented training data set.
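
The augmentation flow in this abstract (template utterance, slot filling from schema values, updated metadata, unchanged logical form) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the dictionary layout of a training example, and the `@column` placeholder in the logical form are all hypothetical, and the sketch assumes the metadata simply maps slot names to the values that appear in the utterance.

```python
import random

def make_template(utterance, slot_values):
    """Replace each known value in the utterance with a named slot."""
    template = utterance
    for slot, value in slot_values.items():
        template = template.replace(value, "{" + slot + "}")
    return template

def augment(example, schema_values, rng):
    """Build a second training example from the first one.

    Assumption: example["metadata"] maps slot names to the values used
    in the utterance, and schema_values maps slot names to candidates
    drawn from the database schema and database values.
    """
    template = make_template(example["utterance"], example["metadata"])
    new_values = {slot: rng.choice(schema_values[slot])
                  for slot in example["metadata"]}
    return {
        "utterance": template.format(**new_values),
        # The logical form of the first example is reused unchanged.
        "logical_form": example["logical_form"],
        # Updated metadata records the newly filled slot values.
        "metadata": new_values,
    }

first = {
    "utterance": "sort employees by salary",
    "logical_form": "ORDER_BY(@column)",  # hypothetical placeholder form
    "metadata": {"column": "salary"},
}
second = augment(first, {"column": ["salary", "age"]}, random.Random(0))
training_data = [first, second]
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which helps when comparing training runs.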

13.

Published invention application
LEXICAL DROPOUT FOR NATURAL LANGUAGE PROCESSING

Legal status: Under examination (published)

Publication (announcement) number: US20230206125A1

Publication (announcement) date: 2023-06-29

Application number: US18087647

Application date: 2022-12-22

Applicant: Oracle International Corporation

Inventor: Tuyen Quang Pham, Cong Duy Vu Hoang, Thanh Tien Vu, Mark Edward Johnson, Thanh Long Duong

IPC: G06N20/00, G06F40/35, G06F40/284, G06F40/295, G06F40/253

CPC classification number: G06N20/00, G06F40/35, G06F40/284, G06F40/295, G06F40/253, G06F40/205

Abstract: Techniques are provided for improved training of a machine learning model using lexical dropout. A machine learning model and a training data set are accessed. The training data set can include sample utterances and corresponding labels. A dropout parameter is identified. The dropout parameter can indicate a likelihood for dropping out one or more feature vectors for tokens associated with respective entities during training of the machine learning model. The dropout parameter is applied to feature vectors for tokens associated with respective entities. The machine learning model is trained using the training data set and the dropout parameter to generate a trained machine learning model. The use of the trained machine learning model is facilitated.
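
A rough Python sketch of the dropout step described here follows. It assumes that "dropping out" a feature vector means replacing it with zeros and that a per-token flag marks entity tokens; neither detail is stated in the abstract, and the function name `lexical_dropout` is hypothetical.

```python
import random

def lexical_dropout(features, is_entity, p, rng):
    """Zero out feature vectors of entity tokens with probability p.

    features:  list of per-token feature vectors (lists of floats)
    is_entity: list of booleans, True where the token belongs to an entity
    p:         dropout parameter (likelihood of dropping an entity token)
    """
    out = []
    for vec, ent in zip(features, is_entity):
        if ent and rng.random() < p:
            out.append([0.0] * len(vec))  # drop this token's features
        else:
            out.append(list(vec))         # keep a copy unchanged
    return out

features = [[0.2, 0.1], [0.9, 0.4], [0.3, 0.3]]
is_entity = [False, True, False]
dropped = lexical_dropout(features, is_entity, p=1.0, rng=random.Random(0))
kept = lexical_dropout(features, is_entity, p=0.0, rng=random.Random(0))
```

Non-entity tokens are never altered, so the model still sees the surrounding context while being discouraged from memorizing specific entity surface forms.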

14.

Published invention application
TRANSFORMING NATURAL LANGUAGE TO STRUCTURED QUERY LANGUAGE BASED ON SCALABLE SEARCH AND CONTENT-BASED SCHEMA LINKING

Legal status: Under examination (published)

Publication (announcement) number: US20230186025A1

Publication (announcement) date: 2023-06-15

Application number: US18065387

Application date: 2022-12-13

Applicant: Oracle International Corporation

Inventor: Jae Min John, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Shivashankar Subramanian, Cong Duy Vu Hoang, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Nitika Mathur, Aashna Devang Kanuga, Philip Arthur, Gioacchino Tangari, Steve Wai-Chun Siu

IPC: G06F40/284, G06F40/295, G06F40/42

CPC classification number: G06F40/284, G06F40/295, G06F40/42

Abstract: Techniques for preprocessing data assets to be used in a natural language to logical form model based on scalable search and content-based schema linking. In one particular aspect, a method includes accessing an utterance, classifying named entities within the utterance into predefined classes, searching value lists within the database schema using tokens from the utterance to identify and output value matches including: (i) any value within the value lists that matches a token from the utterance and (ii) any attribute associated with a matching value, generating a data structure by organizing and storing: (i) each of the named entities and an assigned class for each of the named entities, (ii) each of the value matches and the token matching each of the value matches, and (iii) the utterance, in a predefined format for the data structure, and outputting the data structure.

15.

Invention application
METHOD AND SYSTEM FOR TARGET BASED HYPER-PARAMETER TUNING

Legal status: Active

Publication (announcement) number: US20210304074A1

Publication (announcement) date: 2021-09-30

Application number: US17216498

Application date: 2021-03-29

Applicant: Oracle International Corporation

Inventor: Poorya Zaremoodi, Ying Xu, Thanh Tien Vu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson, Xin Xu, Cong Duy Vu Hoang

IPC: G06N20/00

Abstract: Techniques are disclosed for tuning hyperparameters of a machine-learning model. A plurality of metrics are selected for which hyperparameters of the machine-learning model are to be tuned. Each metric is associated with a plurality of specification parameters including a target score, a penalty factor, and a bonus factor. The plurality of specification parameters are configured for each metric in accordance with a first criterion. The machine-learning model is evaluated using one or more validation datasets to obtain a metric score. A weighted loss function is formulated based on a difference between the metric score and the target score of each metric, the penalty factor or the bonus factor. The hyperparameters associated with the machine-learning model are tuned in order to optimize the weighted loss function. In response to the weighted loss function being optimized, the machine-learning model is provided as a validated machine-learning model.
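
One way to read the weighted loss in this abstract is sketched below in Python. The exact formula is not given in the abstract, so this is an assumption: a metric that falls short of its target adds the shortfall times its penalty factor, and a metric that meets or exceeds its target subtracts the surplus times its bonus factor. The names `weighted_loss`, `target`, `penalty`, and `bonus` are illustrative.

```python
def weighted_loss(scores, specs):
    """Combine per-metric gaps to target into a single loss (lower is better).

    scores: metric name -> score measured on the validation data
    specs:  metric name -> {"target": ..., "penalty": ..., "bonus": ...}
    """
    total = 0.0
    for name, score in scores.items():
        spec = specs[name]
        diff = score - spec["target"]
        if diff < 0:
            total += spec["penalty"] * (-diff)  # below target: penalize
        else:
            total -= spec["bonus"] * diff       # at or above target: reward
    return total

specs = {
    "accuracy": {"target": 0.90, "penalty": 2.0, "bonus": 0.5},
    "recall": {"target": 0.90, "penalty": 1.0, "bonus": 0.5},
}
loss = weighted_loss({"accuracy": 0.80, "recall": 0.95}, specs)
```

A hyperparameter search would then evaluate each candidate configuration, compute this loss from its validation scores, and keep the configuration with the lowest value.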

16.

Invention application
DISTANCE-BASED LOGIT VALUES FOR NATURAL LANGUAGE PROCESSING

Legal status: Active

Publication (announcement) number: US20250117591A1

Publication (announcement) date: 2025-04-10

Application number: US18988114

Application date: 2024-12-19

Applicant: Oracle International Corporation

Inventor: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

IPC: G06F40/35, G06F40/205, G06F40/253, G06N20/00, H04L51/02

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

17.

Invention application
SYSTEM AND TECHNIQUES FOR HANDLING LONG TEXT FOR PRE-TRAINED LANGUAGE MODELS

Legal status: Active

Publication (announcement) number: US20250117585A1

Publication (announcement) date: 2025-04-10

Application number: US18987825

Application date: 2024-12-19

Applicant: Oracle International Corporation

Inventor: Thanh Tien Vu, Tuyen Quang Pham, Mark Edward Johnson, Thanh Long Duong, Ying Xu, Poorya Zaremoodi, Omid Mohamad Nezami, Budhaditya Saha, Cong Duy Vu Hoang

IPC: G06F40/295, G06F40/284, H04L51/02

Abstract: In some aspects, a computing device may receive, at a data processing system, a set of utterances for training or inferencing with a named entity recognizer to assign a label to each token piece from the set of utterances. The computing device may determine a length of each utterance in the set and, when the length of an utterance exceeds a pre-determined threshold of token pieces, may: divide the utterance into a plurality of overlapping chunks of token pieces; assign a label together with a confidence score for each token piece in a chunk; determine a final label and an associated confidence score for each chunk of token pieces by merging two confidence scores; determine a final annotated label for the utterance based at least on merging the two confidence scores; and store the final annotated label in a memory.
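
The chunk-and-merge idea can be sketched in Python as follows. The abstract does not say how the two confidence scores are merged, so this sketch assumes the simplest rule: where two chunks overlap, keep the label with the higher confidence. The function names and the `(label, confidence)` tuple layout are hypothetical.

```python
def split_into_chunks(tokens, size, overlap):
    """Return (start, chunk) pairs that cover tokens with the given overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    if len(tokens) <= size:
        return [(0, tokens)]  # short utterance: no chunking needed
    step = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, tokens[start:start + size]))
        if start + size >= len(tokens):
            break
        start += step
    return chunks

def merge_predictions(n_tokens, chunk_predictions):
    """Merge per-chunk (label, confidence) predictions into one sequence.

    chunk_predictions: list of (start, [(label, confidence), ...]).
    Assumed merge rule: the higher-confidence prediction wins in overlaps.
    """
    merged = [None] * n_tokens
    for start, preds in chunk_predictions:
        for offset, (label, conf) in enumerate(preds):
            i = start + offset
            if merged[i] is None or conf > merged[i][1]:
                merged[i] = (label, conf)
    return merged

chunks = split_into_chunks(list("abcdefg"), size=4, overlap=2)
merged = merge_predictions(3, [
    (0, [("O", 0.9), ("B-PER", 0.6)]),
    (1, [("O", 0.7), ("O", 0.8)]),
])
```

Overlapping the chunks gives tokens near a chunk boundary a second prediction made with more surrounding context, which is what the merge step exploits.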

18.

Invention application
MULTI-FEATURE BALANCING FOR NATURAL LANGUAGE PROCESSORS

Legal status: Active

Publication (announcement) number: US20240419910A1

Publication (announcement) date: 2024-12-19

Application number: US18819441

Application date: 2024-08-29

Applicant: Oracle International Corporation

Inventor: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu

IPC: G06F40/289, G06F40/166, G06F40/205, G06F40/263, G06F40/279, G06F40/295, G06N3/08, H04L51/02

Abstract: A method includes receiving an indication of a first coverage value corresponding to a desired overlap between a dataset of natural language phrases and a training dataset for training a machine learning model; determining a second coverage value corresponding to a measured overlap between the dataset of natural language phrases and the training dataset; determining a coverage delta value based on a comparison between the first coverage value and the second coverage value; modifying, based on the coverage delta value, the dataset of natural language phrases; and processing, utilizing a machine learning model including the modified dataset of natural language phrases, an input dataset including a set of input features. The machine learning model processes the input dataset based at least in part on the dataset of natural language phrases to generate an output dataset.
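
The coverage comparison in this abstract can be illustrated with a short Python sketch. The abstract does not define how overlap is measured, so the sketch assumes it is the fraction of phrases that occur as substrings in at least one training text; the function names are hypothetical, and the step that modifies the phrase dataset is left out because the abstract does not describe it.

```python
def measured_coverage(phrases, training_texts):
    """Fraction of phrases that occur in at least one training text."""
    if not phrases:
        return 0.0
    hits = sum(any(p in text for text in training_texts) for p in phrases)
    return hits / len(phrases)

def coverage_delta(desired, phrases, training_texts):
    """Desired coverage minus measured coverage (positive means too low)."""
    return desired - measured_coverage(phrases, training_texts)

phrases = ["new york", "paris", "tokyo", "rome"]
training_texts = ["flights to paris", "hotel in rome"]
delta = coverage_delta(0.75, phrases, training_texts)
```

Under this reading, the sign of the delta indicates whether the phrase dataset overlaps the training data less or more than desired.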

19.

Published invention application
OUTPUT INTERPRETATION FOR A MEANING REPRESENTATION LANGUAGE SYSTEM

Legal status: Under examination (published)

Publication (announcement) number: US20240232187A9

Publication (announcement) date: 2024-07-11

Application number: US18321144

Application date: 2023-05-22

Applicant: Oracle International Corporation

Inventor: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent

IPC: G06F16/2452, G06F40/211, G06F40/30

CPC classification number: G06F16/24522, G06F40/211, G06F40/30

Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Representation Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.

20.

Published invention application
DISTANCE-BASED LOGIT VALUE FOR NATURAL LANGUAGE PROCESSING

Legal status: Under examination (published)

Publication (announcement) number: US20240126999A1

Publication (announcement) date: 2024-04-18

Application number: US18545621

Application date: 2023-12-19

Applicant: Oracle International Corporation

Inventor: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

IPC: G06F40/35, G06N20/00, H04L51/02

CPC classification number: G06F40/35, G06N20/00, H04L51/02, G06F40/253

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.
