Patent search ap:("Oracle International Corporation") AND inv:"Cong Duy Vu Hoang" Page 4

31.

Invention publication
SYSTEM AND METHOD OF SELECTIVE FINE-TUNING FOR CUSTOM TRAINING OF A NATURAL LANGUAGE TO LOGICAL FORM MODEL (Under examination, published)

Publication (announcement) number: US20240061835A1

Publication (announcement) date: 2024-02-22

Application number: US18236192

Application date: 2023-08-21

Applicant: Oracle International Corporation

Inventor: Shivashankar Subramanian, Gioacchino Tangari, Thanh Tien Vu, Cong Duy Vu Hoang, Poorya Zaremoodi, Dalu Guo, Mark Edward Johnson, Thanh Long Duong

IPC: G06F16/2452, G06F16/25

CPC classification number: G06F16/24522, G06F16/252

Abstract: Systems and methods fine-tune a pretrained machine learning model. For a model having multiple layers, an initial set of configurations is identified, each configuration establishing layers to be frozen and layers to be fine-tuned. A configuration that is optimized with respect to one or more parameters is selected, establishing a set of fine-tuning layers and a set of frozen layers. An input for the model is provided to a remote system. An output of the set of frozen layers of the model, given the provided input, is received back and locally stored. The set of fine-tuning layers of the model is loaded from the remote system. The model is fine-tuned by retrieving the locally stored output of the set of frozen layers, and updating weights associated with the set of fine-tuning layers of the machine learning model.
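As a rough illustration of the caching idea in this abstract, the sketch below runs a stand-in for the frozen layers once per input, stores the outputs locally, and then updates only the weights of a single trainable layer from the stored outputs. The function names, the toy linear layers, and the learning rate are all assumptions made for illustration; the configuration-selection step and the remote system are not modeled.

```python
from typing import Dict, List, Tuple

def frozen_layers(x: float) -> float:
    """Hypothetical stand-in for the frozen layers (remote in the abstract)."""
    return 2.0 * x + 1.0

def cache_frozen_outputs(inputs: List[float]) -> Dict[float, float]:
    """Run the frozen layers once per input and keep the outputs locally."""
    return {x: frozen_layers(x) for x in inputs}

def fine_tune(cache: Dict[float, float], targets: Dict[float, float],
              lr: float = 0.05, epochs: int = 2000) -> Tuple[float, float]:
    """Full-batch gradient descent on a trainable layer y = w * h + b.

    Only w and b change; the frozen output h is read from the cache
    instead of being recomputed on every step.
    """
    w, b = 0.0, 0.0
    n = len(targets)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in targets.items():
            h = cache[x]                 # stored frozen-layer output
            err = (w * h + b) - y
            grad_w += err * h / n
            grad_b += err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

inputs = [0.0, 1.0, 2.0, 3.0]
# Toy targets generated by a known trainable layer (w = 3, b = -1).
targets = {x: 3.0 * frozen_layers(x) - 1.0 for x in inputs}
cache = cache_frozen_outputs(inputs)
w, b = fine_tune(cache, targets)
```

In this toy setup the frozen layers are evaluated four times in total, regardless of how many fine-tuning epochs follow.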

32.

Invention publication
DETECTING OUT-OF-DOMAIN, OUT-OF-SCOPE, AND CONFUSION-SPAN (OOCS) INPUT FOR A NATURAL LANGUAGE TO LOGICAL FORM MODEL (Under examination, published)

Publication (announcement) number: US20240061834A1

Publication (announcement) date: 2024-02-22

Application number: US18236071

Application date: 2023-08-21

Applicant: Oracle International Corporation

Inventor: Gioacchino Tangari, Cong Duy Vu Hoang, Poorya Zaremoodi, Philip Arthur, Nitika Mathur, Mark Edward Johnson, Thanh Long Duong

IPC: G06F16/2452, G06F16/25

CPC classification number: G06F16/24522, G06F16/252

Abstract: Systems and methods identify whether an input utterance is suitable for providing to a machine learning model configured to generate a query for a database. Techniques include generating an input string by concatenating a natural language utterance with a database schema representation for a database; providing the input string to a first machine learning model; based on the input string, generating, by the first machine learning model, a score indicating whether the natural language utterance is translatable to a database query for the database and should be routed to a second machine learning model, the second machine learning model configured to generate a query for the database based on the natural language utterance; comparing the score to a threshold value; and responsive to determining that the score exceeds the threshold value, providing the natural language utterance or the input string to the second machine learning model.
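The routing flow in this abstract (concatenate, score, compare to a threshold, forward) can be sketched as below. The separator, the word-overlap scorer, and the placeholder generator are hypothetical stand-ins for the two machine learning models, chosen only so the example runs on its own.

```python
from typing import Callable, Optional

SEPARATOR = " | "  # hypothetical separator between utterance and schema text

def build_input_string(utterance: str, schema: str) -> str:
    """Concatenate the utterance with a schema representation."""
    return f"{utterance}{SEPARATOR}{schema}"

def route(utterance: str, schema: str,
          scorer: Callable[[str], float],
          generator: Callable[[str], str],
          threshold: float) -> Optional[str]:
    """Forward the input to the query generator only if the score exceeds the threshold."""
    input_string = build_input_string(utterance, schema)
    score = scorer(input_string)
    if score > threshold:
        return generator(input_string)
    return None  # treated as not translatable for this database

def toy_scorer(input_string: str) -> float:
    """Toy first model: fraction of utterance words that appear in the schema."""
    utterance, schema = input_string.split(SEPARATOR, 1)
    schema_tokens = set(schema.lower().split())
    words = utterance.lower().split()
    if not words:
        return 0.0
    return sum(word in schema_tokens for word in words) / len(words)

def toy_generator(input_string: str) -> str:
    """Toy second model: returns a placeholder instead of a real query."""
    return "QUERY_PLACEHOLDER(" + input_string.split(SEPARATOR, 1)[0] + ")"

SCHEMA = "employees ( name , salary )"
in_scope = route("list employees salary", SCHEMA, toy_scorer, toy_generator, 0.5)
out_of_scope = route("what is the weather today", SCHEMA, toy_scorer, toy_generator, 0.5)
```

Here the first utterance scores 2/3 and is forwarded, while the second scores 0 and is held back.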

33.

Invention publication
TECHNIQUES FOR CONVERTING A NATURAL LANGUAGE UTTERANCE TO AN INTERMEDIATE DATABASE QUERY REPRESENTATION (Under examination, published)

Publication (announcement) number: US20240061832A1

Publication (announcement) date: 2024-02-22

Application number: US18209844

Application date: 2023-06-14

Applicant: Oracle International Corporation

Inventor: Cong Duy Vu Hoang, Stephen Andrew McRitchie, Mark Edward Johnson, Shivashankar Subramanian, Aashna Devang Kanuga, Nitika Mathur, Gioacchino Tangari, Steve Wai-Chun Siu, Poorya Zaremoodi, Vasisht Raghavendra, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Mark Broadbent, Philip Arthur, Syed Najam Abbas Zaidi

IPC: G06F16/2452, G06F16/2455, G06F16/242

CPC classification number: G06F16/24522, G06F16/24561, G06F16/2433

Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.

34.

Granted invention patent
Context tag integration with named entity recognition models (In force)

Publication (announcement) number: US11868727B2

Publication (announcement) date: 2024-01-09

Application number: US17648376

Application date: 2022-01-19

Applicant: Oracle International Corporation

Inventor: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi

IPC: G06F40/295, G06F40/205, G06V30/19, G06F40/40, G06F40/35, G06F40/279

CPC classification number: G06F40/295, G06F40/205, G06F40/279, G06F40/35, G06F40/40, G06V30/19147

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
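The feature-combination step in this abstract can be sketched as plain list concatenation. The sketch assumes per-token embeddings and regular expression/gazetteer features plus one utterance-level context tag distribution; those granularities, the toy values, and every name below are assumptions. Only the concatenation option is shown, not interpolation, and the encoder and log-probability steps are left out.

```python
import re
from typing import Dict, List, Pattern, Set

def regex_gazetteer_features(token: str, patterns: List[Pattern[str]],
                             gazetteer: Set[str]) -> List[float]:
    """One indicator per regular expression, plus one gazetteer-membership indicator."""
    features = [1.0 if p.fullmatch(token) else 0.0 for p in patterns]
    features.append(1.0 if token.lower() in gazetteer else 0.0)
    return features

def build_feature_vectors(tokens: List[str],
                          embeddings: Dict[str, List[float]],
                          patterns: List[Pattern[str]],
                          gazetteer: Set[str],
                          context_tag_dist: List[float]) -> List[List[float]]:
    """Concatenate embedding, regex/gazetteer features, and context tag distribution."""
    vectors = []
    for token in tokens:
        vector = (embeddings[token]
                  + regex_gazetteer_features(token, patterns, gazetteer)
                  + context_tag_dist)
        vectors.append(vector)
    return vectors

tokens = ["pay", "$20"]
embeddings = {"pay": [0.1, 0.2], "$20": [0.3, 0.4]}   # toy 2-dimensional embeddings
patterns = [re.compile(r"\$\d+")]                      # hypothetical currency pattern
gazetteer = {"paris"}                                  # hypothetical location list
context_tag_dist = [0.7, 0.3]                          # hypothetical tag probabilities
feature_vectors = build_feature_vectors(tokens, embeddings, patterns,
                                        gazetteer, context_tag_dist)
```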

35.

Invention publication
DATA MANUFACTURING FRAMEWORKS FOR SYNTHESIZING SYNTHETIC TRAINING DATA TO FACILITATE TRAINING A NATURAL LANGUAGE TO LOGICAL FORM MODEL (Under examination, published)

Publication (announcement) number: US20230186026A1

Publication (announcement) date: 2023-06-15

Application number: US18065406

Application date: 2022-12-13

Applicant: Oracle International Corporation

Inventor: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga

IPC: G06F40/284, G06F40/211, G06F40/40, G06F16/2452, G06N20/00

CPC classification number: G06F40/284, G06F40/211, G06F40/40, G06F16/24522, G06N20/00

Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, the training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
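A minimal sketch of the first framework mentioned in this abstract (templates plus a synchronous grammar) follows. Each rule expands a slot into a paired utterance fragment and logical-form fragment, so both sides of an example stay aligned. The rules, the template, and the SQL-like logical form are hypothetical; the probabilistic grammar and tree-to-string frameworks are not shown.

```python
import itertools
from typing import Dict, List, Tuple

# Hypothetical synchronous rules: slot -> list of (utterance fragment, logical-form fragment).
SYNC_RULES: Dict[str, List[Tuple[str, str]]] = {
    "ENTITY": [("employees", "employees"), ("departments", "departments")],
    "COND": [("with salary above 100", "salary > 100"),
             ("hired in 2020", "hire_year = 2020")],
}

# Hypothetical paired template: (utterance side, logical-form side).
TEMPLATE = ("show all {ENTITY} {COND}", "SELECT * FROM {ENTITY} WHERE {COND}")

def synthesize(template: Tuple[str, str],
               rules: Dict[str, List[Tuple[str, str]]]) -> List[Tuple[str, str]]:
    """Expand every slot combination on both sides of the template in lockstep."""
    utterance_template, form_template = template
    slots = sorted(rules)
    pairs = []
    for choice in itertools.product(*(rules[slot] for slot in slots)):
        utterance_fill = {slot: fragment[0] for slot, fragment in zip(slots, choice)}
        form_fill = {slot: fragment[1] for slot, fragment in zip(slots, choice)}
        pairs.append((utterance_template.format(**utterance_fill),
                      form_template.format(**form_fill)))
    return pairs

original_data = [("list employees", "SELECT * FROM employees")]
synthetic_data = synthesize(TEMPLATE, SYNC_RULES)
training_data = original_data + synthetic_data  # combined set used for training
```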

36.

Invention publication
TRANSFORMING NATURAL LANGUAGE TO STRUCTURED QUERY LANGUAGE BASED ON MULTI-TASK LEARNING AND JOINT TRAINING (Under examination, published)

Publication (announcement) number: US20230185799A1

Publication (announcement) date: 2023-06-15

Application number: US18065374

Application date: 2022-12-13

Applicant: Oracle International Corporation

Inventor: Cong Duy Vu Hoang, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota

IPC: G06F16/2452, G06F40/35, G06F40/205, G06F40/58, G06N20/00

CPC classification number: G06F16/24522, G06F40/35, G06F40/205, G06F40/58, G06N20/00, G06F40/263

Abstract: Techniques are disclosed for training a model, using multi-task learning, to transform natural language to a logical form. In one particular aspect, a method includes accessing a first set of utterances that have non-follow-up utterances and a second set of utterances that have initial utterances and associated one or more follow-up utterances and training a model for translating an utterance to a logical form. The training is a joint training process that includes calculating a first loss for a first semantic parsing task based on one or more non-follow-up utterances from the first set of utterances, calculating a second loss for a second semantic parsing task based on one or more initial utterances and associated one or more follow-up utterances from the second set of utterances, combining the first and second losses to obtain a final loss, and updating model parameters of the model based on the final loss.
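The joint training loop in this abstract (two task losses combined into a final loss that drives the parameter update) can be illustrated with a deliberately tiny model. The single shared parameter, the squared-error losses, and the weighted-sum combination are assumptions for illustration; the abstract only says the losses are combined.

```python
from typing import List, Tuple

def mse(theta: float, targets: List[float]) -> float:
    """Mean squared error of a one-parameter 'model' against a task's targets."""
    return sum((theta - t) ** 2 for t in targets) / len(targets)

def mse_grad(theta: float, targets: List[float]) -> float:
    """Derivative of mse with respect to theta."""
    return sum(2.0 * (theta - t) for t in targets) / len(targets)

def joint_train(first_task: List[float], second_task: List[float],
                weight: float = 0.5, lr: float = 0.1,
                steps: int = 200) -> Tuple[float, float]:
    """Update one shared parameter using the gradient of the combined loss."""
    theta = 0.0
    for _ in range(steps):
        grad = (weight * mse_grad(theta, first_task)
                + (1.0 - weight) * mse_grad(theta, second_task))
        theta -= lr * grad
    final_loss = (weight * mse(theta, first_task)
                  + (1.0 - weight) * mse(theta, second_task))
    return theta, final_loss

# Toy targets standing in for the non-follow-up task and the follow-up task.
theta, final_loss = joint_train([1.0, 3.0], [4.0, 6.0])
```

With equal weights the shared parameter settles between the two task optima (2.0 and 5.0), which is the compromise a combined loss is meant to produce.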

37.

Invention application
FRAMEWORK FOR FOCUSED TRAINING OF LANGUAGE MODELS AND TECHNIQUES FOR END-TO-END HYPERTUNING OF THE FRAMEWORK (In force)

Publication (announcement) number: US20230098783A1

Publication (announcement) date: 2023-03-30

Application number: US17952116

Application date: 2022-09-23

Applicant: Oracle International Corporation

Inventor: Poorya Zaremoodi, Cong Duy Vu Hoang, Duy Vu, Dai Hoang Tran, Budhaditya Saha, Nagaraj N. Bhat, Thanh Tien Vu, Tuyen Quang Pham, Adam Craig Pocock, Katherine Silverstein, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong

IPC: G10L15/06, G10L15/183

Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, and the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language, and (ii) training the machine learning model on a labeled set of training data that pertains to another task that is an auxiliary task related to a downstream task to be performed using the machine learning model or output from the machine learning model.

38.

Invention application
METHOD AND SYSTEM FOR CONSTRAINT BASED HYPERPARAMETER TUNING (In force)

Publication (announcement) number: US20210304003A1

Publication (announcement) date: 2021-09-30

Application number: US17216496

Application date: 2021-03-29

Applicant: Oracle International Corporation

Inventor: Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Balakota Srinivas Vinnakota, Tuyen Quang Pham, Cong Duy Vu Hoang

IPC: G06N3/08, G06K9/62, H04L12/58

Abstract: Techniques are disclosed for tuning hyperparameters of a model. Datasets are obtained for training the model and metrics are selected for evaluating performance of the model. Each metric is assigned a weight specifying an importance to the performance of the model. A function is created that measures performance based on the weighted metrics. Hyperparameters are tuned to optimize the model performance. Tuning the hyperparameters includes: (i) training the model that is configured based on current values for the hyperparameters; (ii) evaluating a performance of the model using the function; (iii) determining whether the model is optimized for the metrics; (iv) in response to the model not being optimized, searching for new values for the hyperparameters, reconfiguring the model with the new values, and repeating steps (i)-(iii) using the reconfigured model; and (v) in response to the model being optimized for the metrics, providing a trained model.
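Steps (i) through (v) of this abstract can be sketched as a simple search loop over candidate hyperparameter values, scored by a weighted sum of metrics. The candidate list, the lookup table standing in for training and evaluation, the metric weights, and the target score used as the "optimized" test are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

def weighted_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Performance function: each metric scaled by its assigned importance."""
    return sum(weights[name] * value for name, value in metrics.items())

def tune(candidates: Iterable[float],
         train_and_evaluate: Callable[[float], Dict[str, float]],
         weights: Dict[str, float],
         target_score: float) -> Optional[Tuple[float, float]]:
    """Try candidate values until the weighted score reaches the target."""
    best: Optional[Tuple[float, float]] = None
    for value in candidates:                       # (iv) search for new values
        metrics = train_and_evaluate(value)        # (i) train with current values
        score = weighted_score(metrics, weights)   # (ii) evaluate with the function
        if best is None or score > best[1]:
            best = (value, score)
        if score >= target_score:                  # (iii) optimized for the metrics?
            break                                  # (v) stop and provide the model
    return best

# Hypothetical results table standing in for real training runs,
# keyed by a learning-rate value.
RESULTS = {
    0.001: {"accuracy": 0.70, "speed": 0.90},
    0.01: {"accuracy": 0.90, "speed": 0.80},
    0.1: {"accuracy": 0.85, "speed": 0.95},
}

def toy_train_and_evaluate(learning_rate: float) -> Dict[str, float]:
    return RESULTS[learning_rate]

best = tune([0.001, 0.01, 0.1], toy_train_and_evaluate,
            {"accuracy": 0.8, "speed": 0.2}, target_score=0.85)
```

In this toy run the loop stops at the second candidate, so the third is never trained.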

39.

Invention application
MODEL ROBUSTNESS ON OPERATORS AND TRIGGERING KEYWORDS IN NATURAL LANGUAGE TO A MEANING REPRESENTATION LANGUAGE SYSTEM (In force)

Publication (announcement) number: US20250156649A1

Publication (announcement) date: 2025-05-15

Application number: US18505498

Application date: 2023-11-09

Applicant: Oracle International Corporation

Inventor: Gioacchino Tangari, Chang Xu, Nitika Mathur, Philip Arthur, Syed Najam Abbas Zaidi, Aashna Devang Kanuga, Cong Duy Vu Hoang, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi

IPC: G06F40/40, G06F40/211, G06F40/284

Abstract: Techniques are disclosed herein for improving model robustness on operators and triggering keywords in natural language to a meaning representation language system. The techniques include augmenting an original set of training data for a target robustness bucket by leveraging a combination of two training data generation techniques: (1) modification of existing training examples and (2) synthetic template-based example generation. The resulting set of augmented data examples from the two training data generation techniques are appended to the original set of training data to generate an augmented training data set and the augmented training data set is used to train a machine learning model to generate logical forms for utterances.
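The two generation techniques in this abstract can be sketched as follows: one function rewrites operator phrases in existing examples while keeping their logical forms, another fills a paired template, and the results are appended to the original set. The synonym table, the template, and the example logical forms are hypothetical and stand in for whatever a target robustness bucket would actually require.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (utterance, logical form)

def modify_examples(examples: List[Example],
                    synonyms: Dict[str, List[str]]) -> List[Example]:
    """Technique 1: swap an operator phrase for alternatives, keeping the logical form."""
    modified = []
    for utterance, logical_form in examples:
        for phrase, alternatives in synonyms.items():
            if phrase in utterance:
                modified.extend((utterance.replace(phrase, alt), logical_form)
                                for alt in alternatives)
    return modified

def template_examples(template: Example,
                      fillers: List[Dict[str, str]]) -> List[Example]:
    """Technique 2: fill both sides of a paired template with the same values."""
    utterance_template, form_template = template
    return [(utterance_template.format(**f), form_template.format(**f))
            for f in fillers]

def augment(original: List[Example], synonyms: Dict[str, List[str]],
            template: Example, fillers: List[Dict[str, str]]) -> List[Example]:
    """Append both kinds of generated examples to the original training data."""
    return (original
            + modify_examples(original, synonyms)
            + template_examples(template, fillers))

original = [("employees with salary greater than 100", "salary > 100")]
synonyms = {"greater than": ["more than", "above"]}               # hypothetical
template = ("orders with {col} at least {val}", "{col} >= {val}")  # hypothetical
fillers = [{"col": "total", "val": "50"}]
augmented = augment(original, synonyms, template, fillers)
```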

40.

Invention application
TECHNIQUES FOR TRANSFORMING NATURAL LANGUAGE CONVERSATION INTO A VISUALIZATION REPRESENTATION (In force)

Publication (announcement) number: US20250068627A1

Publication (announcement) date: 2025-02-27

Application number: US18616801

Application date: 2024-03-26

Applicant: Oracle International Corporation

Inventor: Cong Duy Vu Hoang, Gioacchino Tangari, Stephen Andrew McRitchie, Nitika Mathur, Aashna Devang Kanuga, Steve Wai-Chun Siu, Dalu Guo, Chang Xu, Mark Edward Johnson, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Chandan Basavaraju, Kenneth Khiaw Hong Eng

IPC: G06F16/2452, G06F16/2457, G06F16/28

Abstract: Techniques are disclosed herein for transforming natural language conversations into a visual output. In one aspect, a computer-implemented method includes generating an input string by concatenating a natural language utterance with a schema representation comprising a set of entities for visualization actions, generating, by a first encoder of a machine learning model, one or more embeddings of the input string, encoding, by a second encoder of the machine learning model, relations between elements in the schema representation and words in the natural language utterance based on the one or more embeddings, generating, by a grammar-based decoder of the machine learning model and based on the encoded relations and the one or more embeddings, an intermediate logical form that represents at least the query, the one or more visualization actions, or the combination thereof, and generating, based on the intermediate logical form, a command for a computing system.
