Patent search ap:("Oracle International Corporation") AND inv:"Cong Duy Vu Hoang" Page 3

21.

Invention publication
CONTEXT TAG INTEGRATION WITH NAMED ENTITY RECOGNITION MODELS [Status: under examination, published]

Publication (announcement) number: US20240095454A1

Publication (announcement) date: 2024-03-21

Application number: US18521805

Application date: 2023-11-28

Applicant: Oracle International Corporation

Inventor: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi

IPC: G06F40/295, G06F40/205, G06F40/279, G06F40/35, G06F40/40, G06V30/19

CPC classification number: G06F40/295, G06F40/205, G06F40/279, G06F40/35, G06F40/40, G06V30/19147

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
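The concatenation step named in this abstract can be illustrated with a minimal sketch. It assumes per-word feature vectors (the abstract speaks of feature vectors "for the utterance", so the per-word layout is an assumption), and the function name and toy values are hypothetical, not taken from the patent. The encoder, log-probability, and constraint steps are not shown.

```python
def build_feature_vectors(embeddings, regex_gazetteer, context_tags):
    """Concatenate, word by word, the three feature sources into one vector each."""
    if not (len(embeddings) == len(regex_gazetteer) == len(context_tags)):
        raise ValueError("all feature sources must cover the same words")
    return [e + r + c for e, r, c in zip(embeddings, regex_gazetteer, context_tags)]


# Toy utterance "pay 50 dollars"; every number below is made up for illustration.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# Hypothetical flags: [matches a number regex, found in a currency gazetteer]
regex_gazetteer = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Hypothetical two-tag context distribution per word
context_tags = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]

features = build_feature_vectors(embeddings, regex_gazetteer, context_tags)
```

Each resulting vector has the combined width of its three sources; the abstract also allows interpolation instead of concatenation, which this sketch does not cover.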

22.

Invention publication
TECHNIQUES FOR OUT-OF-DOMAIN (OOD) DETECTION [Status: under examination, published]

Publication (announcement) number: US20230376696A1

Publication (announcement) date: 2023-11-23

Application number: US18364298

Application date: 2023-08-02

Applicant: Oracle International Corporation

Inventor: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota

IPC: G06F40/30, G06N20/00, G06F40/289, H04L51/02

CPC classification number: G06F40/30, G06N20/00, G06F40/289, H04L51/02, G06F40/205

Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances. One particular technique includes receiving an utterance and a target domain of a chatbot, generating a sentence embedding for the utterance, obtaining an embedding representation for each cluster of in-domain utterances associated with the target domain, predicting, using a metric learning model, a first probability that the utterance belongs to the target domain based on a similarity or difference between the sentence embedding and each embedding representation for each cluster, predicting, using an outlier detection model, a second probability that the utterance belongs to the target domain based on a determined distance or density deviation between the sentence embedding and embedding representations for neighboring clusters, evaluating the first probability and the second probability to determine a final probability, and classifying the utterance as in-domain or out-of-domain for the chatbot based on the final probability.
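The two-probability scheme in this abstract can be sketched roughly as follows. The two scoring functions are simple stand-ins (cosine similarity and nearest-centroid distance), not the metric learning or outlier detection models the patent refers to, and the averaging rule, `threshold`, and `scale` are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_probability(embedding, centroids):
    """Stand-in for the metric learning model: best cosine similarity mapped to [0, 1]."""
    best = max(cosine(embedding, c) for c in centroids)
    return (best + 1.0) / 2.0


def distance_probability(embedding, centroids, scale=1.0):
    """Stand-in for the outlier detection model: closer to a cluster means higher probability."""
    nearest = min(math.dist(embedding, c) for c in centroids)
    return math.exp(-nearest / scale)


def classify(embedding, centroids, threshold=0.5):
    """Average the two probabilities (an assumed combination rule) and threshold the result."""
    p1 = similarity_probability(embedding, centroids)
    p2 = distance_probability(embedding, centroids)
    final = (p1 + p2) / 2.0
    label = "in-domain" if final >= threshold else "out-of-domain"
    return label, final


# Hypothetical cluster representations for a target domain.
CENTROIDS = [[1.0, 0.0], [0.0, 1.0]]
```

With these toy centroids, `classify([1.0, 0.0], CENTROIDS)` lands on a cluster and is labelled in-domain, while a far-away point such as `[-5.0, -5.0]` is labelled out-of-domain.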

23.

Invention publication
DATA MANUFACTURING FRAMEWORKS FOR SYNTHESIZING SYNTHETIC TRAINING DATA TO FACILITATE TRAINING A NATURAL LANGUAGE TO LOGICAL FORM MODEL [Status: under examination, published]

Publication (announcement) number: US20230186161A1

Publication (announcement) date: 2023-06-15

Application number: US18065422

Application date: 2022-12-13

Applicant: Oracle International Corporation

Inventor: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga

IPC: G06N20/00, G06F40/58, G06F40/284, G06F40/237

CPC classification number: G06N20/00, G06F40/58, G06F40/284, G06F40/237, G06F40/35

Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.

24.

Invention publication
WIDE AND DEEP NETWORK FOR LANGUAGE DETECTION USING HASH EMBEDDINGS [Status: under examination, published]

Publication (announcement) number: US20230141853A1

Publication (announcement) date: 2023-05-11

Application number: US18052694

Application date: 2022-11-04

Applicant: Oracle International Corporation

Inventor: Thanh Tien Vu, Poorya Zaremoodi, Duy Vu, Mark Edward Johnson, Thanh Long Duong, Xu Zhong, Vladislav Blinov, Cong Duy Vu Hoang, Yu-Heng Hong, Vinamr Goel, Philip Victor Ogren, Srinivasa Phani Kumar Gadde, Vishal Vishnoi

IPC: G06F40/263, G06F16/31

CPC classification number: G06F40/263, G06F16/325, H04L51/02

Abstract: Techniques disclosed herein relate generally to language detection. In one particular aspect, a method is provided that includes obtaining a sequence of n-grams of a textual unit; using an embedding layer to obtain an ordered plurality of embedding vectors for the sequence of n-grams; using a deep network to obtain an encoded vector that is based on the ordered plurality of embedding vectors; and using a classifier to obtain a language prediction for the textual unit that is based on the encoded vector. The deep network includes an attention mechanism, and using the embedding layer to obtain the ordered plurality of embedding vectors comprises, for each n-gram in the sequence of n-grams: obtaining hash values for the n-gram; based on the hash values, selecting component vectors from among the plurality of component vectors; and obtaining an embedding vector for the n-gram that is based on the component vectors.
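The hash-embedding lookup described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: character trigrams, MD5-derived hash values, a tiny fixed pool of component vectors, and summation as the way to combine them are all choices made here, not details confirmed by the patent. The attention-based deep network and the classifier are not shown.

```python
import hashlib


def char_ngrams(text, n=3):
    """Return the ordered sequence of character n-grams of a textual unit."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def hash_values(ngram, num_hashes, pool_size):
    """Derive num_hashes deterministic indices into the component-vector pool."""
    indices = []
    for seed in range(num_hashes):
        digest = hashlib.md5(f"{seed}:{ngram}".encode("utf-8")).digest()
        indices.append(int.from_bytes(digest[:8], "big") % pool_size)
    return indices


def embed_ngram(ngram, pool, num_hashes=2):
    """Sum the component vectors selected by the n-gram's hash values."""
    dim = len(pool[0])
    vector = [0.0] * dim
    for index in hash_values(ngram, num_hashes, len(pool)):
        for d in range(dim):
            vector[d] += pool[index][d]
    return vector


# Hypothetical pool: 8 component vectors of dimension 4 (learned in a real model).
POOL = [[float((row * 4 + col) % 5) for col in range(4)] for row in range(8)]


def embed_text(text, n=3):
    """Ordered embedding vectors, one per n-gram of the text."""
    return [embed_ngram(gram, POOL) for gram in char_ngrams(text, n)]
```

Because many n-grams share a small pool of component vectors, the embedding table stays compact even when the n-gram vocabulary is very large; that trade-off is the usual motivation for hash embeddings.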

25.

Invention application
FINE-TUNING MULTI-HEAD NETWORK FROM A SINGLE TRANSFORMER LAYER OF PRE-TRAINED LANGUAGE MODEL [Status: in force]

Publication (announcement) number: US20230115321A1

Publication (announcement) date: 2023-04-13

Application number: US17735651

Application date: 2022-05-03

Applicant: Oracle International Corporation

Inventor: Thanh Tien Vu, Tuyen Quang Pham, Omid Mohamad Nezami, Mark Edward Johnson, Thanh Long Duong, Cong Duy Vu Hoang

IPC: G10L15/06, G06F40/20, G10L15/22, G06N20/00

Abstract: Techniques are provided for customizing or fine-tuning a pre-trained version of a machine-learning model that includes multiple layers and is configured to process audio or textual language input. Each of the multiple layers is configured with a plurality of layer-specific pre-trained parameter values corresponding to a plurality of parameters, and each of the multiple layers is configured to implement multi-head attention. An incomplete subset of the multiple layers is identified for which corresponding layer-specific pre-trained parameter values are to be fine-tuned using a client data set. The machine-learning model is fine-tuned using the client data set to generate an updated version of the machine-learning model, where the layer-specific pre-trained parameter values configured for each layer of one or more of the multiple layers not included in the incomplete subset are frozen during the fine-tuning. Use of the updated version of the machine-learning model is facilitated.
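The freeze-all-but-a-subset idea in this abstract can be sketched without any deep learning framework. The `Layer` class, the layer names, and the plain gradient-descent update are hypothetical stand-ins chosen for illustration; a real model would hold multi-head attention parameters rather than short lists.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool = True


def select_layers_to_tune(layers, tuned_names):
    """Mark a proper, non-empty subset of layers as trainable and freeze the rest."""
    tuned = set(tuned_names)
    all_names = {layer.name for layer in layers}
    if not tuned or not tuned < all_names:
        raise ValueError("tuned layers must be a non-empty proper subset of the model's layers")
    for layer in layers:
        layer.trainable = layer.name in tuned


def apply_update(layers, gradients, learning_rate=0.1):
    """One gradient step that skips frozen layers, leaving their pre-trained values intact."""
    for layer in layers:
        if not layer.trainable:
            continue
        grads = gradients[layer.name]
        layer.weights = [w - learning_rate * g for w, g in zip(layer.weights, grads)]
```

For example, with layers named `layer0` and `layer1`, calling `select_layers_to_tune(layers, ["layer1"])` before `apply_update` changes only `layer1`'s weights.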

26.

Invention application
CONTEXT TAG INTEGRATION WITH NAMED ENTITY RECOGNITION MODELS [Status: in force]

Publication (announcement) number: US20220229993A1

Publication (announcement) date: 2022-07-21

Application number: US17648376

Application date: 2022-01-19

Applicant: Oracle International Corporation

Inventor: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi

IPC: G06F40/295, G06F40/205, G06F40/35, G06F40/40, G06V30/19

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.

27.

Granted invention patent
Techniques for out-of-domain (OOD) detection [Status: in force]

Publication (announcement) number: US12299402B2

Publication (announcement) date: 2025-05-13

Application number: US18659606

Application date: 2024-05-09

Applicant: Oracle International Corporation

Inventor: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota

IPC: G06F40/30, G06F40/205, G06F40/289, G06N20/00, H04L51/02

Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances. One particular technique includes receiving an utterance and a target domain of a chatbot, generating a sentence embedding for the utterance, obtaining an embedding representation for each cluster of in-domain utterances associated with the target domain, predicting, using a metric learning model, a first probability that the utterance belongs to the target domain based on a similarity or difference between the sentence embedding and each embedding representation for each cluster, predicting, using an outlier detection model, a second probability that the utterance belongs to the target domain based on a determined distance or density deviation between the sentence embedding and embedding representations for neighboring clusters, evaluating the first probability and the second probability to determine a final probability, and classifying the utterance as in-domain or out-of-domain for the chatbot based on the final probability.

28.

Granted invention patent
Distance-based logit value for natural language processing [Status: in force]

Publication (announcement) number: US12210842B2

Publication (announcement) date: 2025-01-28

Application number: US18545621

Application date: 2023-12-19

Applicant: Oracle International Corporation

Inventor: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

IPC: G10L15/16, G06F40/35, G06N20/00, H04L51/02, G06F40/205, G06F40/253

Abstract: Techniques are disclosed for using logit values to classify utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.

29.

Invention publication
OUTPUT INTERPRETATION FOR A MEANING REPRESENTATION LANGUAGE SYSTEM [Status: under examination, published]

Publication (announcement) number: US20240134850A1

Publication (announcement) date: 2024-04-25

Application number: US18321144

Application date: 2023-05-21

Applicant: Oracle International Corporation

Inventor: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent

IPC: G06F16/2452, G06F40/211, G06F40/30

CPC classification number: G06F16/24522, G06F40/211, G06F40/30

Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.

30.

Invention publication
CALIBRATING CONFIDENCE SCORES OF A MACHINE LEARNING MODEL TRAINED AS A NATURAL LANGUAGE INTERFACE [Status: under examination, published]

Publication (announcement) number: US20240062021A1

Publication (announcement) date: 2024-02-22

Application number: US18107624

Application date: 2023-02-09

Applicant: Oracle International Corporation

Inventor: Gioacchino Tangari, Cong Duy Vu Hoang, Mark Edward Johnson, Poorya Zaremoodi, Nitika Mathur, Aashna Devang Kanuga, Thanh Long Duong

IPC: G06F40/58, G06F40/253

CPC classification number: G06F40/58, G06F40/253

Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
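The calibration step in this abstract can be pictured with a small sketch. It assumes the calibration model is a logistic classifier over the raw beam scores; that choice, the function names, the weights, and the threshold are all hypothetical, since the abstract does not say what form the calibration model takes.

```python
import math


def calibrated_confidence(raw_beam_scores, weights, bias):
    """Map raw beam scores to a confidence in [0, 1] with a logistic model (an assumption)."""
    if len(raw_beam_scores) != len(weights):
        raise ValueError("one weight is required per beam score")
    z = bias + sum(w * s for w, s in zip(weights, raw_beam_scores))
    return 1.0 / (1.0 + math.exp(-z))


def classify_logical_form(raw_beam_scores, weights, bias, threshold=0.5):
    """Label the decoded logical form as correct or incorrect and return its confidence."""
    confidence = calibrated_confidence(raw_beam_scores, weights, bias)
    label = "correct" if confidence >= threshold else "incorrect"
    return label, confidence
```

With made-up parameters `weights=[2.0, 2.0]` and `bias=-2.0`, high beam scores such as `[0.9, 0.8]` are labelled correct and low ones such as `[0.1, 0.2]` are labelled incorrect; in practice the parameters would be fitted on held-out data.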
