Patent search ap:("Oracle International Corporation") AND inv:"Shivashankar Subramanian" Page 1

1.

Invention publication
TECHNIQUES FOR POSITIVE ENTITY AWARE AUGMENTATION USING TWO-STAGE AUGMENTATION Under examination - Published

Publication (announcement) number: US20230419052A1

Publication (announcement) date: 2023-12-28

Application number: US18163231

Application date: 2023-02-01

Applicant: Oracle International Corporation

Inventor: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin XU, Srinivasa Phani Kumar Gadde, Vishal Vishnoi

IPC: G06F40/56, G06F40/295, G06F40/247

CPC classification number: G06F40/56, G06F40/247, G06F40/295

Abstract: Novel techniques are described for positive entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In one particular aspect, a method is provided that includes accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more positive data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the positive data augmentation techniques comprise Entity-Aware (“EA”) technique and a two-stage augmentation technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.

2.

Invention publication
NAMED ENTITY BIAS DETECTION AND MITIGATION TECHNIQUES FOR SENTENCE SENTIMENT ANALYSIS Under examination - Published

Publication (announcement) number: US20230153687A1

Publication (announcement) date: 2023-05-18

Application number: US17984717

Application date: 2022-11-10

Applicant: Oracle International Corporation

Inventor: Duy Vu, Varsha Kuppur Rajendra, Shivashankar Subramanian, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Mark Edward Johnson

IPC: G06N20/00, G06K9/62

CPC classification number: G06N20/00, G06K9/6259, G06K9/6262

Abstract: Techniques for named entity bias detection and mitigation for sentence sentiment analysis. In one particular aspect, a method is provided that includes obtaining a training set of labeled examples for training a machine learning model to classify sentiment, preparing a list of named entities using one or more data sources, for each example in the training set of labeled examples with a named entity, replacing the named entity with a corresponding entity type tag to generate a labeled template data set, executing a sampling process for each entity type t within the labeled template data set to generate an augmented invariance data set comprising one or more invariance groups having labeled examples for each entity type t, and training the machine learning model using labeled examples from the augmented invariance data set.
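The template-and-sampling steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the `<TYPE>` tag format, the plain substring matching, and the assumption that each example mentions entities of a single type are all hypothetical choices made for illustration.

```python
import random

def make_template(text, entities):
    """Replace each known named entity in `text` with its entity type tag.

    `entities` maps entity name -> entity type (a hypothetical input format).
    """
    # Replace longer names first so a long name wins over a shorter prefix.
    for name in sorted(entities, key=len, reverse=True):
        text = text.replace(name, "<" + entities[name] + ">")
    return text

def build_invariance_groups(examples, entities, per_type=2, seed=0):
    """Fill each tagged template with sampled entities of the same type.

    Every sentence in a group keeps the label of the original example,
    so a model trained on the group sees the label as entity-invariant.
    """
    rng = random.Random(seed)
    by_type = {}
    for name, etype in entities.items():
        by_type.setdefault(etype, []).append(name)
    groups = []
    for text, label in examples:
        template = make_template(text, entities)
        for etype, names in by_type.items():
            tag = "<" + etype + ">"
            if tag not in template:
                continue
            sampled = rng.sample(names, min(per_type, len(names)))
            groups.append([(template.replace(tag, n), label) for n in sampled])
    return groups
```

For example, `build_invariance_groups([("I love Acme", "positive")], {"Acme": "ORG", "Globex": "ORG"})` yields one group whose sentences differ only in the organization name and share the label `"positive"`.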

3.

Invention publication
ADDRESSING CATASTROPHIC FORGETTING AND OVER-GENERALIZATION WHILE TRAINING A NATURAL LANGUAGE TO A MEANING REPRESENTATION LANGUAGE SYSTEM Under examination - Published

Publication (announcement) number: US20240062044A1

Publication (announcement) date: 2024-02-22

Application number: US18451995

Application date: 2023-08-18

Applicant: Oracle International Corporation

Inventor: Shivashankar Subramanian, Dalu Guo, Gioacchino Tangari, Nitika Mathur, Cong Duy Vu Hoang, Mark Edward Johnson, Thanh Long Duong

IPC: G06N3/0455, G06F40/58, G06N3/006, G06N3/084

CPC classification number: G06N3/0455, G06F40/58, G06N3/006, G06N3/084

Abstract: Techniques are disclosed herein for addressing catastrophic forgetting and over-generalization while training a model to transform natural language to a logical form such as a meaning representation language. The techniques include accessing training data comprising natural language examples, augmenting the training data to generate expanded training data, training a machine learning model on the expanded training data, and providing the trained machine learning model. The augmenting includes (i) generating contrastive examples by revising natural language of examples identified to have caused regression during training of a machine learning model with the training data, (ii) generating alternative examples by modifying operators of examples identified within the training data that belong to a concept that exhibits bias, or (iii) a combination of (i) and (ii).

4.

Invention publication
TRAINING DATA AUGMENTATION USING GAZETTEERS AND PERTURBATIONS TO FACILITATE TRAINING NAMED ENTITY RECOGNITION MODELS Under examination - Published

Publication (announcement) number: US20230325599A1

Publication (announcement) date: 2023-10-12

Application number: US18185675

Application date: 2023-03-17

Applicant: Oracle International Corporation

Inventor: Omid Mohamad Nezami, Shivashankar Subramanian, Thanh Tien Vu, Tuyen Quang Pham, Budhaditya Saha, Aashna Devang Kanuga, Shubham Pawankumar Shah

IPC: G06F40/295, G06N3/006

CPC classification number: G06F40/295, G06N3/006

Abstract: Techniques are provided for augmenting training data using gazetteers and perturbations to facilitate training named entity recognition models. The training data can be augmented by generating additional utterances from original utterances in the training data and combining the generated additional utterances with the original utterances to form the augmented training data. The additional utterances can be generated by replacing the named entities in the original utterances with different named entities and/or perturbed versions of the named entities in the original utterances selected from a gazetteer. Gazetteers of named entities can be generated from the training data and expanded by searching a knowledge base and/or perturbing the named entities therein. The named entity recognition model can be trained using the augmented training data.
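As a rough illustration of the augmentation described in this abstract, the sketch below generates extra utterances by swapping in a different gazetteer entry and by perturbing the original entity. The tuple layout, the function names, and the choice of an adjacent-character swap as the perturbation are assumptions made here; the abstract does not specify them.

```python
import random

def perturb(name, rng):
    """Return a copy of `name` with two adjacent characters swapped."""
    if len(name) < 2:
        return name
    i = rng.randrange(len(name) - 1)
    chars = list(name)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def augment(utterances, gazetteer, seed=0):
    """Combine original utterances with generated ones.

    utterances: list of (text, entity_text, entity_type) tuples.
    gazetteer:  dict mapping entity_type -> list of entity names.
    """
    rng = random.Random(seed)
    extra = []
    for text, entity, etype in utterances:
        # Variant 1: a different named entity of the same type.
        candidates = [n for n in gazetteer.get(etype, []) if n != entity]
        if candidates:
            new = rng.choice(candidates)
            extra.append((text.replace(entity, new, 1), new, etype))
        # Variant 2: a perturbed version of the original entity.
        noisy = perturb(entity, rng)
        if noisy != entity:
            extra.append((text.replace(entity, noisy, 1), noisy, etype))
    return utterances + extra
```

Because each generated tuple carries the replaced entity text and its type, the span labels needed for named entity recognition training stay aligned with the new utterance.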

5.

Invention publication
TRANSFORMING NATURAL LANGUAGE TO STRUCTURED QUERY LANGUAGE BASED ON SCALABLE SEARCH AND CONTENT-BASED SCHEMA LINKING Under examination - Published

Publication (announcement) number: US20230186025A1

Publication (announcement) date: 2023-06-15

Application number: US18065387

Application date: 2022-12-13

Applicant: Oracle International Corporation

Inventor: Jae Min John, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Shivashankar Subramanian, Cong Duy Vu Hoang, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Nitika Mathur, Aashna Devang Kanuga, Philip Arthur, Gioacchino Tangari, Steve Wai-Chun Siu

IPC: G06F40/284, G06F40/295, G06F40/42

CPC classification number: G06F40/284, G06F40/295, G06F40/42

Abstract: Techniques for preprocessing data assets to be used in a natural language to logical form model based on scalable search and content-based schema linking. In one particular aspect, a method includes accessing an utterance, classifying named entities within the utterance into predefined classes, searching value lists within the database schema using tokens from the utterance to identify and output value matches including: (i) any value within the value lists that matches a token from the utterance and (ii) any attribute associated with a matching value, generating a data structure by organizing and storing: (i) each of the named entities and an assigned class for each of the named entities, (ii) each of the value matches and the token matching each of the value matches, and (iii) the utterance, in a predefined format for the data structure, and outputting the data structure.
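The value-matching and data-structure steps in this abstract might look roughly like the Python sketch below. It assumes the named entities have already been classified by some upstream component, and it uses an in-memory dictionary index as a stand-in for the scalable search; the function name, input formats, and output keys are all hypothetical.

```python
def link_utterance(utterance, value_lists, named_entities):
    """Match utterance tokens against schema value lists.

    value_lists:    dict mapping attribute name -> list of stored values.
    named_entities: dict mapping entity text -> assigned class
                    (assumed to come from a separate NER step).
    """
    # Lowercase and strip simple punctuation so "Sales?" matches "sales".
    tokens = [t.strip(".,?!").lower() for t in utterance.split()]
    # Invert the value lists once so each token lookup is a dict access.
    index = {}
    for attribute, values in value_lists.items():
        for value in values:
            index.setdefault(value.lower(), []).append((value, attribute))
    matches = []
    for token in tokens:
        for value, attribute in index.get(token, []):
            matches.append({"token": token, "value": value, "attribute": attribute})
    # One record in a fixed layout: entities, value matches, and the utterance.
    return {
        "utterance": utterance,
        "entities": [{"text": t, "class": c} for t, c in named_entities.items()],
        "value_matches": matches,
    }
```

This sketch only matches single-token values exactly; multi-word values or fuzzy matching would need a different index.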

6.

Invention publication
DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS Under examination - Published

Publication (announcement) number: US20230153528A1

Publication (announcement) date: 2023-05-18

Application number: US17984743

Application date: 2022-11-10

Applicant: Oracle International Corporation

Inventor: Duy Vu, Varsha Kuppur Rajendra, Dai Hoang Tran, Shivashankar Subramanian, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson

IPC: G06F40/279, G06F40/166, G06N5/02

CPC classification number: G06F40/279, G06F40/166, G06N5/022

Abstract: Techniques for augmentation and batch balancing of training data to enhance negation and fairness of a machine learning model. In one particular aspect, a method is provided that includes generating a list of demographic words associated with a demographic group, searching an unlabeled corpus of text to identify unlabeled examples in a target domain comprising at least one demographic word from the list of demographic words, rewriting the unlabeled examples to create one or more versions of each of the unlabeled examples and generate a fairness invariance data set, and training the machine learning model using unlabeled examples from the fairness invariance data set.
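A minimal sketch of the search-and-rewrite steps in this abstract follows. It assumes that "rewriting" means substituting each word from the demographic list into the matched position, which is only one possible reading; the function names and the whole-word regular expression are illustrative choices, and the word list is assumed to be non-empty.

```python
import re

def _pattern(demographic_words):
    # Whole-word, case-insensitive match on any listed word.
    alternatives = "|".join(re.escape(w) for w in demographic_words)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def find_examples(corpus, demographic_words):
    """Return the unlabeled sentences that contain a demographic word."""
    pattern = _pattern(demographic_words)
    return [s for s in corpus if pattern.search(s)]

def build_fairness_groups(corpus, demographic_words):
    """For each matching sentence, create one version per demographic word.

    The versions in a group differ only in the demographic word, so a
    fair model would be expected to treat them alike.
    """
    pattern = _pattern(demographic_words)
    groups = []
    for sentence in find_examples(corpus, demographic_words):
        groups.append([pattern.sub(word, sentence) for word in demographic_words])
    return groups
```

With the words `["man", "woman"]`, the sentence "The woman is a doctor" produces the group "The man is a doctor" and "The woman is a doctor".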

7.

Invention publication
SYSTEM AND METHOD OF SELECTIVE FINE-TUNING FOR CUSTOM TRAINING OF A NATURAL LANGUAGE TO LOGICAL FORM MODEL Under examination - Published

Publication (announcement) number: US20240061835A1

Publication (announcement) date: 2024-02-22

Application number: US18236192

Application date: 2023-08-21

Applicant: Oracle International Corporation

Inventor: Shivashankar Subramanian, Gioacchino Tangari, Thanh Tien Vu, Cong Duy Vu Hoang, Poorya Zaremoodi, Dalu Guo, Mark Edward Johnson, Thanh Long Duong

IPC: G06F16/2452, G06F16/25

CPC classification number: G06F16/24522, G06F16/252

Abstract: Systems and methods fine-tune a pretrained machine learning model. For a model having multiple layers, an initial set of configurations is identified, each configuration establishing layers to be frozen and layers to be fine-tuned. A configuration that is optimized with respect to one or more parameters is selected, establishing a set of fine-tuning layers and a set of frozen layers. An input for the model is provided to a remote system. An output of the set of frozen layers of the model, given the provided input, is received back and locally stored. The set of fine-tuning layers of the model is loaded from the remote system. The model is fine-tuned by retrieving the locally stored output of the set of frozen layers, and updating weights associated with the set of fine-tuning layers of the machine learning model.
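The caching idea in this abstract (compute the frozen layers' output once, store it locally, then update only the fine-tuning layers) can be shown with a toy Python sketch. Everything concrete here is an assumption: the frozen layers are reduced to a fixed function behind a hypothetical `RemoteFrozenLayers` class, the fine-tuning layers to a single weight, and the configuration-selection step is omitted.

```python
class RemoteFrozenLayers:
    """Stand-in for the remote system that runs the frozen layers."""

    def __init__(self):
        self.calls = 0

    def forward(self, x):
        self.calls += 1
        return 2.0 * x + 1.0  # fixed mapping; never updated during fine-tuning

def fine_tune(inputs, targets, remote, epochs=200, lr=0.01):
    # Run the frozen layers once per input and store the outputs locally.
    cache = {x: remote.forward(x) for x in inputs}
    # The fine-tuning layers are a single weight in this toy model.
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            h = cache[x]  # retrieved from local storage, not recomputed
            grad = 2.0 * (w * h - y) * h  # gradient of (w*h - y)**2 w.r.t. w
            w -= lr * grad
    return w, cache
```

However many epochs run, `remote.calls` stays equal to the number of distinct inputs, which is the saving the cached frozen output is meant to provide.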

8.

Invention publication
TECHNIQUES FOR CONVERTING A NATURAL LANGUAGE UTTERANCE TO AN INTERMEDIATE DATABASE QUERY REPRESENTATION Under examination - Published

Publication (announcement) number: US20240061832A1

Publication (announcement) date: 2024-02-22

Application number: US18209844

Application date: 2023-06-14

Applicant: Oracle International Corporation

Inventor: Cong Duy Vu Hoang, Stephen Andrew McRitchie, Mark Edward Johnson, Shivashankar Subramanian, Aashna Devang Kanuga, Nitika Mathur, Gioacchino Tangari, Steve Wai-Chun Siu, Poorya Zaremoodi, Vasisht Raghavendra, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Mark Broadbent, Philip Arthur, Syed Najam Abbas Zaidi

IPC: G06F16/2452, G06F16/2455, G06F16/242

CPC classification number: G06F16/24522, G06F16/24561, G06F16/2433

Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.
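Only the first step of this pipeline, building the input string, lends itself to a short illustration. The sketch below concatenates an utterance with a flattened schema; the serialization layout (`table: col, col | ...`) and the `[SEP]` separator are assumptions made here, since the abstract does not say how the schema representation is written out.

```python
def serialize_schema(schema):
    """Flatten {table: [columns]} into 'table: col, col | table: col'."""
    return " | ".join(
        table + ": " + ", ".join(columns) for table, columns in schema.items()
    )

def build_input(utterance, schema, sep=" [SEP] "):
    """Concatenate the utterance with the schema representation."""
    return utterance + sep + serialize_schema(schema)
```

The resulting string is what the first encoder would consume; the relation-aware second encoder and the grammar-based decoder are outside the scope of this sketch.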

9.

Invention publication
TECHNIQUES FOR NEGATIVE ENTITY AWARE AUGMENTATION Under examination - Published

Publication (announcement) number: US20230419127A1

Publication (announcement) date: 2023-12-28

Application number: US18163235

Application date: 2023-02-01

Applicant: Oracle International Corporation

Inventor: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi

IPC: G06N5/022

CPC classification number: G06N5/022

Abstract: Novel techniques are described for negative entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In some embodiments, a method comprises accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more negative entity-aware data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the one or more negative entity-aware data augmentation techniques comprise Keyword Augmentation Technique (“KAT”) plus entity without context technique and KAT plus entity in random context as OOD technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.

10.

Invention publication
TECHNIQUES FOR TWO-STAGE ENTITY-AWARE DATA AUGMENTATION Under examination - Published

Publication (announcement) number: US20230419040A1

Publication (announcement) date: 2023-12-28

Application number: US18163230

Application date: 2023-02-01

Applicant: Oracle International Corporation

Inventor: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi

IPC: G06F40/295, G06F40/247, G06N5/04

CPC classification number: G06F40/295, G06F40/247, G06N5/04

Abstract: Novel techniques are described for data augmentation using a two-stage entity-aware augmentation to improve model robustness to entity value changes for intent prediction. In some embodiments, a method comprises accessing a first set of training data for an intent prediction model; applying one or more data augmentation techniques to the first set of training data to result in a second set of training data; applying an additional augmentation technique to augment the second set of training data to create a post-processed augmented training data where the additional augmentation technique comprises replacing at least one or more entity values of the named entities within the second set of training data with random values of same entity type; and combining the first set of training data and the post-processed augmented training data to
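The second augmentation stage named in this abstract, replacing entity values with random values of the same entity type and then combining the result with the original data, can be sketched in Python as follows. The tuple layout, the function names, and the `entity_values` lookup table are hypothetical, and the first-stage augmentation is assumed to have been applied already.

```python
import random

def second_stage(augmented, entity_values, seed=0):
    """Swap each entity value for a random value of the same entity type.

    augmented:     list of (text, intent, [(entity_text, entity_type), ...]).
    entity_values: dict mapping entity_type -> list of candidate values.
    """
    rng = random.Random(seed)
    out = []
    for text, intent, entities in augmented:
        new_entities = []
        for value, etype in entities:
            pool = [v for v in entity_values.get(etype, []) if v != value]
            # Keep the original value when no alternative of that type exists.
            replacement = rng.choice(pool) if pool else value
            text = text.replace(value, replacement, 1)
            new_entities.append((replacement, etype))
        out.append((text, intent, new_entities))
    return out

def expand(first_set, augmented, entity_values):
    """Combine the original data with the post-processed augmented data."""
    return first_set + second_stage(augmented, entity_values)
```

The intent label is carried over unchanged, which reflects the stated goal of making intent prediction robust to entity value changes.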
