Patent search ap:("Oracle International Corporation") AND inv:"Srijon Sarkar" Page 1

1.

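Invention publication
MULTI-TASK MODEL WITH CONTEXT MASKING (under examination, published)

Publication (announcement) number: US20240143934A1

Publication (announcement) date: 2024-05-02

Application number: US18485700

Application date: 2023-10-12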

Applicant: Oracle International Corporation

Inventor: Poorya Zaremoodi, Duy Vu, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza

IPC: G06F40/30, G06F40/284, G06F40/289

CPC classification number: G06F40/30, G06F40/284, G06F40/289

Abstract: A method includes accessing a document including sentences, the document being associated with a configuration flag indicating whether aspect-based sentiment analysis (ABSA), sentence-level sentiment analysis (SLSA), or both are to be performed; inputting the document into a language model that generates chunks of token embeddings for the document; and, based on the configuration flag, performing at least one of the ABSA and the SLSA by inputting the chunks of token embeddings into a multi-task model. When performing the SLSA, a part of the token embeddings in each of the chunks is masked, and the masked token embeddings do not belong to the particular sentence on which the SLSA is performed.
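
The masking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the masking is carried out, so zeroing the embeddings is an assumption made here for illustration, and the names `mask_context`, `token_embeddings`, `sentence_ids`, and `target_sentence` are hypothetical. The sketch keeps only the token embeddings that belong to the sentence under analysis and blanks out the rest of the chunk.

```python
from typing import List, Sequence


def mask_context(
    token_embeddings: Sequence[Sequence[float]],
    sentence_ids: Sequence[int],
    target_sentence: int,
) -> List[List[float]]:
    """Return a copy of one chunk in which every token embedding that does
    not belong to `target_sentence` is replaced by a zero vector.

    Assumption: masking is modeled as zeroing; the patent abstract does not
    specify the mechanism.
    """
    if len(token_embeddings) != len(sentence_ids):
        raise ValueError("one sentence id is required per token embedding")
    masked = []
    for embedding, sentence_id in zip(token_embeddings, sentence_ids):
        if sentence_id == target_sentence:
            masked.append(list(embedding))           # keep tokens of the target sentence
        else:
            masked.append([0.0] * len(embedding))    # mask context tokens
    return masked


# Toy chunk: three tokens with 2-dimensional embeddings; the first two
# tokens belong to sentence 0 and the last token to sentence 1.
chunk = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ids = [0, 0, 1]
masked_for_sentence_1 = mask_context(chunk, ids, target_sentence=1)
```

In this toy setup the same chunk would be passed unmasked for ABSA and masked once per sentence for SLSA, which is one plausible reading of how the configuration flag selects the task.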

2.

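Invention publication
DATA AUGMENTATION AND BATCH BALANCING FOR TRAINING MULTI-LINGUAL MODEL (under examination, published)

Publication (announcement) number: US20240135116A1

Publication (announcement) date: 2024-04-25

Application number: US18485779

Application date: 2023-10-12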

Applicant: Oracle International Corporation

Inventor: Duy Vu, Poorya Zaremoodi, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza

IPC: G06F40/58, G06F40/20

CPC classification number: G06F40/58, G06F40/20

Abstract: A computer-implemented method includes: accessing a plurality of datasets, where each dataset of the plurality of datasets includes training examples; selecting datasets that include the training examples in a source language and a target language; sampling, based on a sampling weight that is determined for each of the selected datasets, the training examples from the selected datasets to generate training batches; training an ML model for performing at least a first task using the training examples of the training batches, by inputting the training batches to the ML model in an interleaved manner; and outputting the trained ML model configured to perform at least the first task on input utterances provided in at least one of the source language and the target language. The sampling weight is determined for each of the selected datasets based on one or more attributes common to the training examples of the selected dataset.
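
The weighted sampling described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the abstract does not say how the sampling weight is computed from dataset attributes, so the weights are simply passed in, and each batch is assumed to be drawn (with replacement) from a single dataset chosen in proportion to its weight. The function name `make_batches` and the dataset names are hypothetical.

```python
import random
from typing import Dict, List


def make_batches(
    datasets: Dict[str, List[str]],
    weights: Dict[str, float],
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> List[List[str]]:
    """Build training batches by first picking a dataset according to its
    sampling weight, then drawing `batch_size` examples from it.

    Assumptions: weights are given (not derived from attributes here), and
    examples are drawn with replacement.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    names = list(datasets)
    dataset_weights = [weights[name] for name in names]
    batches = []
    for _ in range(num_batches):
        # Choose which dataset this batch comes from.
        name = rng.choices(names, weights=dataset_weights, k=1)[0]
        # Draw the examples for the batch from that dataset.
        batches.append(rng.choices(datasets[name], k=batch_size))
    return batches


# Hypothetical datasets keyed by language pair.
toy_datasets = {
    "en-fr": ["hello / bonjour", "thanks / merci", "yes / oui"],
    "en-de": ["hello / hallo", "thanks / danke"],
}
toy_weights = {"en-fr": 0.7, "en-de": 0.3}
toy_batches = make_batches(toy_datasets, toy_weights, batch_size=2, num_batches=5)
```

Because consecutive batches may come from different datasets, feeding `toy_batches` to a training loop in order gives one simple form of interleaving.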

3.

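Invention application
TEXT SPAN PREDICTION BASED ON ENTITY TYPE (granted)

Publication (announcement) number: US20250148206A1

Publication (announcement) date: 2025-05-08

Application number: US18502205

Application date: 2023-11-06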

Applicant: Oracle International Corporation

Inventor: Suman Roy, Srijon Sarkar

IPC: G06F40/284, G06F40/40

Abstract: Machine learning techniques directed to span prediction for textual data are disclosed. As used herein, span prediction is the process of predicting the possible spans of text that can be assigned to a given entity type of a set of predefined entity types. To this end, a machine learning model can be trained to generate values that indicate the predicted probability that a given span of an identified set of spans within text of interest is appropriate for association with a given entity type of the set of predefined entity types. The predicted probability values may be used to determine whether a given span or spans is associated with a given entity type. The predicted spans can also be scored in some examples.
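
The span-prediction flow in the abstract can be sketched in a few lines. This is a hedged illustration only: the abstract does not describe how candidate spans are identified or how probabilities are thresholded, so the exhaustive enumeration up to `max_len` tokens, the `threshold` parameter, and the `score` callable (a stand-in for the trained model) are all assumptions, as are the function names.

```python
from typing import Callable, List, Sequence, Tuple


def enumerate_spans(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """List every candidate span as (start, end) token indices, end exclusive,
    with at most `max_len` tokens per span."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start + 1, min(start + max_len, n_tokens) + 1)
    ]


def predict_spans(
    tokens: Sequence[str],
    entity_type: str,
    score: Callable[[Sequence[str], str], float],
    max_len: int = 3,
    threshold: float = 0.5,
) -> List[Tuple[int, int, float]]:
    """Keep the spans whose predicted probability for `entity_type` reaches
    `threshold`. `score` is a hypothetical stand-in for a trained model."""
    kept = []
    for start, end in enumerate_spans(len(tokens), max_len):
        probability = score(tokens[start:end], entity_type)
        if probability >= threshold:
            kept.append((start, end, probability))
    return kept


def toy_score(span: Sequence[str], entity_type: str) -> float:
    """Toy scorer for demonstration: certain only about one ORG span."""
    return 1.0 if list(span) == ["Oracle"] and entity_type == "ORG" else 0.0


toy_tokens = ["Oracle", "files", "patents"]
org_spans = predict_spans(toy_tokens, "ORG", toy_score)
```

A real system would replace `toy_score` with the model's predicted probability for the span and entity type; the kept probabilities could then double as span scores.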
