-
Publication number: US20230401385A1
Publication date: 2023-12-14
Application number: US17966485
Application date: 2022-10-14
Applicant: Oracle International Corporation
Inventor: Saransh Mehta, Siddhant Jain, Pramir Sarkar
IPC: G06F40/295, G06N5/02, G06F40/126
CPC classification number: G06F40/295, G06N5/022, G06F40/126
Abstract: A novel system is described for performing hierarchical named entity recognition (“HNER”) processing that includes identifying categories at different hierarchical levels for a named entity. The HNER system uses a novel architecture comprising an encoder model and a system of trained machine learning (ML) models to perform the HNER processing, where each trained model in the system of ML models corresponds to a particular hierarchical level, and each model is trained to extract one or more named entities and predict a category for each extracted named entity for the corresponding hierarchical level. Novel techniques are also described for training the various models in the HNER system, including the encoder model and the models in the system of models.
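The architecture in this abstract (one shared encoder feeding one model per hierarchical level) can be illustrated with a minimal sketch. Everything named here is hypothetical: `encode` is a whitespace tokenizer standing in for the encoder model, and `make_lookup_model` builds a dictionary lookup standing in for a trained per-level model. The abstract does not say how the real models are built or combined.

```python
from typing import Callable, Dict, List, Tuple

Encoding = List[str]  # stand-in for the encoder model's output
LevelModel = Callable[[Encoding], List[Tuple[str, str]]]


def encode(text: str) -> Encoding:
    """Hypothetical encoder: splits on whitespace instead of embedding."""
    return text.split()


def make_lookup_model(lexicon: Dict[str, str]) -> LevelModel:
    """Hypothetical per-level model: a lexicon lookup, not a trained model."""
    def model(encoding: Encoding) -> List[Tuple[str, str]]:
        return [(tok, lexicon[tok]) for tok in encoding if tok in lexicon]
    return model


def hierarchical_ner(text: str, level_models: List[LevelModel]) -> Dict[str, List[str]]:
    """Encode once, then let each level's model extract and categorize entities."""
    encoding = encode(text)
    result: Dict[str, List[str]] = {}
    for model in level_models:  # ordered from the coarsest level to the finest
        for entity, category in model(encoding):
            result.setdefault(entity, []).append(category)
    return result


level_models = [
    make_lookup_model({"Paris": "Location"}),  # level 1 (coarse)
    make_lookup_model({"Paris": "City"}),      # level 2 (fine)
]
entities = hierarchical_ner("Flights to Paris", level_models)
```

Each entity ends up with one category per hierarchical level, listed from coarse to fine.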
-
Publication number: US20240143934A1
Publication date: 2024-05-02
Application number: US18485700
Application date: 2023-10-12
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi, Duy Vu, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
IPC: G06F40/30, G06F40/284, G06F40/289
CPC classification number: G06F40/30, G06F40/284, G06F40/289
Abstract: A method includes accessing a document including sentences, the document being associated with a configuration flag indicating whether aspect-based sentiment analysis (ABSA), sentence-level sentiment analysis (SLSA), or both are to be performed; inputting the document into a language model that generates chunks of token embeddings for the document; and, based on the configuration flag, performing at least one of the ABSA and the SLSA by inputting the chunks of token embeddings into a multi-task model. When performing the SLSA, a part of the token embeddings in each of the chunks is masked, and the masked token embeddings do not belong to the particular sentence on which the SLSA is performed.
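The masking step described for SLSA can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the abstract does not say how masking is realized, so the sketch simply zeroes the masked vectors, and the function name `mask_outside_sentence` and the per-token `sentence_ids` list are hypothetical.

```python
from typing import List


def mask_outside_sentence(
    embeddings: List[List[float]],
    sentence_ids: List[int],
    target_sentence: int,
) -> List[List[float]]:
    """Zero out every token embedding that is not part of target_sentence."""
    if len(embeddings) != len(sentence_ids):
        raise ValueError("one sentence id is required per token embedding")
    return [
        list(vec) if sid == target_sentence else [0.0] * len(vec)
        for vec, sid in zip(embeddings, sentence_ids)
    ]


# One chunk of three token embeddings; the first two tokens belong to
# sentence 0 and the last token to sentence 1.
chunk = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
sentence_ids = [0, 0, 1]
masked = mask_outside_sentence(chunk, sentence_ids, target_sentence=1)
```

Only the embeddings of the target sentence survive, so a downstream classifier would see that sentence's tokens and nothing from the rest of the chunk.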
-
Publication number: US20250094732A1
Publication date: 2025-03-20
Application number: US18663988
Application date: 2024-05-14
Applicant: Oracle International Corporation
Inventor: Ankit Kumar Aggarwal, Haad Khan, Liyu Gong, Jie Xing, Pramir Sarkar
IPC: G06F40/40
Abstract: A summary generation and summary selection system is disclosed that is capable of automatically evaluating multiple summaries generated for content and selecting a single summary that is deemed to be the “best” among the multiple generated summaries. The system includes capabilities to use multiple different selection techniques to select the best summary from multiple generated summaries. A first selection technique involves identifying entities and entity relationships from the content to be summarized and selecting a summary from multiple summaries generated for the content based on the entities and entity relationships identified in the content. A second selection technique involves determining a set of questions that are answered by each summary. The technique then selects a summary based upon the set of questions answered by each summary. The system then outputs the selected summary as the summary for the content.
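The first selection technique (choosing a summary based on entities identified in the content) can be sketched in a few lines. The scoring rule is an assumption: the abstract does not specify how entities are matched or weighted, so this sketch counts case-insensitive substring matches. Entity extraction is assumed to have happened already, and `select_summary` is a hypothetical name.

```python
from typing import List


def select_summary(content_entities: List[str], summaries: List[str]) -> str:
    """Return the summary that mentions the most entities from the content."""
    def coverage(summary: str) -> int:
        text = summary.lower()
        return sum(1 for entity in content_entities if entity.lower() in text)

    # max() keeps the earliest summary when several share the top score.
    return max(summaries, key=coverage)


content_entities = ["Oracle", "cloud"]
summaries = [
    "A company released a product.",
    "Oracle expanded its cloud offering.",
]
best = select_summary(content_entities, summaries)
```

A fuller version would also score entity relationships, which the abstract mentions but this sketch leaves out.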
-
Publication number: US20240135116A1
Publication date: 2024-04-25
Application number: US18485779
Application date: 2023-10-12
Applicant: Oracle International Corporation
Inventor: Duy Vu, Poorya Zaremoodi, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
Abstract: A computer-implemented method includes: accessing a plurality of datasets, where each dataset of the plurality of datasets includes training examples; selecting datasets that include the training examples in a source language and a target language; sampling, based on a sampling weight that is determined for each of the selected datasets, the training examples from the selected datasets to generate training batches; training an ML model to perform at least a first task using the training examples of the training batches, by inputting the training batches to the ML model in an interleaved manner; and outputting the trained ML model configured to perform at least the first task on input utterances provided in at least one of the source language and the target language. The sampling weight is determined for each of the selected datasets based on one or more attributes common to the training examples of the selected dataset.
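The weighted sampling step can be sketched as below. Several details are assumptions made for illustration: each batch is drawn from a single dataset chosen in proportion to its weight, examples are drawn with replacement, and the weights are given directly rather than derived from dataset attributes as the abstract describes. The dataset names and examples are made up.

```python
import random
from typing import Dict, List, Sequence, Tuple


def build_batches(
    datasets: Dict[str, Sequence[str]],
    weights: Dict[str, float],
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> List[Tuple[str, List[str]]]:
    """Build an interleaved sequence of batches, one source dataset per batch."""
    rng = random.Random(seed)
    names = sorted(datasets)
    name_weights = [weights[name] for name in names]
    batches = []
    for _ in range(num_batches):
        # Pick the source dataset for this batch in proportion to its weight.
        name = rng.choices(names, weights=name_weights, k=1)[0]
        # Draw the batch's examples from that dataset (with replacement).
        examples = rng.choices(list(datasets[name]), k=batch_size)
        batches.append((name, examples))
    return batches


datasets = {
    "en-fr": ["hello | bonjour", "thanks | merci"],
    "en-de": ["hello | hallo", "thanks | danke", "yes | ja"],
}
weights = {"en-fr": 0.4, "en-de": 0.6}  # hypothetical sampling weights
batches = build_batches(datasets, weights, batch_size=2, num_batches=5)
```

Feeding `batches` to a training loop in order gives the interleaving: consecutive batches can come from different datasets, with the mix governed by the weights.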
-
Publication number: US20240103925A1
Publication date: 2024-03-28
Application number: US17954787
Application date: 2022-09-28
Applicant: Oracle International Corporation
Inventor: Nagaraj N. Bhat, Joydeb Mondal, Amritanshu Jain, Pramir Sarkar
CPC classification number: G06F9/505, G06F9/4887
Abstract: Techniques disclosed herein can include receiving an instruction to perform a stress test on one or more cloud computing resources of a cloud computing system. Worker nodes of the cloud computing system can be provisioned by a resource manager to perform the stress test on the cloud computing resources. The resource manager can instruct the one or more worker nodes of the cloud computing system to perform the stress test. Data generated by the worker nodes during the stress test can be received by the resource manager and used to train a projection framework comprising a trained machine learning model. The projection framework can generate a resource projection and the projection can be used to provision cloud computing resources to host the cloud service.
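The flow from stress-test data to a resource projection can be sketched with a deliberately simple stand-in. The abstract describes a projection framework with a trained machine learning model; the sketch below swaps in an ordinary least-squares line fit, and the measured quantities (requests per second, CPU cores), function names, and numbers are all hypothetical.

```python
import math
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_nodes(
    samples: List[Tuple[float, float]],  # (requests per second, CPU cores used)
    expected_rps: float,
    cores_per_node: float,
) -> int:
    """Project how many nodes the expected load needs, from stress-test samples."""
    slope, intercept = fit_line([s[0] for s in samples], [s[1] for s in samples])
    cores_needed = slope * expected_rps + intercept
    return max(1, math.ceil(cores_needed / cores_per_node))


# Hypothetical measurements reported by worker nodes during a stress test.
samples = [(100.0, 2.0), (200.0, 4.0), (300.0, 6.0)]
nodes = project_nodes(samples, expected_rps=900.0, cores_per_node=4.0)
```

The resulting node count is the kind of projection that could then drive provisioning.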