-
21.
Publication No.: US20230236955A1
Publication date: 2023-07-27
Application No.: US18157750
Filing date: 2023-01-20
Applicant: Oracle International Corporation
Inventor: Fuheng Wu, Ivan Dimitrov Davchev, Jun Qian
IPC: G06F11/36
CPC classification number: G06F11/3664, G06F11/3612
Abstract: A computing device may access a target code for implementing an application. The device may identify addresses for one or more functions or one or more variables associated with the target code. The device may generate an interval tree comprising a root node and one or more function nodes. In response to the target code invoking a function or variable, the device may generate an intercept function configured to intercept communication between the target code and a call address for the at least one of the one or more functions or the one or more variables invoked by the target code. The device may intercept data communicated between the target code and the call address. The device may store the intercepted data as a function node in the interval tree. The device may transmit the interval tree to a user device.
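
Illustrative sketch (not taken from the patent): the abstract describes keeping intercepted call data in an interval tree keyed by function address ranges. The Python below is one minimal way to picture that, assuming an unbalanced tree ordered by start address, with the first inserted node acting as the root. The names `Node`, `insert`, `find`, and `intercept`, and the addresses used, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Node:
    """One interval-tree node covering the address range [start, end)."""
    start: int
    end: int
    name: str
    calls: List[dict] = field(default_factory=list)  # intercepted data
    max_end: int = 0  # largest end address in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def __post_init__(self) -> None:
        self.max_end = self.end


def insert(root: Optional[Node], node: Node) -> Node:
    """Insert by start address and keep max_end up to date (no rebalancing)."""
    if root is None:
        return node
    if node.start < root.start:
        root.left = insert(root.left, node)
    else:
        root.right = insert(root.right, node)
    root.max_end = max(root.max_end, node.max_end)
    return root


def find(root: Optional[Node], address: int) -> Optional[Node]:
    """Return a node whose range contains the address, or None."""
    while root is not None:
        if root.start <= address < root.end:
            return root
        if root.left is not None and root.left.max_end > address:
            root = root.left
        else:
            root = root.right
    return None


def intercept(root: Node, address: int, func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap func so each call's arguments and result are stored in its node."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = func(*args, **kwargs)
        node = find(root, address)
        if node is not None:
            node.calls.append({"args": args, "kwargs": kwargs, "result": result})
        return result
    return wrapper


def add(a: int, b: int) -> int:
    return a + b


# Hypothetical address ranges for two functions in the target code.
root = insert(None, Node(0x1000, 0x1040, "add"))
root = insert(root, Node(0x2000, 0x2080, "mul"))

traced_add = intercept(root, 0x1010, add)
result = traced_add(2, 3)
```

After the call, `find(root, 0x1010).calls` holds one record with the arguments and return value, which is the kind of per-function data the abstract says is sent on to a user device.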
-
Publication No.: US20230066922A1
Publication date: 2023-03-02
Application No.: US17897066
Filing date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian
IPC: G06V30/244, G06V30/246
Abstract: The present embodiments relate to identifying a native language of text included in an image-based document. A cloud infrastructure node (e.g., one or more interconnected computing devices implementing a cloud infrastructure) can utilize one or more deep learning models to identify a language of an image-based document (e.g., a scanned document) that is formed of pixels. The cloud infrastructure node can detect text lines that are bounded by bounding boxes in the document, determine a primary script classification of the text in the document, and derive a primary language for the document. Various document management tasks can be performed responsive to determining the language, such as performing optical character recognition (OCR) or deriving insights into the text.
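
Illustrative sketch (not taken from the patent): the abstract moves from per-line detections to a primary script and then a primary language. One simple way to picture that aggregation step is confidence-weighted voting, assuming an upstream model has already produced a script label, a language label, and a confidence for each detected text line. The function name, the tuple layout, and the sample predictions are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical per-line model output: (script label, language label, confidence).
LinePrediction = Tuple[str, str, float]


def primary_language(lines: List[LinePrediction]) -> Tuple[str, str]:
    """Pick the dominant script, then the dominant language within that script."""
    if not lines:
        raise ValueError("no text lines detected")

    # Step 1: confidence-weighted vote for the document's primary script.
    script_votes: Counter = Counter()
    for script, _lang, conf in lines:
        script_votes[script] += conf
    script = script_votes.most_common(1)[0][0]

    # Step 2: vote for the language using only lines in the primary script.
    lang_votes: Counter = Counter()
    for line_script, lang, conf in lines:
        if line_script == script:
            lang_votes[lang] += conf
    return script, lang_votes.most_common(1)[0][0]


predictions = [
    ("Latin", "fr", 0.9),
    ("Latin", "en", 0.6),
    ("Latin", "fr", 0.8),
    ("Cyrillic", "ru", 0.7),
]
document_script, document_language = primary_language(predictions)
```

Restricting the language vote to the primary script keeps a stray line in another script from influencing the result; a downstream step such as OCR could then be chosen from `document_language`.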
-
Publication No.: US20250094687A1
Publication date: 2025-03-20
Application No.: US18758441
Filing date: 2024-06-28
Applicant: Oracle International Corporation
Inventor: Zheng Wang, Yazhe Hu, Mengqing Guo, Tao Sheng, Jun Qian, Vinod Murli Mamtani
IPC: G06F40/166, G06F40/253
Abstract: Techniques for generating repetition-free text using a large language model (LLM) are provided. In one technique, textual content that was generated by an LLM is accessed, where the textual content comprises a plurality of sub-components including a first sub-component and a second sub-component. A first embedding that represents the first sub-component is generated and a second embedding that represents the second sub-component is generated. Based on a similarity between the first embedding and the second embedding, it is determined whether the second sub-component is repetitious with respect to the first sub-component. In response to determining that the second sub-component is repetitious with respect to the first sub-component, at least a portion of the second sub-component is removed from the textual content.
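
Illustrative sketch (not taken from the patent): the abstract compares embeddings of sub-components and removes the later one when they are too similar. The Python below shows that control flow, assuming sentences as the sub-components, a toy bag-of-words vector standing in for a real embedding model, cosine similarity as the measure, and a threshold of 0.8. All of these choices, and the function names, are assumptions; the sketch also drops the whole repeated sentence, whereas the abstract allows removing only a portion.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Dict[str, int]:
    """Toy embedding: sparse word counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def remove_repetitions(sentences: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a sentence only if it is not too similar to any sentence kept so far."""
    kept: List[str] = []
    kept_vectors: List[Dict[str, int]] = []
    for sentence in sentences:
        vector = embed(sentence)
        if any(cosine(vector, earlier) >= threshold for earlier in kept_vectors):
            continue  # repetitious with respect to an earlier sentence
        kept.append(sentence)
        kept_vectors.append(vector)
    return kept


generated = [
    "The cache stores recent results.",
    "The cache stores the recent results.",
    "Eviction uses an LRU policy.",
]
cleaned = remove_repetitions(generated)
```

With a real embedding model in place of `embed`, paraphrased repeats that share few words could also be caught; the threshold would need tuning for that model.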
-
Publication No.: US20250094686A1
Publication date: 2025-03-20
Application No.: US18758321
Filing date: 2024-06-28
Applicant: Oracle International Corporation
Inventor: Zheng Wang, Yazhe Hu, Mengqing Guo, Tao Sheng, Jun Qian, Vinod Murli Mamtani
IPC: G06F40/166, G06F40/279
Abstract: Techniques for modifying a narrative point of view for content generated by a machine-learned model, such as a large language model (LLM), are provided. In one technique, a first textual content that was generated by an LLM is accessed. A narrative point of view (NPOV) detection operation is performed on a first portion of the first textual content to identify a first NPOV corresponding to the first portion of the first textual content. Based on an output, of the NPOV detection operation, that indicates that the first NPOV does not meet one or more NPOV criteria, the first portion of the first textual content is modified to generate a modified textual content. The modified textual content is submitted to the LLM, causing the LLM to generate a second textual content.
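
Illustrative sketch (not taken from the patent): the abstract describes detecting a narrative point of view, checking it against criteria, modifying the text, and resubmitting it to the LLM. The Python below shows that flow, assuming a simple pronoun-count heuristic in place of a real NPOV detector and treating the rewrite step and the LLM as caller-supplied functions. The names `detect_npov`, `enforce_npov`, `rewrite`, and `llm` are hypothetical.

```python
import re
from collections import Counter
from typing import Callable

PRONOUNS = {
    "first": {"i", "me", "my", "mine", "we", "us", "our", "ours"},
    "second": {"you", "your", "yours"},
    "third": {"he", "him", "his", "she", "her", "hers",
              "they", "them", "their", "theirs", "it", "its"},
}


def detect_npov(text: str) -> str:
    """Heuristic NPOV detection: return the person whose pronouns appear most."""
    counts: Counter = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for person, words in PRONOUNS.items():
            if token in words:
                counts[person] += 1
    return counts.most_common(1)[0][0] if counts else "unknown"


def enforce_npov(
    text: str,
    required: str,
    rewrite: Callable[[str, str], str],
    llm: Callable[[str], str],
) -> str:
    """If the text misses the required NPOV, modify it and resubmit to the LLM."""
    detected = detect_npov(text)
    if detected in (required, "unknown"):
        return text  # criteria met, or no pronouns to judge by
    modified = rewrite(text, required)
    return llm(modified)


# Stand-ins for the modification step and the model call (assumptions).
def stub_rewrite(text: str, person: str) -> str:
    return f"Rewrite in {person} person: {text}"


def stub_llm(prompt: str) -> str:
    return f"[model output for] {prompt}"


first_draft = "I opened my laptop and we started the migration."
second_draft = enforce_npov(first_draft, "third", stub_rewrite, stub_llm)
```

Passing the rewrite and model call as parameters keeps the detection logic testable without access to an actual LLM.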
-
Publication No.: US20230067033A1
Publication date: 2023-03-02
Application No.: US17897055
Filing date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian
IPC: G06V30/246, G06F40/263, G06V10/82
Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
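
Illustrative sketch (not taken from the patent): the abstract says the CNN and RNN are trained jointly from losses on the predicted language and the predicted text. The Python below shows only how two such losses might be combined into one training objective, assuming per-step character probabilities already aligned with the text labels and plain cross-entropy for both terms. A real text-recognition model would more likely use an alignment-free loss such as CTC; the function names and the `text_weight` parameter are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Negative log-likelihood of the correct class, clamped to avoid log(0)."""
    return -math.log(max(probs[label], 1e-12))


def joint_loss(
    lang_probs: Sequence[float],
    lang_label: int,
    char_probs: Sequence[Sequence[float]],
    char_labels: Sequence[int],
    text_weight: float = 1.0,
) -> float:
    """Language loss plus weighted mean per-character loss for one text line."""
    lang_loss = cross_entropy(lang_probs, lang_label)
    text_loss = sum(
        cross_entropy(step, label) for step, label in zip(char_probs, char_labels)
    ) / len(char_labels)
    return lang_loss + text_weight * text_loss


# One text line: two candidate languages, two characters over a 2-symbol alphabet.
loss = joint_loss(
    lang_probs=[0.5, 0.5],
    lang_label=0,
    char_probs=[[1.0, 0.0], [0.5, 0.5]],
    char_labels=[0, 1],
)
```

Because both terms feed a single scalar, gradients from the language label and the text label would reach the shared feature extractor together, which is the sense in which the abstract's two networks are trained jointly.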
-