Patent search: ap:("Oracle International Corporation") AND inv:"Yuying Wang"

1.

Invention application
VISION-BASED DOCUMENT LANGUAGE IDENTIFICATION BY JOINT SUPERVISION (legal status: in force)

Publication No.: US20230067033A1

Publication date: 2023-03-02

Application No.: US17897055

Filing date: 2022-08-26

Applicant: Oracle International Corporation

Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian

IPC: G06V30/246, G06F40/263, G06V10/82

Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
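The joint supervision described in this abstract can be illustrated with a minimal sketch that sums a language loss and a text loss for one text line. This is not taken from the patent: the function names, the `text_weight` parameter, and the use of per-position cross-entropy for the text loss are all assumptions made for illustration (a real recognizer would more likely use an alignment-free loss such as CTC).

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of one prediction, computed from raw logits (numerically stable)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target_index]

def joint_loss(lang_logits, lang_label, char_logits, char_labels, text_weight=1.0):
    """Combine a language loss and a text loss for a single text line.

    lang_logits : scores over candidate languages for the line
    char_logits : one score vector per character position (hypothetical layout)
    text_weight : assumed hyperparameter balancing the two losses
    """
    lang_loss = softmax_cross_entropy(lang_logits, lang_label)
    # Average the per-position losses so long lines do not dominate.
    text_loss = sum(
        softmax_cross_entropy(step, label)
        for step, label in zip(char_logits, char_labels)
    ) / len(char_labels)
    return lang_loss + text_weight * text_loss
```

Because both terms depend on the same shared features, minimizing the sum pushes the feature extractor to serve both predictions at once, which is the general idea behind joint training.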

2.

Invention publication
SYNTHETIC DATA FINE-TUNED OPTICAL CHARACTER RECOGNITION ENGINE FOR EXTENSIBLE MARKUP LANGUAGE DOCUMENT RECONSTRUCTION (legal status: under examination, published)

Publication No.: US20240338958A1

Publication date: 2024-10-10

Application No.: US18131744

Filing date: 2023-04-06

Applicant: Oracle International Corporation

Inventor: Liyu Gong, Yuying Wang, Mengqing Guo, Tao Sheng, Jun Qian

IPC: G06V30/19, G06F40/143, G06V10/70

CPC classification number: G06V30/19147, G06F40/143, G06V10/70

Abstract: Techniques are disclosed for optical character recognition of extensible markup language content. A method can include a system generating a first training data comprising extensible markup language (XML) content, the first training data comprising a first plurality of training instances, each training instance including a respective image comprising XML content and annotation information for the respective image. The system can train a plurality of machine learning models using the first training data to generate a plurality of trained machine learning models, to perform image-based XML content extraction. The system can generate a plurality of trained machine learning models based at least in part on the training.

3.

Invention grant
Vision-based document language identification by joint supervision (legal status: in force)

Publication No.: US12249170B2

Publication date: 2025-03-11

Application No.: US17897055

Filing date: 2022-08-26

Applicant: Oracle International Corporation

Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian

IPC: G06F40/263, G06V10/82, G06V30/246

Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.

4.

Invention publication
AUTOMATED GENERATION OF TRAINING DATA COMPRISING DOCUMENT IMAGES AND ASSOCIATED LABEL DATA (legal status: under examination, published)

Publication No.: US20230316792A1

Publication date: 2023-10-05

Application No.: US17692844

Filing date: 2022-03-11

Applicant: Oracle International Corporation

Inventor: Yazhe Hu, Yuying Wang, Liyu Gong, Iman Zadeh, Jun Qian

IPC: G06V30/19, G06N20/00

CPC classification number: G06V30/19147, G06N20/00, G06V30/1916

Abstract: Techniques are described for automatically, and substantially without human intervention, generating training data where the training data includes a set of training images containing text content and associated label data. Both the training images and the associated label data are automatically generated. The label data that is automatically generated for a training image includes one or more labels identifying locations of one or more text portions within the training image, and for each text portion, a label indicative of the text content in the text portion. By automating both the generation of training images and the generation of associated label data, the techniques described herein are very scalable and repeatable and can be used to generate large amounts of training data, which in turn enables building more reliable and accurate language models.
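As a rough illustration of generating an image and its labels together, the toy sketch below places random text lines on a pixel grid and records each line's bounding box and text. Everything here is hypothetical and not drawn from the patent: the fixed glyph cell size, the `make_sample` name, and the trick of drawing glyphs as filled rectangles (a real generator would render actual fonts with an imaging library).

```python
import random

CHAR_W, CHAR_H = 8, 16  # assumed fixed glyph cell size in pixels

def make_sample(words, width=256, height=128, n_lines=3, seed=0):
    """Return (image, labels) for one synthetic page.

    image  : height x width grid of 0/1 pixels (glyphs drawn as filled cells)
    labels : one dict per text line with its bounding box and text content
    Assumes height // n_lines >= CHAR_H so the lines never overlap.
    """
    rng = random.Random(seed)
    image = [[0] * width for _ in range(height)]
    labels = []
    for line_no in range(n_lines):
        text = " ".join(rng.choice(words) for _ in range(rng.randint(1, 3)))
        text = text[: width // CHAR_W]        # clip so the line fits the page
        x0 = rng.randint(0, width - len(text) * CHAR_W)
        y0 = line_no * (height // n_lines)    # one horizontal band per line
        for i, ch in enumerate(text):
            if ch == " ":
                continue                      # spaces leave the background blank
            for y in range(y0, y0 + CHAR_H):
                for x in range(x0 + i * CHAR_W, x0 + (i + 1) * CHAR_W):
                    image[y][x] = 1
        labels.append({
            "bbox": (x0, y0, x0 + len(text) * CHAR_W, y0 + CHAR_H),
            "text": text,
        })
    return image, labels
```

Since the generator chooses both the text and its position, the labels are exact by construction and no manual annotation step is needed.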

5.

Invention application
AUTOMATIC LANGUAGE IDENTIFICATION IN IMAGE-BASED DOCUMENTS (legal status: in force)

Publication No.: US20230066922A1

Publication date: 2023-03-02

Application No.: US17897066

Filing date: 2022-08-26

Applicant: Oracle International Corporation

Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian

IPC: G06V30/244, G06V30/246

Abstract: The present embodiments relate to identifying a native language of text included in an image-based document. A cloud infrastructure node (e.g., one or more interconnected computing devices implementing a cloud infrastructure) can utilize one or more deep learning models to identify a language of an image-based document (e.g., a scanned document) that is formed of pixels. The cloud infrastructure node can detect text lines that are bounded by bounding boxes in the document, determine a primary script classification of the text in the document, and derive a primary language for the document. Various document management tasks can be performed responsive to determining the language, such as perform optical character recognition (OCR) or derive insights into the text.
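The last step in this abstract, going from per-line results to a primary script and a primary language, could look roughly like the sketch below. The aggregation rule (majority vote for the script, then confidence-weighted vote among languages of that script) and the `(script, language, confidence)` tuple layout are assumptions for illustration only; the abstract does not specify how the document-level decision is made.

```python
from collections import Counter, defaultdict

def primary_language(line_predictions):
    """Derive a document-level script and language from per-line predictions.

    line_predictions: list of (script, language, confidence) tuples, one per
    detected text line (hypothetical output of upstream models).
    """
    # Step 1: the primary script is the one predicted for the most lines.
    script_votes = Counter(script for script, _, _ in line_predictions)
    primary_script, _ = script_votes.most_common(1)[0]

    # Step 2: among lines in that script, sum confidences per language.
    scores = defaultdict(float)
    for script, language, confidence in line_predictions:
        if script == primary_script:
            scores[language] += confidence
    return primary_script, max(scores, key=scores.get)
```

For example, a page with three Latin-script lines (two English, one French) and one high-confidence Cyrillic line would still resolve to Latin script and English under this rule, since the script vote is taken first.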
