-
Publication No.: US20230067033A1
Publication Date: 2023-03-02
Application No.: US17897055
Filing Date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian
IPC: G06V30/246, G06F40/263, G06V10/82
Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
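The joint training described in this abstract combines a loss on the predicted language with a loss on the predicted text content. The following is a minimal sketch of how such a combined loss might be computed for one text line. It is not the patented method: the abstract does not name the loss functions, so softmax cross-entropy for both outputs, the per-character averaging, and the names `joint_loss` and `text_weight` are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one target class."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


def joint_loss(language_logits, language_label, char_logits, char_labels,
               text_weight=1.0):
    """Combine a language loss and a text loss for a single text line.

    Assumption: both heads use softmax cross-entropy, and the text loss is
    averaged over character positions. `text_weight` is a hypothetical
    hyperparameter that balances the two terms.
    """
    if not char_labels or len(char_logits) != len(char_labels):
        raise ValueError("char_logits and char_labels must be non-empty and aligned")
    language_loss = softmax_cross_entropy(language_logits, language_label)
    text_loss = sum(
        softmax_cross_entropy(logits, label)
        for logits, label in zip(char_logits, char_labels)
    ) / len(char_labels)
    return language_loss + text_weight * text_loss


# Example: two candidate languages, two character positions, two characters.
loss = joint_loss([2.0, 0.5], 0, [[1.5, 0.1], [0.2, 1.8]], [0, 1])
print(f"joint loss: {loss:.4f}")
```

Because the two terms are summed into one scalar, a gradient step on it would update the shared feature extractor from both the language labels and the text labels at once, which is the sense in which the two networks are trained jointly.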
-
Publication No.: US12249170B2
Publication Date: 2025-03-11
Application No.: US17897055
Filing Date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian
IPC: G06F40/263, G06V10/82, G06V30/246
Abstract: The present embodiments relate to a language identification system for predicting a language and text content of text lines in an image-based document. The language identification system uses a trainable neural network model that integrates multiple neural network models in a single unified end-to-end trainable architecture. A CNN and an RNN of the model can process text lines and derive visual and contextual features of the text lines. The derived features can be used to predict a language and text content for the text line. The CNN and the RNN can be jointly trained by determining losses based on the predicted language and content and corresponding language labels and text labels for each text line.
-
Publication No.: US20230066922A1
Publication Date: 2023-03-02
Application No.: US17897066
Filing Date: 2022-08-26
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Zhonghai Deng, Iman Zadeh, Jun Qian
IPC: G06V30/244, G06V30/246
Abstract: The present embodiments relate to identifying a native language of text included in an image-based document. A cloud infrastructure node (e.g., one or more interconnected computing devices implementing a cloud infrastructure) can utilize one or more deep learning models to identify a language of an image-based document (e.g., a scanned document) that is formed of pixels. The cloud infrastructure node can detect text lines that are bounded by bounding boxes in the document, determine a primary script classification of the text in the document, and derive a primary language for the document. Various document management tasks can be performed responsive to determining the language, such as performing optical character recognition (OCR) or deriving insights into the text.
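The step from per-line results to a document-level script and language can be illustrated with a simple aggregation. The sketch below is only one possible reading: the abstract does not say how the primary script or language is derived, so the majority vote, the `(script, language)` tuple format, and the function name `primary_script_and_language` are assumptions for illustration.

```python
from collections import Counter


def primary_script_and_language(line_predictions):
    """Aggregate per-line (script, language) predictions for one document.

    Assumption: the primary script is the script predicted for the most text
    lines, and the primary language is the most frequent language among the
    lines written in that script.
    """
    if not line_predictions:
        raise ValueError("at least one text line prediction is required")
    script_votes = Counter(script for script, _ in line_predictions)
    primary_script = script_votes.most_common(1)[0][0]
    language_votes = Counter(
        language
        for script, language in line_predictions
        if script == primary_script
    )
    primary_language = language_votes.most_common(1)[0][0]
    return primary_script, primary_language


# Hypothetical predictions for four detected text lines.
predictions = [
    ("Latin", "en"),
    ("Latin", "fr"),
    ("Latin", "en"),
    ("Cyrillic", "ru"),
]
print(primary_script_and_language(predictions))
```

Voting on the script first narrows the language candidates, so a few stray lines in another script do not affect the language decision.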
-
-