Model-independent confidence values for extracted document information using a convolutional neural network

Invention Grant

US11557140B2 Model-independent confidence values for extracted document information using a convolutional neural network (Active)

Patent Title: Model-independent confidence values for extracted document information using a convolutional neural network
Application No.: US17107223

Application Date: 2020-11-30
Publication No.: US11557140B2

Publication Date: 2023-01-17
Inventor: Christian Reisswig
Applicant: SAP SE
Applicant Address: DE Walldorf
Assignee: SAP SE
Current Assignee: SAP SE
Current Assignee Address: DE Walldorf
Agency: Sterne, Kessler, Goldstein & Fox P.L.L.C.
Main IPC: G06V30/416
IPC: G06V30/416 ; G06F40/30

Abstract:

Disclosed herein are system, method, and computer program product embodiments for correcting extracted document information based on generated confidence and correctness scores. In an embodiment, a document correcting system may receive a document and document information that represents information extracted from the document. The document correcting system may determine the correctness of the document information by processing the document to generate a character grid representing textual information and spatial arrangements for the text within the document. The document correcting system may apply a convolutional neural network to the character grid and the document information. The convolutional neural network may output corrected document information, a correctness value indicating the possible errors in the document information, and a confidence value indicating a likelihood of the possible errors.
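
The character grid mentioned in the abstract can be illustrated with a minimal sketch. It assumes that an OCR step has already produced one bounding box per character. The names `CharBox`, `build_char_grid`, and `vocab`, and the index conventions (0 for background, 1 for unknown characters), are hypothetical choices made for this illustration and are not taken from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical input format (an assumption for this sketch):
# (character, left, top, right, bottom) in pixel coordinates,
# with right and bottom exclusive.
CharBox = Tuple[str, int, int, int, int]

BACKGROUND = 0  # assumed index for cells not covered by any character
UNKNOWN = 1     # assumed index for characters missing from the vocabulary


def build_char_grid(
    boxes: List[CharBox],
    width: int,
    height: int,
    vocab: Dict[str, int],
) -> List[List[int]]:
    """Rasterize character boxes into a height x width grid of integer indices.

    Each cell covered by a character's bounding box receives that character's
    vocabulary index, so the grid keeps both the text and its spatial layout.
    """
    grid = [[BACKGROUND] * width for _ in range(height)]
    for ch, left, top, right, bottom in boxes:
        idx = vocab.get(ch, UNKNOWN)
        # Clip the box to the page so out-of-range coordinates are ignored.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x] = idx
    return grid


if __name__ == "__main__":
    vocab = {"A": 2, "B": 3}
    boxes = [("A", 0, 0, 2, 2), ("B", 3, 1, 5, 3)]
    for row in build_char_grid(boxes, width=5, height=3, vocab=vocab):
        print(row)
```

A grid of this kind is a two-dimensional array, so a convolutional neural network could take it as input in the same way it takes an image. How the document information is combined with the grid is not specified in the abstract and is left out of this sketch.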

Published/Granted documents

US20220171967A1 MODEL-INDEPENDENT CONFIDENCE VALUES FOR EXTRACTED DOCUMENT INFORMATION USING A CONVOLUTIONAL NEURAL NETWORK Publication/Grant date: 2022-06-02

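IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V30/00	Character recognition; recognition of digital ink; document-oriented image-based pattern recognition (scanning, transmission or reproduction of documents or the like H04N1/00)
G06V30/40	.Document-oriented image-based pattern recognition
G06V30/41	..Analysis of document content (recognition of printed characters based on code marks G06V30/224)
G06V30/416	...Extracting the logical structure, e.g. chapters, sections or page numbers; identifying elements of the document, e.g. authors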