Techniques for extracting contextually structured data from document images

Granted invention patent

US11049235B2 Techniques for extracting contextually structured data from document images (in force)

Patent title: Techniques for extracting contextually structured data from document images
Application number: US17083568

Filing date: 2020-10-29
Publication number: US11049235B2

Publication date: 2021-06-29
Inventors: David James Wheaton, William Robert Nadolski, Heather Michelle GoodyKoontz
Applicant: SAS Institute Inc.
Applicant address: US NC Cary
Patentee: SAS Institute Inc.
Current patentee: SAS Institute Inc.
Current patentee address: US NC Cary
Agency: Kacvinsky Daisak Bluni PLLC
Main classification: G06K9/00
IPC classifications: G06K9/00; G06T7/00; G06F16/81; G06F16/93; G06F40/284; G06F40/186; G06F40/169; G06K9/68; G06K9/62

Abstract:

Embodiments are generally directed to techniques for extracting contextually structured data from document images, such as by automatically identifying document layout, document data, and/or document metadata in a document image, for instance. Many embodiments are particularly directed to generating and utilizing a document template database for automatically extracting document image contents into a contextually structured format. For example, the document template database may include a plurality of templates for identifying/explaining key data elements in various document image formats that can be used to extract contextually structured data from incoming document images with a matching document image format. Several embodiments are particularly directed to automatically identifying and associating document metadata with corresponding document data in a document image, such as for generating a machine-facilitated annotation of the document image. In some embodiments, the machine-facilitated annotation of a document may be used to generate a template for the template database.
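The template-matching idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Word` and `Template` types, the use of label words at expected positions as the matching signal, the center-in-region test, and the 0.8 threshold are all hypothetical. The sketch assumes an upstream OCR step has already produced words with pixel bounding boxes, and it only handles single-word labels.

```python
from dataclasses import dataclass

# Hypothetical types for illustration; none of these names come from the patent.
Box = tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class Word:
    text: str
    box: Box


@dataclass(frozen=True)
class Template:
    name: str
    anchors: dict[str, Box]  # label word -> region where it is expected (metadata)
    fields: dict[str, Box]   # field name -> region holding the value (data)


def center_in(box: Box, region: Box) -> bool:
    """Return True if the center of `box` lies inside `region`."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]


def match_score(template: Template, words: list[Word]) -> float:
    """Fraction of the template's label words found in their expected regions."""
    if not template.anchors:
        return 0.0
    hits = sum(
        any(w.text.lower() == label.lower() and center_in(w.box, region) for w in words)
        for label, region in template.anchors.items()
    )
    return hits / len(template.anchors)


def extract(templates: list[Template], words: list[Word], threshold: float = 0.8):
    """Pick the best-matching template and read the text in each field region.

    Returns (template name, {field: text}), or None if no template scores
    at least `threshold`.
    """
    best = max(templates, key=lambda t: match_score(t, words), default=None)
    if best is None or match_score(best, words) < threshold:
        return None
    # Read words top-to-bottom, then left-to-right, within each field region.
    ordered = sorted(words, key=lambda w: (w.box[1], w.box[0]))
    record = {
        name: " ".join(w.text for w in ordered if center_in(w.box, region))
        for name, region in best.fields.items()
    }
    return best.name, record


# Made-up sample data: one template and the OCR words of one incoming image.
invoice = Template(
    name="invoice_v1",
    anchors={"Invoice": (0, 0, 200, 40), "Total": (0, 300, 100, 340)},
    fields={"invoice_number": (200, 0, 400, 40), "total": (100, 300, 300, 340)},
)
words = [
    Word("Invoice", (10, 10, 90, 30)),
    Word("INV-001", (210, 10, 300, 30)),
    Word("Total", (10, 310, 60, 330)),
    Word("42.00", (110, 310, 170, 330)),
]
result = extract([invoice], words)
print(result)
```

In this sketch the label words play the role of document metadata and the field regions hold the document data, so a matched template yields a record keyed by field name rather than a flat stream of text. A real system would likely need tolerance for skew, scale, and multi-word labels, which this example omits.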

Published/granted documents

US20210110527A1 TECHNIQUES FOR EXTRACTING CONTEXTUALLY STRUCTURED DATA FROM DOCUMENT IMAGES Publication/grant date: 2021-04-15

IPC classification:

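G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)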