-
Publication number: US08442810B2
Publication date: 2013-05-14
Application number: US13626722
Filing date: 2012-09-25
IPC classification: G06F17/28
CPC classification: G06F17/289 , G06F17/2755 , G06F17/277 , G06F17/2785 , G06F17/2818 , G06F17/2872
Abstract: In one embodiment, the invention provides a method for machine translation of a source document in an input language to a target document in an output language, comprising generating translation options corresponding to at least portions of each sentence in the input language; and selecting a translation option for the sentence based on statistics associated with the translation options.
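The two steps named in this abstract (generate translation options for portions of a sentence, then select one based on statistics) can be illustrated with a minimal sketch. It is not the patented method: the `OPTION_COUNTS` table, its numbers, and the rule "pick the most frequently observed option" are assumptions made only for illustration.

```python
from typing import Dict, List

# Hypothetical statistics: how often each candidate translation was observed
# for a source fragment. All entries are invented for illustration.
OPTION_COUNTS: Dict[str, Dict[str, int]] = {
    "guten morgen": {"good morning": 90, "good day": 10},
    "welt": {"world": 80, "earth": 20},
}


def generate_options(fragment: str) -> Dict[str, int]:
    """Return candidate translations for a fragment with their counts."""
    # Fragments missing from the table are passed through unchanged.
    return OPTION_COUNTS.get(fragment, {fragment: 1})


def select_option(options: Dict[str, int]) -> str:
    """Select the candidate with the highest observed count."""
    return max(options, key=lambda candidate: options[candidate])


def translate(fragments: List[str]) -> str:
    """Translate a sentence that has already been split into fragments."""
    return " ".join(select_option(generate_options(f)) for f in fragments)
```

With this table, `translate(["guten morgen", "welt"])` returns `"good morning world"`. A real system would score whole-sentence combinations rather than each fragment independently.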
-
Publication number: US08908969B2
Publication date: 2014-12-09
Application number: US13562791
Filing date: 2012-07-31
CPC classification: G06F17/212 , G06K9/00469 , G06K9/2072 , G06K2209/01 , Y10S707/99933
Abstract: In one embodiment, the invention provides a method, comprising detecting data fields on a scanned document image; generating a flexible document description based on the detected data fields, including creating a set of search elements for each data field, each search element having associated search criteria; and training or modifying the flexible document description using, for example, a search algorithm to detect the data fields on additional training images based on the set of search elements.
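As a rough illustration of a search element with associated search criteria that is adjusted on additional training images, consider the sketch below. The abstract does not specify the criteria or the training procedure; the `SearchElement` class, its keyword-plus-region criteria, and the "widen the region" training rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class SearchElement:
    """Hypothetical search element: a keyword plus a page region to search."""
    keyword: str
    region: Box

    def matches(self, text: str, box: Box) -> bool:
        """True if the text contains the keyword and its box lies inside the region."""
        left, top, right, bottom = self.region
        inside = (left <= box[0] and top <= box[1]
                  and box[2] <= right and box[3] <= bottom)
        return inside and self.keyword.lower() in text.lower()

    def train(self, box: Box) -> None:
        """Widen the region so it also covers a location seen on a training image."""
        left, top, right, bottom = self.region
        self.region = (min(left, box[0]), min(top, box[1]),
                       max(right, box[2]), max(bottom, box[3]))
```

A field found lower on a new training image initially fails `matches`; after `train` is called with that location, the widened region accepts it.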
-
Publication number: US20130044943A1
Publication date: 2013-02-21
Application number: US13659289
Filing date: 2012-10-24
Inventor: Diar Tuganbaev
IPC classification: G06K9/18
CPC classification: G06K9/6292 , G06K2209/01
Abstract: Techniques and methods are disclosed herein for combining and weighting of values from and associated with classifiers. Classifiers are used to recognize characters as part of an optical character recognition (OCR) system. Various methods of normalization facilitate combining of results of classifiers. For example, weight values may be entered into a weight table having two columns, one that includes weights from comparing patterns with images of correct characters, the other column includes weights from comparing patterns with images of incorrect characters.
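The two-column weight table can be sketched as follows. The abstract does not give the normalization formula, so the rule used here (the share of "correct" entries among all table entries at or above a raw weight) and the averaging in `combine` are assumptions chosen only to show how normalization makes results from different classifiers comparable.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class WeightTable:
    """Two columns of raw weights collected for one classifier."""
    correct: List[float] = field(default_factory=list)    # patterns vs. correct-character images
    incorrect: List[float] = field(default_factory=list)  # patterns vs. incorrect-character images

    def normalize(self, raw: float) -> float:
        """Map a raw weight to [0, 1] (assumed rule, not taken from the patent)."""
        hits = sum(1 for w in self.correct if w >= raw)
        misses = sum(1 for w in self.incorrect if w >= raw)
        total = hits + misses
        # With no table entries at or above `raw`, return a neutral 0.5.
        return hits / total if total else 0.5


def combine(results: Sequence[Tuple[WeightTable, float]]) -> float:
    """Average the normalized weights of several classifiers."""
    if not results:
        raise ValueError("at least one classifier result is required")
    return sum(table.normalize(raw) for table, raw in results) / len(results)
```

Because each classifier is normalized against its own table, classifiers with different raw scales (for example 0 to 1 and 0 to 10) can be averaged directly.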
-
Publication number: US08538162B2
Publication date: 2013-09-17
Application number: US13431767
Filing date: 2012-03-27
IPC classification: G06K9/46
CPC classification: H04N1/00795 , H04N1/00803 , H04N2101/00 , H04N2201/3216
Abstract: A method for processing a batch of scanned images is disclosed. The method includes processing the scanned images into documents. For documents of multiple pages, the method maintains a page-based coordinate system to specify a location of structures within a page and joins the pages to form a multi-page sheet associated with a sheet-based coordinate system to specify a location of structures within the multi-page sheet. Data may be extracted from each document through a page mode wherein structures are detected on individual pages using the page-based coordinate system and a document mode wherein structures are detected within the entire document using the sheet-based coordinate system.
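The relationship between the page-based and sheet-based coordinate systems can be sketched as a simple offset mapping. The abstract does not say how pages are joined; the sketch assumes they are stacked vertically with a top-left origin, and the `Sheet` class and its method names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


class Sheet:
    """Pages stacked top to bottom into one multi-page sheet (assumed layout)."""

    def __init__(self, page_heights: List[int]) -> None:
        # offsets[i] is the sheet y-coordinate of the top edge of page i.
        self.offsets = [0] + list(accumulate(page_heights))[:-1]
        self.total_height = sum(page_heights)

    def to_sheet(self, page: int, x: int, y: int) -> Tuple[int, int]:
        """Convert page-based (page, x, y) to sheet-based (x, y)."""
        return x, self.offsets[page] + y

    def to_page(self, x: int, y: int) -> Tuple[int, int, int]:
        """Convert sheet-based (x, y) back to page-based (page, x, y)."""
        page = bisect_right(self.offsets, y) - 1
        return page, x, y - self.offsets[page]
```

For pages of heights 1000, 1000, and 800 pixels, the point (50, 200) on the second page (index 1) maps to sheet coordinates (50, 1200), and `to_page(50, 1200)` maps back to `(1, 50, 200)`. A structure that spans a page break has a single contiguous extent in sheet coordinates, which is what a document-wide detection mode needs.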
-
Publication number: US20130198615A1
Publication date: 2013-08-01
Application number: US13562791
Filing date: 2012-07-31
IPC classification: G06F17/21
CPC classification: G06F17/212 , G06K9/00469 , G06K9/2072 , G06K2209/01 , Y10S707/99933
Abstract: In one embodiment, the invention provides a method, comprising detecting data fields on a scanned document image; generating a flexible document description based on the detected data fields, including creating a set of search elements for each data field, each search element having associated search criteria; and training or modifying the flexible document description using, for example, a search algorithm to detect the data fields on additional training images based on the set of search elements.
-
Publication number: US20130024180A1
Publication date: 2013-01-24
Application number: US13626722
Filing date: 2012-09-25
CPC classification: G06F17/289 , G06F17/2755 , G06F17/277 , G06F17/2785 , G06F17/2818 , G06F17/2872
Abstract: In one embodiment, the invention provides a method for machine translation of a source document in an input language to a target document in an output language, comprising generating translation options corresponding to at least portions of each sentence in the input language; and selecting a translation option for the sentence based on statistics associated with the translation options.