RECORDING MEDIUM FOR RECORDING LOGICAL STRUCTURE MODEL CREATION ASSISTANCE PROGRAM, LOGICAL STRUCTURE MODEL CREATION ASSISTANCE DEVICE AND LOGICAL STRUCTURE MODEL CREATION ASSISTANCE METHOD
    1.
    Patent application
    RECORDING MEDIUM FOR RECORDING LOGICAL STRUCTURE MODEL CREATION ASSISTANCE PROGRAM, LOGICAL STRUCTURE MODEL CREATION ASSISTANCE DEVICE AND LOGICAL STRUCTURE MODEL CREATION ASSISTANCE METHOD (status: in force)

    Publication No.: US20090148049A1

    Publication date: 2009-06-11

    Application No.: US12328442

    Filing date: 2008-12-04

    IPC classification: G06K9/46

    CPC classification: G06F17/243

    Abstract: A method for assisting in the creation of a logical structure model. The model stores, from an image in which character strings associated respectively with a plurality of logical elements constituting a logical structure are described, the logical elements, the character strings associated with the logical elements, and the logical structure. Character strings in an input image, and the logical structure among those character strings, are extracted. A logical element is selected from among the plurality of logical elements according to the degrees of similarity between the extracted character strings and the character strings associated respectively with the plurality of logical elements stored in the logical structure model. A character string associated with the selected logical element is then extracted, together with a character string in the input image that is associated with the logical element based on the logical structure among the extracted character strings.
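The selection step in this abstract (picking a logical element by the similarity between extracted strings and the strings stored in the model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `MODEL` contents, the function names, and the use of Python's `difflib.SequenceMatcher` as the similarity measure. The abstract does not say which measure the patent uses, and the later step of following the logical structure to the associated data string is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical model: logical element -> strings that labelled it on earlier forms.
MODEL = {
    "customer_name": ["Name", "Customer name"],
    "order_date": ["Date", "Order date"],
    "total": ["Total", "Amount due"],
}

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in measure, not the patent's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def select_logical_element(extracted, model):
    """Return (element, model_string, extracted_string, score) for the best match."""
    best = None
    for text in extracted:
        for element, strings in model.items():
            for known in strings:
                score = similarity(text, known)
                if best is None or score > best[3]:
                    best = (element, known, text, score)
    return best

# Strings as they might come out of OCR on a new form, including a misread heading.
extracted = ["Custmer name", "2008-12-04", "Order No."]
element, known, found, score = select_logical_element(extracted, MODEL)
print(element, known, found, round(score, 2))
```

A fuzzy measure is used rather than exact matching so that a heading misread by OCR can still be tied to the stored element.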


    Character area extracting device, imaging device having character area extracting function, recording medium saving character area extracting programs, and character area extracting method
    4.
    Granted patent
    Character area extracting device, imaging device having character area extracting function, recording medium saving character area extracting programs, and character area extracting method (status: in force)

    Publication No.: US08447113B2

    Publication date: 2013-05-21

    Application No.: US13067133

    Filing date: 2011-05-11

    IPC classification: G06K9/18

    Abstract: A character area extracting device includes a reflective and non-reflective area separation unit separating image data into reflective and non-reflective areas, and binarizing the image data by changing a first threshold value when that value is inappropriate; a reflective area binarizing unit separating the reflective area into character and background areas, and binarizing it by changing a second threshold value when that value is inappropriate; a non-reflective area binarizing unit separating the non-reflective area into the character and background areas, and binarizing it by changing a third threshold value when that value is inappropriate; a reflective and non-reflective area separation evaluation unit; and a line extracting unit connecting the character areas of the reflective and non-reflective areas and extracting positional information of the connected character areas in the image data.
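The three-threshold idea in this abstract can be illustrated with a small sketch: one threshold splits the image into reflective and non-reflective areas, and each area then gets its own character/background threshold. The function name, the use of each area's mean grey level as its threshold, and the assumption that characters are darker than their local background are all illustrative choices, not taken from the patent. The evaluation units that adjust a threshold when it proves inappropriate, and the line extracting unit, are omitted.

```python
def mean(values):
    return sum(values) / len(values)

def binarize_by_area(image, reflect_threshold):
    """Split pixels into reflective/non-reflective areas, then mark characters.

    image: list of rows of grayscale values (0-255).
    Returns a mask of the same shape: 1 = character, 0 = background.
    """
    # First threshold: pixels at or above it belong to the reflective area.
    reflective = [[p >= reflect_threshold for p in row] for row in image]

    # Second and third thresholds: one per area, here simply the area's mean level.
    bright = [p for row, flags in zip(image, reflective)
              for p, f in zip(row, flags) if f]
    plain = [p for row, flags in zip(image, reflective)
             for p, f in zip(row, flags) if not f]
    t_reflective = mean(bright) if bright else 0
    t_plain = mean(plain) if plain else 0

    # Assumption: characters are darker than their local background.
    return [
        [int(p < (t_reflective if f else t_plain)) for p, f in zip(row, flags)]
        for row, flags in zip(image, reflective)
    ]

# Left half is normally lit; right half is washed out by glare.
image = [
    [40, 120, 120, 200, 250, 250],
    [120, 40, 120, 250, 200, 250],
]
mask = binarize_by_area(image, reflect_threshold=180)
print(mask)
```

In this toy image a single global threshold of 128 would classify the glare-lit strokes (grey level 200) as background, which is the failure that per-area thresholds avoid.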


    Character area extracting device, imaging device having character area extracting function, recording medium saving character area extracting programs, and character area extracting method
    5.
    Patent application
    Character area extracting device, imaging device having character area extracting function, recording medium saving character area extracting programs, and character area extracting method (status: in force)

    Publication No.: US20110255785A1

    Publication date: 2011-10-20

    Application No.: US13067133

    Filing date: 2011-05-11

    IPC classification: G06K9/18

    Abstract: A character area extracting device includes a reflective and non-reflective area separation unit separating image data into reflective and non-reflective areas, and binarizing the image data by changing a first threshold value when that value is inappropriate; a reflective area binarizing unit separating the reflective area into character and background areas, and binarizing it by changing a second threshold value when that value is inappropriate; a non-reflective area binarizing unit separating the non-reflective area into the character and background areas, and binarizing it by changing a third threshold value when that value is inappropriate; a reflective and non-reflective area separation evaluation unit; and a line extracting unit connecting the character areas of the reflective and non-reflective areas and extracting positional information of the connected character areas in the image data.


    LOGICAL STRUCTURE ANALYZING APPARATUS, METHOD, AND COMPUTER PRODUCT
    7.
    Patent application
    LOGICAL STRUCTURE ANALYZING APPARATUS, METHOD, AND COMPUTER PRODUCT (status: in force)

    Publication No.: US20090112797A1

    Publication date: 2009-04-30

    Application No.: US12180202

    Filing date: 2008-07-25

    IPC classification: G06F17/30

    CPC classification: G06K9/00469

    Abstract: A logical structure analyzing apparatus includes an extracting unit that extracts word candidates from a form; a first generating unit that classifies each of the word candidates into a group of heading candidates or a group of data candidates and generates, based on the positions of the word candidates on the form, first candidate sets each including one heading candidate and one data candidate identifiable by that heading candidate; and a second generating unit that combines the first candidate sets to generate second candidate sets, each including plural differing heading candidates and one data candidate. The apparatus also includes a removing unit that, based on the positions of the heading candidates and the data candidate in each second candidate set, removes from among the second candidate sets a determined set including a data item and the headings identifying the data item, and an output unit that outputs the determined set.
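The two generating steps in this abstract can be sketched as follows. The `Word` type, the fixed `HEADINGS` vocabulary used to classify candidates, and the positional rule (a heading identifies data to its right in the same row or below it in the same column) are assumptions made for illustration; the abstract only says that pairing is based on positions. The removing unit is not modelled.

```python
from collections import defaultdict
from typing import NamedTuple

class Word(NamedTuple):
    text: str
    col: int  # grid column on the form
    row: int  # grid row on the form

HEADINGS = {"Item", "Price", "Apple"}  # hypothetical heading vocabulary

def first_candidate_sets(words):
    """Pair each heading with data to its right (same row) or below it (same column)."""
    heads = [w for w in words if w.text in HEADINGS]
    data = [w for w in words if w.text not in HEADINGS]
    return [
        (h, d) for h in heads for d in data
        if (h.row == d.row and h.col < d.col) or (h.col == d.col and h.row < d.row)
    ]

def second_candidate_sets(pairs):
    """Combine pairs so that each data candidate lists its differing headings."""
    grouped = defaultdict(list)
    for h, d in pairs:
        grouped[d].append(h)
    return {d: hs for d, hs in grouped.items() if len(hs) > 1}

# A 2x2 table: column headings on top, a row heading on the left.
words = [
    Word("Item", 0, 0), Word("Price", 1, 0),
    Word("Apple", 0, 1), Word("120", 1, 1),
]
pairs = first_candidate_sets(words)
combined = second_candidate_sets(pairs)
for d, hs in combined.items():
    print(d.text, [h.text for h in hs])
```

Here the value "120" ends up identified by both the column heading "Price" and the row heading "Apple", which is the kind of multi-heading set the second generating unit is described as producing.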


    Image recognition apparatus, image recognition method, and storage medium recording image recognition program
    8.
    Granted patent
    Image recognition apparatus, image recognition method, and storage medium recording image recognition program (status: in force)

    Publication No.: US08503784B2

    Publication date: 2013-08-06

    Application No.: US12250302

    Filing date: 2008-10-13

    IPC classification: G06K9/00

    Abstract: An image recognition apparatus recognizes the correspondence between character strings and logical elements composing a logical structure in an image in which the character strings are described as the logical elements to recognize each logical element. The image recognition apparatus includes outputting means for outputting the recognized logical elements when the correspondence is recognized or re-recognized; first determining means for determining a certain logical element to be correct when input of a determination request to determine the logical element is received from a user; second determining means for determining the correctness of all the logical elements output before the logical element determined by the first determining means and is positioned according to confirmation by the user; and re-recognizing means for re-recognizing the correspondence between logical elements that have not been determined to be correct and the character strings on the basis of the determination content for each logical element.


    Recording medium for recording logical structure model creation assistance program, logical structure model creation assistance device and logical structure model creation assistance method
    10.
    Granted patent
    Recording medium for recording logical structure model creation assistance program, logical structure model creation assistance device and logical structure model creation assistance method (status: in force)

    Publication No.: US08249351B2

    Publication date: 2012-08-21

    Application No.: US12328442

    Filing date: 2008-12-04

    IPC classification: G06K9/00 G06F7/00 G06F17/00

    CPC classification: G06F17/243

    Abstract: A method for assisting in the creation of a logical structure model. The model stores, from an image in which character strings associated respectively with a plurality of logical elements constituting a logical structure are described, the logical elements, the character strings associated with the logical elements, and the logical structure. Character strings in an input image, and the logical structure among those character strings, are extracted. A logical element is selected from among the plurality of logical elements according to the degrees of similarity between the extracted character strings and the character strings associated respectively with the plurality of logical elements stored in the logical structure model. A character string associated with the selected logical element is then extracted, together with a character string in the input image that is associated with the logical element based on the logical structure among the extracted character strings.
