Logical structure analyzing apparatus, method, and computer product
    11.
    Type: Granted patent
    Status: In force

    Publication No.: US08010564B2

    Publication date: 2011-08-30

    Application No.: US12180202

    Filing date: 2008-07-25

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06K9/00469

    Abstract: A logical structure analyzing apparatus includes an extracting unit that extracts word candidates from a form, a first generating unit that classifies each of the word candidates into a group of heading candidates or a group of data candidates to generate, based on positions of the word candidates on the form, first candidate sets each including one heading candidate and one data candidate identifiable by the heading candidate, and a second generating unit that combines the first candidate sets to generate second candidate sets that each include plural heading candidates that differ and one data candidate. The apparatus also includes a removing unit that, based on positions of the heading candidates and the data candidate in each second candidate set, removes from among the second candidate sets a determined set including a data item and headings identifying the data item, and an output unit that outputs the determined set.
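
The two generating steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `Word` type, the grid coordinates, and the pairing rule (a heading identifies data to its right in the same row or below it in the same column) are assumptions made for illustration, and the position-based removing step is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    row: int          # hypothetical grid row of the word on the form
    col: int          # hypothetical grid column of the word on the form
    is_heading: bool  # result of the heading/data classification


def first_candidate_sets(words):
    """Pair one heading with one data word it could identify by position.

    Assumed rule: same row and to the left, or same column and above.
    """
    headings = [w for w in words if w.is_heading]
    data = [w for w in words if not w.is_heading]
    return [
        (h, d)
        for h in headings
        for d in data
        if (h.row == d.row and h.col < d.col) or (h.col == d.col and h.row < d.row)
    ]


def second_candidate_sets(pairs):
    """Combine first sets sharing a data word into {data: [headings...]}."""
    by_data = defaultdict(list)
    for heading, datum in pairs:
        by_data[datum].append(heading)
    # Keep only sets with plural, differing heading candidates.
    return {d: hs for d, hs in by_data.items() if len(hs) > 1}
```

For a form with column headings "Q1" and "Q2" and a row heading "Sales", each cell value ends up in a second candidate set with one column heading and the row heading.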

    IMAGE RECOGNITION APPARATUS, IMAGE RECOGNITION METHOD, AND STORAGE MEDIUM RECORDING IMAGE RECOGNITION PROGRAM
    13.
    Type: Patent application
    Status: In force

    Publication No.: US20090110282A1

    Publication date: 2009-04-30

    Application No.: US12250302

    Filing date: 2008-10-13

    IPC classification: G06K9/00

    Abstract: An image recognition apparatus recognizes the correspondence between character strings and logical elements composing a logical structure in an image in which the character strings are described as the logical elements to recognize each logical element. The image recognition apparatus includes outputting means for outputting the recognized logical elements when the correspondence is recognized or re-recognized; first determining means for determining a certain logical element to be correct when input of a determination request to determine the logical element is received from a user; second determining means for determining the correctness of all the logical elements output before the logical element determined by the first determining means and is positioned according to confirmation by the user; and re-recognizing means for re-recognizing the correspondence between logical elements that have not been determined to be correct and the character strings on the basis of the determination content for each logical element.

    Dictionary creating apparatus, recognizing apparatus, and recognizing method

    Publication No.: US08379983B2

    Publication date: 2013-02-19

    Application No.: US12385970

    Filing date: 2009-04-24

    IPC classification: G06K9/62

    CPC classification: G06K9/6255

    Abstract: A dictionary creating apparatus registers probability distributions each including an average vector and a covariance matrix, in a dictionary. The dictionary creating apparatus organizes plural distribution profiles of character categories having similar feature vectors into one typical distribution profile, and registers the typical distribution profile and the character categories to be organized, associated with each other, in the dictionary, without registering eigenvalues and eigenvectors of all character categories, associated with each other, in the dictionary.
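
The idea of sharing one typical distribution profile among similar character categories can be sketched as follows. This is a simplified illustration under stated assumptions, not the apparatus described above: per-dimension variances stand in for the covariance matrix (and its eigenvalues and eigenvectors), categories count as "similar" when their average vectors lie within a hypothetical Euclidean `radius`, and the typical profile is the plain average of the members' variances.

```python
import math


def build_dictionary(categories, radius=1.0):
    """categories: {label: (average_vector, variances)}.

    Returns (entries, profiles), where entries maps each label to
    (average_vector, profile_index) and profiles holds one typical
    variance vector per group of similar categories.
    """
    groups = []
    for label, (mean, var) in categories.items():
        for group in groups:
            # Assumed similarity test: distance to the group's first member.
            if math.dist(mean, group["anchor"]) <= radius:
                group["members"].append(label)
                group["var_sum"] = [s + v for s, v in zip(group["var_sum"], var)]
                break
        else:
            groups.append({"anchor": mean, "members": [label], "var_sum": list(var)})

    # One typical profile per group: the average of the members' variances.
    profiles = [[s / len(g["members"]) for s in g["var_sum"]] for g in groups]
    entries = {
        label: (categories[label][0], idx)
        for idx, g in enumerate(groups)
        for label in g["members"]
    }
    return entries, profiles
```

Each entry then stores only its own average vector plus an index into the shared profiles, so the dictionary holds fewer distribution profiles than character categories.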

    Character recognition method, character recognition device, and computer product
    15.
    Type: Patent application
    Status: In force

    Publication No.: US20080069447A1

    Publication date: 2008-03-20

    Application No.: US11654180

    Filing date: 2007-01-16

    IPC classification: G06K9/18

    CPC classification: G06K9/346 G06K2209/01

    Abstract: Upon receiving, for example, document data including a character string from outside, a character recognition device detects a line from a line-touching character-string image in which at least one character (such as a number, alphabetic letter, kana character, or Chinese character) touches (or overlaps) a line in the document data, tentatively removes the line, and estimates a character region. The character recognition device extracts a line-touching character image from the line-touching character-string image (original image) based on the estimated character region. The character recognition device creates a line-added reference character image by adding a quasi-line to a reference character image stored in advance.

    Character recognition processing method and apparatus
    16.
    Type: Granted patent
    Status: In force

    Publication No.: US08254689B2

    Publication date: 2012-08-28

    Application No.: US12652556

    Filing date: 2010-01-05

    IPC classification: G06K9/46

    CPC classification: G06K9/627 G06K2209/01

    Abstract: This method includes: extracting a feature vector for an input character from a reading result of the input character; calculating distances between the feature vector for the input character and vectors including average vectors stored in a system dictionary storing, for each character, the average vector and distribution information, and feature vectors stored in a user dictionary; extracting the top N character codes in an ascending order of the calculated distances; obtaining second distribution information for the character codes, which are included in the user dictionary and in the top N character codes; calculating, for each of the top N character codes, a second distance with the feature vector for the input character, by using, for the character codes, which are included in the user dictionary and in the top N character codes, the second distribution information; and identifying a character code whose second distance is shortest.
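
The two-stage flow in this abstract (coarse distances, top N codes, then a second distance) can be sketched in Python. The abstract does not say how the second distribution information is obtained, so the rule used here is an assumption: for a code that also appears in the user dictionary, the average vector is recomputed from the system average and the user's feature vectors, while the system variances are reused. The second distance is a variance-normalized squared distance, also chosen for illustration; all names are hypothetical.

```python
import math


def recognize(x, system_dict, user_dict, n=3):
    """x: feature vector; system_dict: {code: (average_vector, variances)};
    user_dict: list of (code, feature_vector) pairs registered by the user."""
    # Stage 1: Euclidean distance to system averages and user vectors.
    scored = [(math.dist(x, mean), code) for code, (mean, _) in system_dict.items()]
    scored += [(math.dist(x, vec), code) for code, vec in user_dict]
    top = []
    for _, code in sorted(scored):  # ascending order of distance
        if code not in top:
            top.append(code)
        if len(top) == n:
            break

    # Stage 2: second distance for each of the top N codes.
    best = None
    for code in top:
        mean, var = system_dict.get(code, (None, [1.0] * len(x)))
        samples = [vec for c, vec in user_dict if c == code]
        if samples:
            # Assumed "second distribution information": average re-estimated
            # from the user's vectors plus the system average, if any.
            pool = samples + ([mean] if mean is not None else [])
            mean = [sum(col) / len(pool) for col in zip(*pool)]
        d2 = sum((a - b) ** 2 / v for a, b, v in zip(x, mean, var))
        if best is None or d2 < best[0]:
            best = (d2, code)
    return best[1]
```

With an empty user dictionary the function reduces to a nearest-average classifier over the top N codes; registering a user vector for a code pulls that code's average toward the user's writing.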

    REPLAY CONTROL METHOD AND REPLAY APPARATUS
    18.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20120002944A1

    Publication date: 2012-01-05

    Application No.: US13231623

    Filing date: 2011-09-13

    IPC classification: H04N9/80

    Abstract: A replay control method of controlling replay means for replaying video content executed by a computer, the method includes: accepting one or more keywords; retrieving, from pieces of correspondence information each containing fraction part information specifying a piece of video content and a fraction part in the piece of video content, and a word string expressed in the fraction part, each piece of correspondence information whose word string contains at least one of the accepted one or more keywords; and making the replay means replay the fraction part specified by each retrieved piece of correspondence information.
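
The retrieval step in this abstract amounts to a keyword filter over records that tie a word string to a portion of a video. A minimal Python sketch follows; the `Correspondence` fields, the case-insensitive substring match, and the `replay` callback signature are assumptions for illustration, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Correspondence:
    content_id: str  # which piece of video content
    start_s: float   # start of the fraction part, in seconds (assumed unit)
    end_s: float     # end of the fraction part, in seconds (assumed unit)
    words: str       # word string expressed in the fraction part


def retrieve(records, keywords):
    """Return records whose word string contains at least one keyword."""
    lowered = [k.lower() for k in keywords]
    return [r for r in records if any(k in r.words.lower() for k in lowered)]


def replay_matches(records, keywords, replay):
    """Ask the replay means (a callback here) to play each matching part."""
    for r in retrieve(records, keywords):
        replay(r.content_id, r.start_s, r.end_s)
```

Passing the player as a callback keeps the sketch independent of any particular video API.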

    CHARACTER RECOGNITION PROCESSING METHOD AND APPARATUS
    19.
    Type: Patent application
    Status: In force

    Publication No.: US20100104192A1

    Publication date: 2010-04-29

    Application No.: US12652556

    Filing date: 2010-01-05

    IPC classification: G06K9/46

    CPC classification: G06K9/627 G06K2209/01

    Abstract: This method includes: extracting a feature vector for an input character from a reading result of the input character; calculating distances between the feature vector for the input character and vectors including average vectors stored in a system dictionary storing, for each character, the average vector and distribution information, and feature vectors stored in a user dictionary; extracting the top N character codes in an ascending order of the calculated distances; obtaining second distribution information for the character codes, which are included in the user dictionary and in the top N character codes; calculating, for each of the top N character codes, a second distance with the feature vector for the input character, by using, for the character codes, which are included in the user dictionary and in the top N character codes, the second distribution information; and identifying a character code whose second distance is shortest.

    Dictionary creating apparatus, recognizing apparatus, and recognizing method
    20.
    Type: Patent application
    Status: In force

    Publication No.: US20090285490A1

    Publication date: 2009-11-19

    Application No.: US12385970

    Filing date: 2009-04-24

    IPC classification: G06K9/46

    CPC classification: G06K9/6255

    Abstract: A dictionary creating apparatus registers probability distributions each including an average vector and a covariance matrix, in a dictionary. The dictionary creating apparatus organizes plural distribution profiles of character categories having similar feature vectors into one typical distribution profile, and registers the typical distribution profile and the character categories to be organized, associated with each other, in the dictionary, without registering eigenvalues and eigenvectors of all character categories, associated with each other, in the dictionary.
