Method for segmentation-based recognizing handwritten touching numeral strings
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US06920246B2

    Publication date: 2005-07-19

    Application No.: US10098457

    Filing date: 2002-03-18

    IPC classification: G06K9/46 G06K9/34

    Abstract: Disclosed is a method of segmenting touching numerals contained in handwritten numeral strings and recognizing the numeral strings using feature information and recognition results provided by the inherent structure of digits. The method comprises the steps of: receiving a handwritten numeral string extracted from a pattern document; smoothing a curved numeral image of the handwritten numeral string, and searching for connected components in the numeral image; determining whether or not the numeral string is a touching numeral string; if it is determined that the numeral string is a touching numeral string, searching for a contour of the touching numeral string image; searching for candidate segmentation points in the contour, and segmenting sub-images; computing a segmentation confidence value for each segmented sub-image using a segmentation error function, and selecting the sub-image with the highest segmentation confidence value as a segmented numeral image in the touching numeral string image; if it is determined that the numeral string is not a touching numeral string, extracting a feature to recognize the segmented numeral image; segmenting off the numeral image selected from the touching numeral string with the highest segmentation confidence value; and obtaining the remaining numeral string image.
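
The flow described in the abstract (find connected components, decide whether a component is a touching string, score candidate cuts, keep the best one) can be illustrated with a minimal sketch. The abstract does not disclose the touching test or the segmentation error function, so `looks_touching` (an aspect-ratio heuristic) and the ink-count confidence below are hypothetical stand-ins chosen only for illustration, and candidate cuts are plain column indices rather than contour points.

```python
from collections import deque

def connected_components(img):
    """Group 8-connected foreground (1) pixels; return one pixel list per component."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                comps.append(pixels)
    return comps

def looks_touching(pixels, max_aspect=1.0):
    """Hypothetical test: a component wider than it is tall may hold several digits."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return width / height > max_aspect

def column_ink(img, x):
    """Number of foreground pixels a vertical cut at column x would cross."""
    return sum(row[x] for row in img)

# Two blobs joined by a one-pixel bridge in column 2.
img = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
components = connected_components(img)
touching = looks_touching(components[0])

# Assumed confidence: the fewer stroke pixels a cut crosses, the higher the score.
candidates = [1, 2, 3]
best = max(candidates, key=lambda x: -column_ink(img, x))
```

Here the single component is flagged as touching and the cut at column 2 wins because it crosses only the bridge pixel. A fuller pipeline would recognize the sub-image left of the cut and repeat on the remaining string image.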


    Method for analyzing structure of a treatise type of document image
    2.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06728403B1

    Publication date: 2004-04-27

    Application No.: US09496630

    Filing date: 2000-02-02

    IPC classification: G06K9/34

    CPC classification: G06K9/00469

    Abstract: A method is provided for analyzing the structure of a treatise type of document image in order to detect the title, author and abstract regions and recognize the content of each region. To analyze the structure of a treatise type of document, the document image is first divided into a number of regions, and the divided regions are classified into text regions and non-text regions according to their attributes. Next, the candidate regions representing an abstract and an introduction are selected, word regions are extracted from the candidate regions, and the abstract content portion is determined. The title and the author are then separated using the basic form and the type definition representing the layout of each journal. Finally, the content of the separated regions is recognized to generate a table of contents.
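
As a rough illustration of the region steps above, the sketch below classifies regions as text or non-text and then takes the abstract content portion to be the text regions from an "Abstract" heading up to an "Introduction" heading. The abstract does not say which region attributes are used, so the `Region` fields, the thresholds in `is_text`, and the heading-matching rule are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Region:
    top: int              # y coordinate of the region's top edge, in pixels
    height: int           # bounding-box height, in pixels
    ink_ratio: float      # fraction of foreground pixels in the bounding box
    first_word: str = ""  # recognized leading word, if any (hypothetical field)

def is_text(region, max_height=60, ink_range=(0.05, 0.5)):
    """Assumed rule: text blocks are short and moderately dense."""
    lo, hi = ink_range
    return region.height <= max_height and lo <= region.ink_ratio <= hi

def abstract_portion(regions):
    """Return text regions from the 'Abstract' heading up to 'Introduction'."""
    text = sorted((r for r in regions if is_text(r)), key=lambda r: r.top)
    start = end = None
    for i, r in enumerate(text):
        word = r.first_word.strip(" .:").lower()
        if word == "abstract" and start is None:
            start = i
        elif word == "introduction" and start is not None:
            end = i
            break
    if start is None:
        return []
    # end=None keeps everything after the heading when no introduction is found.
    return text[start:end]

regions = [
    Region(top=10, height=40, ink_ratio=0.2, first_word="Method"),
    Region(top=300, height=200, ink_ratio=0.7),  # e.g. a figure: non-text
    Region(top=100, height=30, ink_ratio=0.2, first_word="Abstract"),
    Region(top=140, height=50, ink_ratio=0.2, first_word="This"),
    Region(top=520, height=30, ink_ratio=0.2, first_word="Introduction"),
]
portion = abstract_portion(regions)
```

With these sample regions, `portion` holds the heading region and the paragraph that follows it. Regions above the heading would then be candidates for the title and author, which the patent separates using journal-specific layout definitions that are not modeled here.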


    Method for recognizing multi-language printed documents using strokes and non-strokes of characters
    3.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06665437B1

    Publication date: 2003-12-16

    Application No.: US09484533

    Filing date: 2000-01-18

    IPC classification: G06K9/46

    Abstract: Disclosed is a method for extracting character features for recognizing multi-language printed documents, the method comprising the steps of: a) normalizing characters to a fixed size; b) converting the size-fixed characters into mesh-type characters; c) extracting stroke features of each of the mesh-type characters; d) extracting non-stroke features of each of the mesh-type characters; and e) extracting the character features using the stroke features and the non-stroke features. By extracting the character features from the strokes and non-strokes in each mesh block, the invention provides a high recognition rate irrespective of the size and modification of the characters.
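
Steps a) to e) can be sketched as follows. The abstract does not define the stroke and non-stroke features, so this example assumes a stroke feature equal to the ink density of each mesh block and a non-stroke feature equal to the longest horizontal background run in the block; both definitions, and the function names, are illustrative assumptions rather than the patented features.

```python
def normalize(img, size=4):
    """Step a: nearest-neighbour resample of a binary image to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def longest_zero_run(values):
    """Length of the longest run of background (0) pixels in a sequence."""
    best = run = 0
    for v in values:
        run = run + 1 if v == 0 else 0
        best = max(best, run)
    return best

def mesh_features(img, block=2):
    """Steps b to e: split into block x block meshes and collect both feature sets."""
    size = len(img)
    stroke, non_stroke = [], []
    for r0 in range(0, size, block):
        for c0 in range(0, size, block):
            rows = [img[r][c0:c0 + block] for r in range(r0, r0 + block)]
            ink = sum(sum(row) for row in rows)
            stroke.append(ink / block ** 2)  # assumed stroke feature
            non_stroke.append(max(longest_zero_run(row) for row in rows) / block)
    return stroke + non_stroke  # step e: combined feature vector

# The same vertical bar drawn at two different sizes.
small = [[1, 0] for _ in range(4)]
large = [[1, 1, 0, 0] for _ in range(8)]
features_small = mesh_features(normalize(small))
features_large = mesh_features(normalize(large))
```

Both glyphs normalize to the same 4 x 4 image, so their feature vectors are identical, which is the size independence the abstract claims for the mesh representation.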
