METHOD AND APPARATUS FOR PROCESSING AN IMAGE COMPRISING CHARACTERS
    71.
    Patent application (status: in force)

    Publication No.: US20120063687A1

    Publication date: 2012-03-15

    Application No.: US13156688

    Filing date: 2011-06-09

    IPC classification: G06K9/46

    CPC classification: G06K9/6814 G06K9/6224

    Abstract: A method and apparatus for processing an image that includes a character are disclosed. The method may include: searching a set of characters for the one or more characters whose shapes are most similar to a given character in the set (hereinafter the "first character"), the characters found forming the similar character list of the first character; searching the set for the one or more characters most similar in shape to each character in the similar character list of the first character, to form a similar character list for each of those characters; and selecting from the similar character lists one or more characters that have a high mutual similarity, as a character cluster.
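
    The selection step can be illustrated with a minimal sketch. It assumes that shape similarity is already available as a symmetric score table and that "high mutual similarity" means two characters appear in each other's similar character lists; the abstract fixes neither point. The names `build_similarity`, `similar_list`, and `character_cluster` are hypothetical.

```python
from itertools import combinations


def build_similarity(chars, scores, default=0.1):
    """Build a symmetric similarity table from pairwise scores (illustrative data only)."""
    sim = {a: {} for a in chars}
    for a, b in combinations(chars, 2):
        s = scores.get((a, b), scores.get((b, a), default))
        sim[a][b] = s
        sim[b][a] = s
    return sim


def similar_list(char, similarity, k):
    """Return the k characters whose shape is most similar to `char`."""
    others = [c for c in similarity if c != char]
    others.sort(key=lambda c: similarity[char][c], reverse=True)
    return others[:k]


def character_cluster(first, similarity, k=2):
    """Group the first character with the characters that list it back.

    Assumption: "high mutual similarity" is read as mutual membership
    in each other's similar character lists.
    """
    first_list = similar_list(first, similarity, k)
    cluster = {first}
    for c in first_list:
        # Similar character list of each character in the first character's list.
        if first in similar_list(c, similarity, k):
            cluster.add(c)
    return cluster


chars = ["O", "0", "Q", "I", "l", "1"]
scores = {("O", "0"): 0.9, ("O", "Q"): 0.8, ("0", "Q"): 0.7,
          ("I", "l"): 0.9, ("I", "1"): 0.8, ("l", "1"): 0.85}
sim = build_similarity(chars, scores)
print(sorted(character_cluster("O", sim)))
```

    With these made-up scores, the round glyphs end up in one cluster and the stroke-like glyphs in another. A real system would derive the scores from image features rather than a hand-written table.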

    Document image processing method and apparatus
    72.
    Patent application (status: in force)

    Publication No.: US20120045129A1

    Publication date: 2012-02-23

    Application No.: US13067247

    Filing date: 2011-05-18

    IPC classification: G06K9/34

    Abstract: A method for processing a document image includes: performing horizontal and vertical text line extraction on the document image; providing an overlapping matrix, the value of each element of the overlapping matrix indicating an overlapping relation between a horizontal and a vertical text line; merging the overlapping matrix in the vertical and horizontal directions; determining one or more text overlapping regions in the document image, based on the values of the elements of the merged overlapping matrix; counting the total number of strokes or pixel points in the horizontal and vertical text lines, respectively, within one of the one or more text overlapping regions; and determining that the orientation of the text overlapping region is horizontal if the total number of strokes or pixel points in the horizontal text lines is larger than that in the vertical text lines, and otherwise determining that the orientation is vertical.
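
    The final decision rule can be sketched as follows. The sketch covers only the counting and comparison step, using pixel points rather than strokes; the representation of text lines and regions as half-open boxes `(top, left, bottom, right)` and the names `count_ink` and `region_orientation` are assumptions made for illustration.

```python
def count_ink(image, lines, region):
    """Count foreground pixels of the given text lines that lie inside `region`.

    `image` is a list of rows of 0/1 values; boxes are half-open
    (top, left, bottom, right) tuples. Both conventions are assumptions.
    """
    r_top, r_left, r_bottom, r_right = region
    total = 0
    for top, left, bottom, right in lines:
        for y in range(max(top, r_top), min(bottom, r_bottom)):
            for x in range(max(left, r_left), min(right, r_right)):
                total += image[y][x]
    return total


def region_orientation(image, horizontal_lines, vertical_lines, region):
    """Horizontal wins only when it holds strictly more pixels, as in the abstract."""
    h = count_ink(image, horizontal_lines, region)
    v = count_ink(image, vertical_lines, region)
    return "horizontal" if h > v else "vertical"


image = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
horizontal_lines = [(0, 0, 1, 4), (2, 0, 3, 4)]  # candidate rows of text
vertical_lines = [(0, 0, 4, 1), (0, 2, 4, 3)]    # candidate columns of text
print(region_orientation(image, horizontal_lines, vertical_lines, (0, 0, 4, 4)))
```

    In this toy image the horizontal candidates cover full rows of ink while the vertical candidates mostly cover gaps, so the region is classified as horizontal.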

    Degraded character image generation method and apparatus
    73.
    Granted patent (status: lapsed)

    Publication No.: US07480409B2

    Publication date: 2009-01-20

    Application No.: US11200202

    Filing date: 2005-08-10

    IPC classification: G06K9/34 G06K9/32 G06K9/62

    Abstract: This invention presents a method and apparatus for automatically generating degraded character images at various levels of degradation. The method comprises rendering the character image on a scene plane; translating and rotating the scene plane according to various parameters; determining a projection region of the character image on an image plane according to various parameters; generating a pixel region mask; and generating a final degraded image by super-sampling. Degraded character images are thus generated under various degradation conditions. The generated synthetic characters can be used for performance evaluation and training data augmentation in optical character recognition (OCR).

    Device processing a table image, a memory medium storing a processing program, and a table management processing method
    74.
    Granted patent (status: lapsed)

    Publication No.: US07133558B1

    Publication date: 2006-11-07

    Application No.: US09410626

    Filing date: 1999-10-01

    IPC classification: G06K9/48 G06K9/34

    CPC classification: G06K9/00449

    Abstract: A table image processing device processes a table image, and a memory medium stores a processing program. The table image processing device precisely processes a table image containing a round corner and includes: a line extracting device that extracts longitudinal and lateral lines from an input image; a round corner region finding device that extracts an oblique line starting from an end of a line found by the line extracting device and finds a potential match of a round corner region; a cell extracting device that extracts a cell containing the potential match found by the round corner region finding device; and a round corner deciding device that decides the round corner from the cells extracted by the cell extracting device.

    Precise grayscale character segmentation apparatus and method
    75.
    Patent application (status: in force)

    Publication No.: US20060245650A1

    Publication date: 2006-11-02

    Application No.: US11356449

    Filing date: 2006-02-17

    IPC classification: G06K9/34

    Abstract: A precise grayscale character segmentation apparatus and method are disclosed. The precise grayscale character segmentation apparatus comprises an adjustment and segmentation unit for adjusting and segmenting an input low-resolution text line image that has undergone coarse segmentation, so as to generate an adjusted character image; a character image binarization unit for generating a binary character image from the character image input to it; a noise removal unit for removing noise information from the binary character image generated by the binarization unit; and a final character image segmentation unit for generating a precisely segmented character image from the binary character image from which noise has been removed.

    Grayscale character dictionary generation apparatus

    Publication No.: US20060171589A1

    Publication date: 2006-08-03

    Application No.: US11329407

    Filing date: 2006-01-11

    IPC classification: G06K9/34 G06K9/18

    Abstract: A grayscale character dictionary generation apparatus, comprising a first synthetic grayscale degraded character image generation unit for generating first synthetic grayscale degraded character images from binary character images input to it; a clustering unit for dividing each category of the first synthetic grayscale degraded character images into a plurality of clusters; a template generation unit for generating a template for each of the clusters; a transformation matrix generation unit for generating a transformation matrix for each of the templates; and a second synthetic grayscale degraded character dictionary generation unit for obtaining the character features of every grayscale degraded character of each of the clusters using the transformation matrix, and for constructing an eigenspace for each category of the synthetic grayscale degraded characters, which constitutes the second synthetic grayscale character dictionary.

    Degraded dictionary generation method and apparatus

    Publication No.: US20060056696A1

    Publication date: 2006-03-16

    Application No.: US11200194

    Filing date: 2005-08-10

    IPC classification: G06K9/18

    CPC classification: G06K9/6255

    Abstract: This invention presents a method and apparatus for automatically generating a degraded dictionary. A degraded pattern generating means generates a plurality of degraded patterns from an original character image, based on a plurality of degradation parameters. A degraded dictionary generating means generates a plurality of degraded dictionaries corresponding to the plurality of degradation parameters, based on the plurality of degraded patterns. Finally, a dictionary matching means selects, as the final degraded dictionary, the one of the plurality of dictionaries that best matches the degradation level of a test sample set. Because the degraded patterns used to build the dictionaries are generated by simple scaling and blurring, the invention is simple to implement. The method and apparatus can be used not only in character recognition but also in other fields such as speech recognition and face recognition.
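
    The generate-then-match idea can be sketched under several stated assumptions. Scaling is modelled as block averaging followed by pixel replication, and blurring as a 3x3 mean filter. Each "dictionary" is simply the list of degraded images, and matching uses the summed squared distance to the nearest dictionary entry. None of these choices comes from the abstract, and all function names are hypothetical.

```python
def box_blur(image):
    """One pass of a 3x3 mean filter; border pixels average the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def degrade(image, factor, blur_passes):
    """Shrink by block averaging, enlarge by replication, then blur.

    Assumes the image height and width are multiples of `factor`.
    """
    h, w = len(image), len(image[0])
    small = [[sum(image[y + dy][x + dx]
                  for dy in range(factor) for dx in range(factor)) / factor ** 2
              for x in range(0, w, factor)]
             for y in range(0, h, factor)]
    result = [[small[y // factor][x // factor] for x in range(w)] for y in range(h)]
    for _ in range(blur_passes):
        result = box_blur(result)
    return result


def distance(a, b):
    """Summed squared pixel difference between two equally sized images."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def select_dictionary(originals, test_samples, parameter_sets):
    """Return the (factor, blur_passes) pair whose dictionary best fits the samples."""
    def cost(params):
        dictionary = [degrade(img, *params) for img in originals]
        return sum(min(distance(s, d) for d in dictionary) for s in test_samples)
    return min(parameter_sets, key=cost)


original = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
sample = degrade(original, 2, 1)  # stand-in for a real degraded test sample
print(select_dictionary([original], [sample], [(1, 0), (1, 1), (2, 0), (2, 1)]))
```

    Because the test sample here is produced by the same degradation function, the matching step recovers the parameters that generated it. With real samples, the selected parameters would only approximate the actual degradation.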

    Image processing apparatus
    78.
    Granted patent

    Publication No.: US6141435A

    Publication date: 2000-10-31

    Application No.: US681485

    Filing date: 1996-07-23

    Abstract: An image processing apparatus for extracting specified objects has a background image extract unit for extracting a background; a first average background extract unit which extracts an image that includes the background and a plurality of stationary and moving objects, each having a speed not higher than a predetermined first speed; a second average background extract unit which extracts an image that includes the background and the stationary and moving objects, each having a speed not higher than a predetermined second speed; a first difference-calculation processing unit which calculates a difference between an output from the background image extract unit and an output from the first average background extract unit as a first speed image; a second difference-calculation processing unit which calculates a difference between the outputs from the first and second average background extract units as a second speed image; and a third difference-calculation processing unit which calculates a difference between an original image and either one of the outputs from the first and second average background extract units as a third speed image.
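
    The three difference calculations can be sketched as below. The abstract does not say how the average background images are obtained; this sketch substitutes exponential running averages with two different update rates, which is purely an assumption. Images are flattened lists of gray values, the third difference uses the second average, and the names `update_average`, `difference`, and `speed_images` are hypothetical.

```python
def update_average(average, frame, rate):
    """Blend a new frame into a running average; a smaller rate adapts more slowly."""
    return [a + rate * (f - a) for a, f in zip(average, frame)]


def difference(image_a, image_b):
    """Per-pixel absolute difference of two equally sized images."""
    return [abs(a - b) for a, b in zip(image_a, image_b)]


def speed_images(background, frames, slow_rate=0.25, fast_rate=0.5):
    """Return the first, second, and third speed images after the last frame.

    Assumption: the slow average stands in for the first average background
    and the fast average for the second.
    """
    first_average = list(background)
    second_average = list(background)
    for frame in frames:
        first_average = update_average(first_average, frame, slow_rate)
        second_average = update_average(second_average, frame, fast_rate)
    original = frames[-1]
    return {
        "first": difference(background, first_average),
        "second": difference(first_average, second_average),
        "third": difference(original, second_average),
    }


print(speed_images([0, 0], [[10, 0]]))
```

    An object that has just appeared contributes mostly to the third image, while one that lingers is gradually absorbed into the averages and shifts toward the first image.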

    Title extracting apparatus for extracting title from document image and method thereof
    79.
    Granted patent (status: lapsed)

    Publication No.: US06035061A

    Publication date: 2000-03-07

    Application No.: US694503

    Filing date: 1996-08-07

    IPC classification: G06K9/20 G06T11/60 G06K9/34

    CPC classification: G06K9/00469

    Abstract: A title extracting apparatus scans black pixels in a document image and extracts the rectangular regions that circumscribe connected regions of black pixels as character rectangles. The apparatus then unifies a plurality of adjoining character rectangles and extracts the rectangular regions that circumscribe them as character string rectangles. Thereafter, the apparatus calculates points representing the likelihood of being a title, according to attributes of each character string rectangle such as an underline attribute, a frame attribute, and a ruled line attribute, the positions of the character string rectangles in the document image, and their mutual positional relations, and extracts the character string rectangle with the most points as a title rectangle. In the case of a tabulated document, the apparatus can extract a title rectangle from inside the table. Characters extracted from the title rectangle by a character recognizing process are used as keywords of the document image.
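
    The first two stages, character rectangles and character string rectangles, can be sketched as follows; the scoring stage is not covered. The sketch assumes 8-connectivity, inclusive box coordinates `(top, left, bottom, right)`, and a simple horizontal-gap rule for deciding that rectangles adjoin. All of these, along with the function names, are illustrative choices rather than details from the abstract.

```python
from collections import deque


def character_rectangles(image):
    """Bounding boxes (top, left, bottom, right), inclusive, of connected black regions."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill over 8-connected neighbours
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def string_rectangles(boxes, max_gap=1):
    """Unify rectangles on overlapping rows separated by at most `max_gap` empty columns."""
    merged = []
    for box in sorted(boxes, key=lambda b: b[1]):
        for i, m in enumerate(merged):
            rows_overlap = box[0] <= m[2] and m[0] <= box[2]
            close = box[1] - m[3] - 1 <= max_gap
            if rows_overlap and close:
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged


image = [
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
print(string_rectangles(character_rectangles(image)))
```

    The two neighbouring glyphs in the top rows merge into one string rectangle, while the isolated pixel at the bottom right stays separate.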

    Method and apparatus for assigning temporary and true labels to digital image
    80.
    Granted patent (status: lapsed)

    Publication No.: US5937091A

    Publication date: 1999-08-10

    Application No.: US921318

    Filing date: 1997-08-29

    Abstract: A method and apparatus assign a temporary label to each connected area in an image by scanning the image with a window that is two pixels high and a plurality of pixels wide. At each window location, the set of pixel values contained in the window is obtained, and the one of a number of predetermined temporary label assignment rules that corresponds to the obtained set of pixel values is selected. A temporary label is assigned to each pixel contained in the window, based on the selected rule and on the temporary labels of pixels in the second group in the window at that location. In addition, the temporary labels are converted to true labels by scanning only the image pixels within at least one circumscribing area, where each circumscribing area is predetermined so that the at least one circumscribing area contains all pixels that do not belong to a background area of the image.
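
    The split between temporary and true labels can be illustrated with the classical two-pass labelling algorithm. This is a swapped-in textbook technique, not the windowed rule-table scheme of the abstract: it examines one pixel at a time with 4-connectivity, records label equivalences in a union-find table, and rescans the whole image in the second pass rather than only a circumscribing area. The function name is hypothetical.

```python
def label_connected_areas(image):
    """Two-pass labelling: temporary labels first, then consecutive true labels."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # parent[i] links temporary label i toward its representative

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the chains short
            i = parent[i]
        return i

    # Pass 1: assign temporary labels and record which labels are equivalent.
    for y in range(height):
        for x in range(width):
            if image[y][x] == 0:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new temporary label
                labels[y][x] = len(parent) - 1
            elif up and left:
                a, b = find(up), find(left)
                parent[max(a, b)] = min(a, b)  # merge the two label classes
                labels[y][x] = min(a, b)
            else:
                labels[y][x] = up or left

    # Pass 2: replace each temporary label with a true label, numbered from 1.
    true_label = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = true_label.setdefault(root, len(true_label) + 1)
    return labels


image = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
for row in label_connected_areas(image):
    print(row)
```

    The U-shaped area first receives two temporary labels, one per arm, which are found to be equivalent at the bottom-right pixel and collapse into a single true label in the second pass.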
