METHOD FOR LINE AND WORD SEGMENTATION FOR HANDWRITTEN TEXT IMAGES

    Publication number: US20180089525A1

    Publication date: 2018-03-29

    Application number: US15279979

    Filing date: 2016-09-29

    Inventor: Duanduan Yang

    CPC classification number: G06K9/346 G06K9/00859 G06K9/344 G06K9/72 G06K2209/01

    Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontal neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.
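
The candidate-position step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract speaks of connected regions, whereas this sketch substitutes a simpler stand-in (runs of ink-bearing columns after horizontal down-sampling). The names `downsample_horizontally` and `candidate_cut_positions`, the binary list-of-lists image format, and the reading of `ratio` as "source columns collapsed into one" are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed format)


def downsample_horizontally(image: Image, ratio: int) -> Image:
    """Collapse every `ratio` columns into one; a block is ink if any pixel in it is ink."""
    width = len(image[0])
    return [
        [max(row[x:x + ratio]) for x in range(0, width, ratio)]
        for row in image
    ]


def candidate_cut_positions(line_image: Image, ratio: int) -> List[int]:
    """Return x positions (original coordinates) at the centre of each gap between ink runs.

    Simplification: ink runs are found on the column projection of the
    down-sampled image, not by true 2-D connected-region labelling.
    """
    small = downsample_horizontally(line_image, ratio)
    # A down-sampled column holds ink if any row has ink in it.
    has_ink = [any(row[x] for row in small) for x in range(len(small[0]))]

    cuts: List[int] = []
    gap_start = None   # first blank column of the current gap, if any
    seen_ink = False   # ignore the blank margin before the first ink run
    for x, ink in enumerate(has_ink):
        if ink:
            if seen_ink and gap_start is not None:
                # Gap lies between two ink runs: cut at its centre.
                cuts.append(int((gap_start + x) / 2 * ratio))
            gap_start = None
            seen_ink = True
        elif gap_start is None:
            gap_start = x
    return cuts
```

In this sketch, narrow gaps (such as those between letters) tend to be absorbed by the down-sampling, while wider gaps survive as candidate positions; how aggressively to down-sample is exactly what the choice of ratio controls.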

    METHOD FOR LINE AND WORD SEGMENTATION FOR HANDWRITTEN TEXT IMAGES

    Publication number: US20180330181A1

    Publication date: 2018-11-15

    Application number: US16043010

    Filing date: 2018-07-23

    Inventor: Duanduan Yang

    Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontal neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.
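
The path search described in the abstract (every way of cutting the line at some or all candidate positions, scored by the average word recognition score) might be sketched as follows. The function name `best_word_segmentation` and the `score_word` callback are hypothetical; the abstract does not specify the recognizer, so it is passed in as a parameter here.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_x, end_x) of one path segment


def best_word_segmentation(
    line_width: int,
    cut_positions: Sequence[int],
    score_word: Callable[[Segment], float],
) -> Tuple[List[Segment], float]:
    """Try every subset of the (sorted) cut positions and keep the best-scoring path.

    `score_word` is an assumed stand-in for the word recognizer: it returns a
    recognition score for the image region covered by one segment.
    """
    best_path: List[Segment] = []
    best_score = float("-inf")
    for k in range(len(cut_positions) + 1):
        for chosen in combinations(cut_positions, k):
            # Segment boundaries: line start, the chosen cuts, line end.
            bounds = [0, *chosen, line_width]
            path = list(zip(bounds[:-1], bounds[1:]))
            average = sum(score_word(seg) for seg in path) / len(path)
            if average > best_score:
                best_path, best_score = path, average
    return best_path, best_score
```

Enumerating all subsets means 2^n paths for n candidate positions, so a sketch like this is only practical when each line has a small number of candidates.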

    Method for line and word segmentation for handwritten text images

    Publication number: US10062001B2

    Publication date: 2018-08-28

    Application number: US15279979

    Filing date: 2016-09-29

    Inventor: Duanduan Yang

    Abstract: A method for segmenting an image containing handwritten text into line segments and word segments. The image is horizontally down sampled at a first ratio. Connected regions in the down-sampled image are detected; horizontal neighboring ones are merged to form lines, to segment the original image into line images. Each line image is horizontally down sampled at a second ratio which is smaller than the first ratio. Connected regions in the down-sampled line image are detected to obtain potential word segmentation positions. A path is a way of dividing the line at some or all of the potential word segmentation positions into multiple path segments; for each of all possible paths, word recognition is applied to each path segment to calculate a word recognition score, and an average word recognition score for the path is calculated; the path with the highest score gives the final word segmentation.

    Data normalization for handwriting recognition

    Publication number: US10025976B1

    Publication date: 2018-07-17

    Application number: US15393056

    Filing date: 2016-12-28

    Abstract: Disclosed herein is a method of optimizing data normalization by selecting the best height normalization setting from training RNN (Recurrent Neural Network) with one or more datasets comprising multiple sample images of handwriting data, which comprises estimating a few top place ratios for normalization by minimizing a cost function for any given sample image in the training dataset, and further, determining the best ratio from the top place ratios by validating the recognition results of sample images with each top place ratio.
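
One possible reading of the two-stage selection in this abstract is sketched below: each sample image "votes" for the ratio that minimizes its cost, the most-voted ratios form the shortlist, and the final choice is the shortlisted ratio with the best validation result. The voting scheme, the function names, and the `cost` and `validation_accuracy` callbacks are assumptions; the abstract does not define the cost function, and in practice the validation step would involve training and evaluating the RNN for each shortlisted ratio.

```python
from collections import Counter
from typing import Callable, List, Sequence, TypeVar

Sample = TypeVar("Sample")


def shortlist_ratios(
    candidate_ratios: Sequence[float],
    samples: Sequence[Sample],
    cost: Callable[[Sample, float], float],
    top_k: int = 3,
) -> List[float]:
    """Stage 1: each sample votes for its cost-minimizing ratio; keep the top_k most voted."""
    votes = Counter(
        min(candidate_ratios, key=lambda ratio: cost(sample, ratio))
        for sample in samples
    )
    return [ratio for ratio, _ in votes.most_common(top_k)]


def select_ratio(
    shortlist: Sequence[float],
    validation_accuracy: Callable[[float], float],
) -> float:
    """Stage 2: pick the shortlisted ratio whose recognition results validate best."""
    return max(shortlist, key=validation_accuracy)
```

The split keeps the expensive step (validating recognition results) limited to a handful of ratios rather than the full candidate set.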

    Path score calculating method for intelligent character recognition

    Publication number: US09977976B2

    Publication date: 2018-05-22

    Application number: US15196368

    Filing date: 2016-06-29

    Inventor: Duanduan Yang

    Abstract: Disclosed herein is a method that improves the performance of handwriting recognition by calculating path scores so as to identify the path with the highest score as the basis for interpreting handwritten characters. Specifically, the method comprises the following steps: detecting connected regions in an input image comprising handwritten characters; determining a plurality of segmentation positions of the input image; obtaining a plurality of recognition results for each segment of each path in the input image, wherein each recognition result represents a character candidate for the segment and each path comprises one or more segments; obtaining a plurality of scores corresponding to the recognition results; calculating scores for each path in the input image based on segment lengths and the scores corresponding to the recognition results; and using the path with the highest score to interpret the handwritten characters in the input image.
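
The abstract states that path scores depend on segment lengths and recognition scores but does not give the formula. As a rough illustration only, the sketch below assumes a length-weighted average, so that a short, confidently recognized fragment cannot dominate a path. `SegmentResult`, `path_score`, and `interpret` are hypothetical names, and the weighting is an assumption rather than the patented calculation.

```python
from typing import NamedTuple, Sequence


class SegmentResult(NamedTuple):
    length: int      # segment width in pixels
    candidate: str   # best character candidate for the segment
    score: float     # recognition score of that candidate


def path_score(path: Sequence[SegmentResult]) -> float:
    """Assumed formula: recognition scores averaged with segment lengths as weights."""
    total_length = sum(seg.length for seg in path)
    return sum(seg.length * seg.score for seg in path) / total_length


def interpret(paths: Sequence[Sequence[SegmentResult]]) -> str:
    """Read the text off the highest-scoring path."""
    best = max(paths, key=path_score)
    return "".join(seg.candidate for seg in best)


# Two competing segmentations of the same 32-pixel region.
path_a = [SegmentResult(30, "m", 0.6), SegmentResult(2, ".", 1.0)]
path_b = [SegmentResult(16, "r", 0.7), SegmentResult(16, "n", 0.7)]
text = interpret([path_a, path_b])
```

With an unweighted mean, `path_a` would score 0.8 and win on the strength of its 2-pixel segment; with length weighting it drops below `path_b`, which is the kind of effect a length-aware score is meant to achieve.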

    PATH SCORE CALCULATING METHOD FOR INTELLIGENT CHARACTER RECOGNITION

    Publication number: US20180005058A1

    Publication date: 2018-01-04

    Application number: US15196368

    Filing date: 2016-06-29

    Inventor: Duanduan Yang

    Abstract: Disclosed herein is a method that improves the performance of handwriting recognition by calculating path scores so as to identify the path with the highest score as the basis for interpreting handwritten characters. Specifically, the method comprises the following steps: detecting connected regions in an input image comprising handwritten characters; determining a plurality of segmentation positions of the input image; obtaining a plurality of recognition results for each segment of each path in the input image, wherein each recognition result represents a character candidate for the segment and each path comprises one or more segments; obtaining a plurality of scores corresponding to the recognition results; calculating scores for each path in the input image based on segment lengths and the scores corresponding to the recognition results; and using the path with the highest score to interpret the handwritten characters in the input image.
