-
Publication number: US09965871B1
Publication date: 2018-05-08
Application number: US15396116
Application date: 2016-12-30
Applicant: KONICA MINOLTA LABORATORY U.S.A., INC.
CPC classification number: G06T7/90, G06T5/40, G06T2207/10024
Abstract: An image encoded with character information can be created by binarizing an input image followed by connected component labeling, and then repeating the binarization and connected component labeling on an inverted version of the input image. This results in identification of connected components. Related connected components are arranged in a family tree in which successive generations of the connected components alternate between two tree layer classifications. One of the tree layer classifications is selected based on whether certain connected components define characters. A label image is created which includes labels for the connected components except for the connected components in the selected tree layer classification.
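The first two steps of this abstract (binarize and label, then repeat on the inverted image) can be illustrated with a minimal sketch. It is not the patented implementation: the fixed threshold of 128, the use of 4-connectivity, and the names `binarize`, `invert`, `label_components`, `label_both_polarities`, and `RING` are all assumptions made for illustration. The family tree and the selection of a tree layer classification are not attempted here, because the abstract does not give enough detail to reconstruct them.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as foreground (1), others as 0.
    The fixed threshold is an assumption, not taken from the patent."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def invert(gray):
    """Invert an 8-bit grayscale image."""
    return [[255 - px for px in row] for row in gray]

def label_components(binary):
    """Label 4-connected foreground regions with a breadth-first flood fill.
    Returns (label image, number of components); label 0 means no component."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1 and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def label_both_polarities(gray):
    """Run binarization and labeling on the image and on its inverse."""
    _, dark_count = label_components(binarize(gray))
    _, light_count = label_components(binarize(invert(gray)))
    return dark_count, light_count

# A dark ring on a light background: one dark component (the ring) and
# two light components (the surrounding background and the enclosed hole).
RING = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255,   0, 255,   0, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]

print(label_both_polarities(RING))  # (1, 2)
```

In this toy image the background encloses the ring and the ring encloses the hole, which is the kind of nesting the abstract's family tree would presumably capture, with successive generations alternating between dark and light components.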
-
Publication number: US10586125B2
Publication date: 2020-03-10
Application number: US15719343
Application date: 2017-09-28
Applicant: KONICA MINOLTA LABORATORY U.S.A., INC.
Inventor: Bo Li
Abstract: Complete removal of an underline which intersects a character may cause problems in a subsequent character recognition or conversion process, when parts of the character which coincided with the underline are also removed. To help reduce the problems, parts of the underline may be removed from an image while parts of the character that coincide with the underline are maintained in the image. Areas where the character coincides with the underline are defined from a reduced version of the underline. When the underline is removed, the areas where the character coincides with the underline are maintained in a second image. The second image may then be subjected to a character recognition or conversion process with potentially fewer problems.
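The general idea of removing an underline while keeping the pixels a character shares with it can be sketched as follows. The abstract does not say how the reduced version of the underline is built or used, so this sketch swaps in a simpler heuristic: a column of the underline is kept only if there is ink both directly above and directly below the underline band. The function name `remove_underline`, the `top` and `bottom` row parameters, and the sample `PAGE` are hypothetical, and the underline's position is assumed to be already known.

```python
def remove_underline(image, top, bottom):
    """Return a second image with the underline band cleared, except in
    columns where a character appears to cross the band.

    `image` is a binary image (1 = ink), and rows `top`..`bottom` inclusive
    hold the underline. Assumed heuristic, not the patent's method: a column
    is a crossing if there is ink immediately above AND immediately below
    the band.
    """
    h, w = len(image), len(image[0])
    result = [row[:] for row in image]  # leave the input image untouched
    for x in range(w):
        ink_above = top > 0 and image[top - 1][x] == 1
        ink_below = bottom < h - 1 and image[bottom + 1][x] == 1
        if not (ink_above and ink_below):
            for y in range(top, bottom + 1):
                result[y][x] = 0
    return result

# A vertical stroke (column 2) crossing a one-pixel-high underline (row 2).
PAGE = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]

for row in remove_underline(PAGE, top=2, bottom=2):
    print(row)
```

With this input the underline is cleared everywhere except column 2, so the vertical stroke stays unbroken. A character that only touches the underline from above would lose its shared pixels under this heuristic, which is one reason a real implementation would likely need a more careful definition of the coinciding areas.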
-