METHOD AND APPARATUS FOR SEPARATING TEXT AND FIGURES IN DOCUMENT IMAGES

    Publication No.: US20190005324A1

    Publication Date: 2019-01-03

    Application No.: US16022016

    Filing Date: 2018-06-28

    Abstract: A method and apparatus for separating text and figures in a document image are provided. The method includes: acquiring a document image; dividing the document image into a plurality of regions of interest; acquiring a feature vector from a two-dimensional (2D) histogram obtained by resizing the regions of interest and extracting connected components from them; acquiring a transformation vector of the feature vector by using a kernel; obtaining a cluster center of the transformation vector; performing clustering on the cluster center to acquire a supercluster; and classifying the supercluster into one of a text class and a figure class, based on the number of superclusters.
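The feature-extraction step named in the abstract (connected components summarized by a 2D histogram) can be illustrated with a minimal sketch. The abstract does not say which quantities the histogram is built over, so this example assumes, purely for illustration, a histogram over the bounding-box width and height of each connected component. The function names `connected_components` and `histogram_feature`, the 4-connectivity, and the `bins` and `max_size` parameters are all hypothetical and not taken from the patent. The resizing, kernel transformation, and clustering steps are omitted.

```python
from collections import deque


def connected_components(grid):
    """Return the bounding-box (width, height) of each 4-connected
    foreground component in a binary grid (assumed representation)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append((c1 - c0 + 1, r1 - r0 + 1))
    return sizes


def histogram_feature(sizes, bins=4, max_size=8):
    """Build a normalized 2D histogram over (width, height) and flatten it
    row by row into a feature vector (hypothetical choice of axes)."""
    hist = [[0.0] * bins for _ in range(bins)]
    for w, h in sizes:
        i = min((w - 1) * bins // max_size, bins - 1)  # width bin
        j = min((h - 1) * bins // max_size, bins - 1)  # height bin
        hist[j][i] += 1.0
    total = len(sizes) or 1  # avoid division by zero for an empty region
    return [v / total for row in hist for v in row]


# A tiny binary region of interest with three components.
roi = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1],
]
sizes = connected_components(roi)
feature = histogram_feature(sizes)
```

In this sketch each region of interest yields a fixed-length vector (here 16 values) regardless of how many components it contains, which is the property a later kernel transformation and clustering stage would rely on.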
