Abstract:
A binarization method obtains an optimum threshold value for converting multi-level image data describing an input image into black-and-white bi-level image data. The method includes the steps of counting a first number of black picture elements while varying a threshold value from the darkest tone level to the lightest tone level, counting a second number of picture elements having a tone level other than the lightest tone level, obtaining the percentage of the first number with respect to the second number for each threshold value so as to normalize the density of the input image, and determining the optimum threshold value for binarization based on the percentages obtained for the varied threshold values.
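The counting and normalization steps can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes tone level 0 is the darkest and `lightest` is the lightest, that a picture element is black when its level is at or below the threshold, and that the optimum is the first threshold whose percentage reaches a target. The abstract does not state the selection rule, so `pick_threshold` and `target_percent` are hypothetical.

```python
from typing import List, Optional


def black_percentages(pixels: List[int], lightest: int = 255) -> List[float]:
    """For each threshold t in 0..lightest-1, return
    100 * (pixels with level <= t) / (pixels with level != lightest)."""
    histogram = [0] * (lightest + 1)
    for level in pixels:
        histogram[level] += 1

    # Second number: picture elements that are not at the lightest tone level.
    non_lightest = len(pixels) - histogram[lightest]
    if non_lightest == 0:
        return [0.0] * lightest  # blank image: nothing to normalize against

    percentages = []
    black = 0  # first number: running count of black picture elements
    for t in range(lightest):
        black += histogram[t]
        percentages.append(100.0 * black / non_lightest)
    return percentages


def pick_threshold(percentages: List[float],
                   target_percent: float = 50.0) -> Optional[int]:
    """Hypothetical selection rule: the first threshold whose black
    percentage reaches target_percent, or None if none does."""
    for t, p in enumerate(percentages):
        if p >= target_percent:
            return t
    return None
```

Dividing by the count of non-lightest picture elements rather than by the whole image keeps the percentage comparable between sparse and dense pages, which is the normalization the abstract refers to.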
Abstract:
A method for recognizing a table area in a document includes the steps of extracting image data of a table area containing a table from binary image data of a document, extracting line segments extending in a first direction from the image data of the table area, and extracting line segments extending in a second direction perpendicular to the first direction from the image data of the table area. The method also includes the steps of determining, from the line segments extending in the first direction and the line segments extending in the second direction, whether ruled lines are provided on both sides of the table, and generating imaginary ruled lines on both sides of the table when it is determined that ruled lines are provided on neither side. An apparatus for recognizing a table area in a document is also provided.
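The side-line check can be illustrated with a short sketch. It assumes the line segments have already been extracted and are given as tuples, that the first direction is horizontal, and that the table's extent is taken from the horizontal segments. The function name, tuple layout, and `tolerance` parameter are hypothetical and do not come from the abstract.

```python
from typing import List, Tuple

HSeg = Tuple[int, int, int]  # (y, x_start, x_end): horizontal segment
VSeg = Tuple[int, int, int]  # (x, y_start, y_end): vertical segment


def add_imaginary_side_lines(horizontal: List[HSeg],
                             vertical: List[VSeg],
                             tolerance: int = 2) -> List[VSeg]:
    """Return the vertical segments, plus imaginary left and right ruled
    lines when neither side of the table has a ruled line."""
    if not horizontal:
        return list(vertical)

    # Table extent, estimated from the horizontal segments (an assumption).
    left = min(x0 for _, x0, _ in horizontal)
    right = max(x1 for _, _, x1 in horizontal)
    top = min(y for y, _, _ in horizontal)
    bottom = max(y for y, _, _ in horizontal)

    has_left = any(abs(x - left) <= tolerance for x, _, _ in vertical)
    has_right = any(abs(x - right) <= tolerance for x, _, _ in vertical)

    result = list(vertical)
    if not has_left and not has_right:
        # Neither side is ruled: close the table with imaginary lines.
        result.append((left, top, bottom))
        result.append((right, top, bottom))
    return result
```

The abstract only covers the case where neither side is ruled, so this sketch leaves the segments unchanged when exactly one side has a ruled line.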
Abstract:
An image memory stores image data of a document file. A bit map developing part converts the document file into bit map image data. A difference information extraction part compares the bit map image data with the image data so as to extract difference information representing the difference between the bit map image data and the image data. The difference information is saved as a difference information file.
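One way to picture the comparison step is a pixel-by-pixel diff of two bitmaps. This sketch assumes both images are bi-level, already aligned, and the same size, and it records each differing position together with the stored image's value. The abstract does not specify the representation, so the function names and the `(row, column, value)` format are hypothetical.

```python
from typing import List, Tuple

Bitmap = List[List[int]]          # rows of 0/1 picture elements
Diff = List[Tuple[int, int, int]]  # (row, column, value in stored image)


def extract_difference(rendered: Bitmap, stored: Bitmap) -> Diff:
    """List every position where the stored image differs from the
    bitmap developed from the document file."""
    if len(rendered) != len(stored) or any(
            len(a) != len(b) for a, b in zip(rendered, stored)):
        raise ValueError("bitmaps must have identical dimensions")
    return [(r, c, stored[r][c])
            for r in range(len(rendered))
            for c in range(len(rendered[r]))
            if rendered[r][c] != stored[r][c]]


def apply_difference(rendered: Bitmap, diff: Diff) -> Bitmap:
    """Sanity check for the sketch: rebuild the stored image from the
    developed bitmap plus the difference information."""
    restored = [row[:] for row in rendered]
    for r, c, value in diff:
        restored[r][c] = value
    return restored
```

The round trip through `apply_difference` is included only to check that the difference information is complete; the abstract itself describes extraction and saving, not reconstruction.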
Abstract:
A method and apparatus for detecting the skew angle of a document image are provided. Skew angle determination is performed by the steps of determining a set of sampling points from an input document image and processing the X and Y coordinates of the sampling points in order to calculate a regression coefficient of the sampling points. The skew angle of the document is determined using the regression coefficient. To evaluate the calculated skew angle, which corresponds to the regression coefficient, a correlation coefficient is calculated and evaluated. When coordinates of sampling points are obtained for a plurality of data sets corresponding to different ruled lines or lines of characters, a histogram may be used to determine the most probable skew angle.
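The regression step can be sketched with ordinary least squares. This is a minimal illustration under stated assumptions: the regression coefficient is taken to be the slope of Y on X, the skew angle is its arctangent, and a fit is accepted when the absolute correlation coefficient reaches `min_abs_r`. The acceptance threshold, the histogram `bin_width`, and the function names are hypothetical, since the abstract gives no values.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def skew_from_points(points: List[Point]) -> Tuple[float, float]:
    """Return (skew angle in degrees, correlation coefficient) for one
    set of sampling points, using least-squares regression of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("sampling points need at least two distinct x values")
    slope = sxy / sxx  # regression coefficient
    # Pearson's r is undefined when syy == 0; this sketch reports 1.0
    # because the points then lie exactly on a horizontal line.
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 1.0
    return math.degrees(math.atan(slope)), r


def most_probable_skew(point_sets: List[List[Point]],
                       min_abs_r: float = 0.9,
                       bin_width: float = 0.5) -> Optional[float]:
    """Histogram the accepted skew angles and return the center of the
    most populated bin, or None if no fit is accepted."""
    bins: Counter = Counter()
    for points in point_sets:
        angle, r = skew_from_points(points)
        if abs(r) >= min_abs_r:
            bins[round(angle / bin_width)] += 1
    if not bins:
        return None
    return bins.most_common(1)[0][0] * bin_width
```

Each point set would correspond to one ruled line or one line of characters; the histogram vote keeps a single poorly sampled line from deciding the result.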