Abstract:
A table and flowchart detection method is disclosed. First, based on connected component analysis and the sizes of the connected components, a target connected component that corresponds to possible elements of a table or flowchart is detected in the input image. The target connected component is broken into corners and edges that connect the corners. Based on the relationship between the corners and edges, it is determined whether the target connected component is a table or a flowchart. For table detection, the edges and corners are linked into horizontal sets and vertical sets, and based on the corner counts in the horizontal sets and vertical sets, it is determined whether the target connected component is a table. For flowchart detection, the boundary boxes and the connecting lines between boundary boxes are detected to determine whether the target connected component is a flowchart.
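The first step, selecting candidate components by size, can be sketched as follows. This is a minimal illustration only: the binary-grid input, the 8-connectivity, the bounding-box size test, and the names `connected_components`, `bounding_box`, `target_components`, `min_height` and `min_width` are assumptions made for the example, not details taken from the disclosure.

```python
from collections import deque


def connected_components(grid):
    """Label foreground pixels (truthy cells) using 8-connectivity.

    Returns a list of components, each a list of (row, col) pixels.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a pixel list, inclusive."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def target_components(grid, min_height, min_width):
    """Keep bounding boxes large enough to hold a table or flowchart.

    The size thresholds are hypothetical tuning parameters.
    """
    targets = []
    for pixels in connected_components(grid):
        top, left, bottom, right = bounding_box(pixels)
        if bottom - top + 1 >= min_height and right - left + 1 >= min_width:
            targets.append((top, left, bottom, right))
    return targets
```

Small components such as isolated characters or noise are dropped by the size test, leaving only the larger line structures for the corner and edge analysis.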
Abstract:
In a document image segmentation method, pixels of the image are classified into different types such as background, text, table, etc., to generate an initial segmentation map. The initial segmentation map is processed in multiple rounds. In each round, a working map is divided into 2×2 pixel blocks; based on the pixel types in each block, a corresponding pixel in a combined map is assigned a type, and the pixels in a corresponding block in the segmentation map are either modified to change some background pixels to other types or kept unchanged. The initial segmentation map is used as the working map in the first round, and in each subsequent round the combined map of the previous round is used as the working map. After a number of rounds, the remaining background pixels of the segmentation map are changed to other types based on the types of their neighboring areas.
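The round structure can be sketched as below. The abstract does not state the rule that maps a 2×2 block to a type, so this example assumes one: a block whose non-background pixels all share a single type takes that type and fills the background pixels in its area, while an all-background or mixed block is left unchanged. The label encoding (`BG = 0`), the names `combine_round` and `fill_background`, and the requirement that the map dimensions be divisible by 2 raised to the number of rounds are likewise assumptions. The final neighbor-based fill of the remaining background pixels is not shown.

```python
BG = 0  # assumed label for background; other integers are content types


def combine_round(working, seg, scale):
    """Run one round and return the half-size combined map.

    `scale` is the side length, in segmentation-map pixels, of the area
    that one working-map pixel stands for. `seg` is modified in place.
    """
    h, w = len(working) // 2, len(working[0]) // 2
    combined = [[BG] * w for _ in range(h)]
    size = 2 * scale  # side of the seg-map area covered by one 2x2 block
    for by in range(h):
        for bx in range(w):
            block = [working[2 * by + dy][2 * bx + dx]
                     for dy in (0, 1) for dx in (0, 1)]
            types = {t for t in block if t != BG}
            if len(types) != 1:
                continue  # all background or mixed: unchanged (assumed rule)
            t = types.pop()
            combined[by][bx] = t
            # Change only the background pixels in the corresponding area.
            for y in range(by * size, (by + 1) * size):
                for x in range(bx * size, (bx + 1) * size):
                    if seg[y][x] == BG:
                        seg[y][x] = t
    return combined


def fill_background(seg, rounds):
    """Apply `rounds` rounds; the first working map is the initial seg map."""
    working = [row[:] for row in seg]
    scale = 1
    for _ in range(rounds):
        working = combine_round(working, seg, scale)
        scale *= 2
    return seg
```

Because each combined map is half the size of its working map, later rounds reason about progressively larger areas of the original segmentation map.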
Abstract:
A method for image processing, including: obtaining an image including a writing board and a background external to the writing board; detecting a plurality of lines within the image; determining, based on the plurality of lines, a plurality of corners of the writing board within the image; and correcting a perspective of the writing board by applying a transformation to the image based on the plurality of corners.
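One common way to realize the final step is a projective transformation (homography) computed from the four detected corners. The abstract does not name the transformation, so treat the following as a sketch under that assumption; the function names and the corner ordering are illustrative only.

```python
def solve_linear(a, b):
    """Solve a square system a.x = b by Gauss-Jordan elimination
    with partial pivoting. Raises ZeroDivisionError if singular."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 9 coefficients (row-major, last one fixed to 1) of the
    projective map taking four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def warp_point(h, x, y):
    """Apply homography h to the point (x, y)."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical corners of a writing board as seen in the image, ordered
# top-left, top-right, bottom-right, bottom-left, and the upright
# rectangle they should map to.
corners = [(10, 20), (200, 30), (220, 180), (5, 160)]
rectangle = [(0, 0), (300, 0), (300, 200), (0, 200)]
h = homography(corners, rectangle)
```

To correct a whole image, the usual approach is to compute the inverse mapping (rectangle to corners) and sample the source image at the warped position of each output pixel.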
Abstract:
A 2D color barcode decoding method is disclosed. The barcode includes a 2D array of data cells, corner locators, and border reference cells. Each data cell and reference cell has one of four primary colors (e.g., CMYK or CMWK). The reference cells, which have known colors, are used to calculate the channel offset (a spatial offset) of each primary color and the reference color values of each primary color. The reference cells are also used to calculate a color conversion matrix between color intensity (RGB) values and the primary colors. Pixel-color probabilities are calculated from the pixel color intensity values using the color conversion matrix. The color of each data cell is determined using the pixel-color probabilities, the pixel color intensity values, the reference color intensity values, and the channel offset.
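The role of the reference cells can be illustrated with a deliberately simplified stand-in. The sketch below does not compute the color conversion matrix described above; instead it averages the measured RGB values of the reference cells for each primary color and scores a data-cell measurement by normalized inverse squared distance to each average. That scoring rule, the sample values, and all function names are assumptions made for the example, and the channel offset is ignored.

```python
def mean_color(samples):
    """Average a list of (R, G, B) measurements."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def reference_colors(reference_cells):
    """Map each primary color name to the mean RGB of its reference cells."""
    return {name: mean_color(samples)
            for name, samples in reference_cells.items()}


def color_probabilities(rgb, refs):
    """Score each primary color by inverse squared RGB distance,
    normalized so the scores sum to 1 (a stand-in, not the disclosed
    matrix-based computation)."""
    weights = {}
    for name, ref in refs.items():
        d2 = sum((a - b) ** 2 for a, b in zip(rgb, ref))
        weights[name] = 1.0 / (d2 + 1e-9)  # small term avoids division by zero
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def classify(rgb, refs):
    """Return the most probable primary color for one measurement."""
    probs = color_probabilities(rgb, refs)
    return max(probs, key=probs.get)


# Hypothetical measurements of reference cells with known colors.
refs = reference_colors({
    "C": [(0, 250, 250), (10, 255, 245)],
    "M": [(250, 0, 250)],
    "Y": [(250, 250, 0)],
    "K": [(10, 10, 10)],
})
```

Using reference values measured from the same captured image, rather than nominal colors, lets the decision adapt to the printing and lighting conditions of that particular barcode.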
Abstract:
A 2D color barcode layout is disclosed. The barcode includes a 2D array of data cells, four corner locators, and border reference cells forming four borders between the corner locators that substantially surround the array of data cells. Each data cell and border reference cell has one of four primary colors (e.g., CMYK). Most border reference cells have the same size as the data cells, except for the yellow ones, which are longer. The border reference cells form a repeating color sequence along the borders, and are used during decoding to calculate (1) the channel offset (a spatial offset) of each primary color at different locations along the borders and (2) the reference (average) color values of each primary color. During decoding, the color value of each data cell is measured while taking into account the channel offset, which is calculated by interpolating the channel offsets of the border reference cells.
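The interpolation step can be sketched in one dimension. The abstract does not say which interpolation is used, so this example assumes piecewise-linear interpolation along one border, with clamping outside the measured range; the name `interpolate_offset` and the sample values are hypothetical.

```python
from bisect import bisect_right


def interpolate_offset(pos, samples):
    """Linearly interpolate a (dx, dy) channel offset at position `pos`.

    `samples` is a list of (position, (dx, dy)) pairs sorted by position,
    e.g. offsets measured at reference cells along one border.
    Positions outside the sampled range are clamped to the nearest sample.
    """
    positions = [p for p, _ in samples]
    if pos <= positions[0]:
        return samples[0][1]
    if pos >= positions[-1]:
        return samples[-1][1]
    i = bisect_right(positions, pos)
    (p0, o0), (p1, o1) = samples[i - 1], samples[i]
    t = (pos - p0) / (p1 - p0)
    return tuple(a + t * (b - a) for a, b in zip(o0, o1))


# Hypothetical offsets of one color channel measured along a border.
border = [(0, (0.0, 0.0)), (10, (2.0, -1.0)), (20, (2.0, 1.0))]
```

For a data cell inside the array, the same function could be applied a second time between the values interpolated on two opposite borders, which again is an assumption rather than something the abstract specifies.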
Abstract:
Methods for removing vertical and horizontal lines from a document image are disclosed. The horizontal line removal method includes: for a column of black pixels at each horizontal position along the line, removing the pixels if their maximum stroke width is less than the median value of the maximum stroke widths in a small window centered at that horizontal position; removing connected components remaining in the horizontal line bounding box that do not extend significantly above or below the bounding box boundaries; and performing a closing operation to join broken pieces of character strokes caused by underline removal. This method preserves character strokes while removing underlines. The vertical line removal method includes: for vertical lines that have a large height-to-width ratio, removing the parts of such lines that are not at intersections with horizontal or near-horizontal lines; and removing all remaining connected components that touch neither the left nor the right boundary of the bounding box.
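The closing operation mentioned for the horizontal case is the standard morphological closing: a dilation followed by an erosion. A minimal sketch on a binary image follows; the 3×3 structuring element and the function names are assumptions, since the abstract does not specify them.

```python
def _neighbourhood(img, r, c):
    """Values in the 3x3 window around (r, c), clipped to the image."""
    rows, cols = len(img), len(img[0])
    return [img[y][x]
            for y in range(max(r - 1, 0), min(r + 2, rows))
            for x in range(max(c - 1, 0), min(c + 2, cols))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 window is set."""
    return [[int(any(_neighbourhood(img, r, c)))
             for c in range(len(img[0]))] for r in range(len(img))]


def erode(img):
    """Keep a pixel only if every pixel in its 3x3 window is set."""
    return [[int(all(_neighbourhood(img, r, c)))
             for c in range(len(img[0]))] for r in range(len(img))]


def close(img):
    """Morphological closing (dilate, then erode) with a 3x3 element.

    The image is padded with one ring of background first so that the
    result near the image border matches an unbounded background.
    """
    cols = len(img[0])
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in img]
              + [[0] * (cols + 2)])
    closed = erode(dilate(padded))
    return [row[1:-1] for row in closed[1:-1]]


# A vertical character stroke broken by a one-pixel gap, as might remain
# after an underline crossing it has been removed.
stroke = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
repaired = close(stroke)
```

In `repaired`, the gap in the middle row is filled while the stroke keeps its original one-pixel width and length.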