Abstract:
A method of indexing images contained in scanned documents, wherein said scanned documents are stored in a repository, includes: for each document to be stored in the repository, dividing the document into a plurality of sections; scanning the plurality of sections; segmenting each scanned section according to a predetermined coding model into image segments and non-image segments; associating each of the image segments with the document; and generating an index correlating the image segments with the document. The method may further include, at the time of image recall, displaying the index of image segments in a user interface; and responsive to selection of an image segment from the index, displaying the document information associated with the image segment in the user interface.
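The index-generation step can be illustrated with a minimal sketch. It assumes segmentation has already labeled each segment; the `Segment` record, the `"image"` / `"non-image"` labels, and `build_image_index` are hypothetical names introduced here for illustration, not part of the disclosed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical record for one segment produced by the coding model."""
    segment_id: str
    kind: str  # "image" or "non-image" (assumed labels)


def build_image_index(documents):
    """Map each image segment ID to the ID of the document containing it.

    `documents` is assumed to be a dict of document ID -> list of Segment.
    Non-image segments are skipped, so the index lists image segments only.
    """
    index = {}
    for doc_id, segments in documents.items():
        for seg in segments:
            if seg.kind == "image":
                index[seg.segment_id] = doc_id
    return index


documents = {
    "doc-1": [Segment("doc-1/s0", "image"), Segment("doc-1/s1", "non-image")],
    "doc-2": [Segment("doc-2/s0", "image")],
}
image_index = build_image_index(documents)
```

At recall time, a user interface could list the keys of `image_index` and look up the associated document when one is selected.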
Abstract:
What is disclosed is a system and method for performing a background deletion that exploits both local and global context to remove background and other white space between objects, with the aim of retaining structural relationships between objects in the document. A document image is received and seams are carved through the image. Seams composed of uniform background pixels are identified. Adjacent seams containing background pixels are collected into groups of seams. The background seam groups are classified according to their widths. A target number of seams to be removed from each background seam group is then determined based on the classification. Wider seam groups are assigned a target number at least as large as that assigned to narrower seam groups. The document image is then resized by deleting seams from the seam groups based on the assigned target number.
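The grouping and target-assignment steps can be sketched as follows. This is a simplified illustration, not the disclosed implementation: it treats each seam as a straight column flagged as background or not, and the three width classes and their thresholds (`narrow_max`, `medium_max`) are assumptions chosen only so that wider groups never receive a smaller target.

```python
def group_background_seams(is_background):
    """Collect runs of adjacent background seams as (start, width) pairs."""
    groups = []
    start = None
    for i, flag in enumerate(is_background):
        if flag and start is None:
            start = i                      # a new run of background seams begins
        elif not flag and start is not None:
            groups.append((start, i - start))
            start = None
    if start is not None:                  # close a run that reaches the last seam
        groups.append((start, len(is_background) - start))
    return groups


def target_removals(width, narrow_max=2, medium_max=6):
    """Assumed three-class rule; the target never decreases as width grows."""
    if width <= narrow_max:
        return 0                           # narrow: keep all seams
    if width <= medium_max:
        return width // 2                  # medium: remove about half
    return width - 3                       # wide: leave a 3-seam gap


flags = [True, True, False, True, True, True, True, False, False] + [True] * 8
groups = group_background_seams(flags)
targets = [target_removals(width) for _, width in groups]
```

Resizing would then delete `targets[k]` seams from the k-th group.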
Abstract:
Systems and methods are described that facilitate determining an original document format for a scanned document by analyzing a bitmap thereof. Text objects are extracted from the document, binarized, and segmented to identify text. Page orientation and text size are used to distinguish between a slideshow-type document and a word processing or spreadsheet-type document. To further distinguish between the word processing and spreadsheet types, text column structure and count are analyzed.
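The two-stage decision described above can be sketched as a small rule-based classifier. The function name, the feature names, and both thresholds (`large_text_pt`, `spreadsheet_min_columns`) are assumptions made for illustration; the abstract does not state the actual criteria or values.

```python
def guess_original_format(orientation, median_text_height_pt, column_count,
                          large_text_pt=18.0, spreadsheet_min_columns=4):
    """Guess the source application type from page-level features.

    Stage 1 uses orientation and text size to separate slideshows.
    Stage 2 uses the number of text columns to separate spreadsheets
    from word processing documents. All thresholds are assumed values.
    """
    if orientation == "landscape" and median_text_height_pt >= large_text_pt:
        return "slideshow"
    if column_count >= spreadsheet_min_columns:
        return "spreadsheet"
    return "word processing"


slide = guess_original_format("landscape", 24.0, 1)
sheet = guess_original_format("portrait", 11.0, 6)
letter = guess_original_format("portrait", 11.0, 2)
```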
Abstract:
What is disclosed is a resizing method that utilizes segmentation information to classify objects found within a document and then selects the most appropriate resizing technique for each identified object. The present method employs readily available document parsers to reliably extract the objects (e.g., text, background, images, graphics) which compose the document. Information obtained from a document parser is utilized to identify the document components for classification. The extracted objects are then classified according to their object type. Each of the classified objects is then resized using a resizing technique pre-selected for its object type, based on that technique's ability to resize that type of document content better than other resizing techniques. The present method advantageously extends smart or content-based scaling and is especially useful for N-up or variable-information printing. The present method finds its intended uses in enhancing the N-up and handout options currently provided in a variety of print drivers.
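The per-type selection can be sketched as a lookup table from object type to resizing technique. The pairings below (for example, seam carving for images) and the names `TECHNIQUE_BY_TYPE` and `plan_resizing` are illustrative assumptions; the abstract does not say which technique is pre-selected for which type.

```python
# Assumed pairing of object type -> technique label, for illustration only.
TECHNIQUE_BY_TYPE = {
    "text": "font_rescale",
    "image": "seam_carving",
    "graphics": "vector_scale",
    "background": "crop",
}


def plan_resizing(objects, default="uniform_scale"):
    """Return (object name, technique) pairs for parser-extracted objects.

    `objects` is assumed to be a list of (name, object_type) tuples.
    Unknown types fall back to `default`.
    """
    return [(name, TECHNIQUE_BY_TYPE.get(obj_type, default))
            for name, obj_type in objects]


plan = plan_resizing([
    ("title", "text"),
    ("photo", "image"),
    ("margin", "background"),
    ("stamp", "annotation"),   # type missing from the table
])
```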
Abstract:
In an input scanning system, as would be present in a digital copier, a “template” of similar visual elements or objects, such as logos and other designs, is detected among a series of scanned images. The common objects form a reference image against which subsequently recorded input images are compared. If bounding boxes around objects in the input images match those in the reference image, an attempt is made to match the objects in the bounding boxes to those in the reference image. If objects in the input image and reference image match, then the image data from the input image is coded using a pointer to the corresponding object in the reference image.
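The two-step comparison (bounding boxes first, then object content) can be sketched as follows. This is a simplified illustration under stated assumptions: bounding boxes are `(x, y, width, height)` tuples compared within a pixel tolerance `tol`, object content is compared by exact byte equality instead of a real image-matching measure, and `boxes_match` and `encode_page` are hypothetical names.

```python
def boxes_match(a, b, tol=2):
    """True if two (x, y, w, h) boxes agree within `tol` pixels per field."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


def encode_page(page_objects, reference_objects, tol=2):
    """Code each page object as a pointer into the reference, or as raw data.

    Both arguments are assumed to be lists of (bbox, pixels) pairs.
    Content is compared only when the bounding boxes already match.
    """
    coded = []
    for bbox, pixels in page_objects:
        for ref_id, (ref_bbox, ref_pixels) in enumerate(reference_objects):
            if boxes_match(bbox, ref_bbox, tol) and pixels == ref_pixels:
                coded.append(("ref", ref_id))      # pointer to reference object
                break
        else:
            coded.append(("raw", bbox, pixels))    # no match: keep image data
    return coded


reference = [((10, 10, 50, 30), b"LOGO")]
page = [((11, 10, 50, 31), b"LOGO"), ((100, 100, 40, 40), b"TEXT")]
coded = encode_page(page, reference)
```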