Abstract:
A method of controlling a scanner to improve automatic recognition and classification of scanned physical documents for a document analysis system, which receives and processes jobs containing at least one electronic document from a plurality of users to automatically recognize and classify the job documents into document categories, is disclosed. The method comprises, using a scan control system: obtaining the capability of, and existing scanner settings for, the scanner upon receiving a command to initiate scanning of physical documents; saving the existing scanner settings of the scanner; automatically commanding the scanner to use new scanner settings, wherein the new scanner settings are selected in accordance with the capability of the recognition system; commanding the scanner to begin the scanning operation with the new scanner settings; and automatically resetting the scanner settings of the scanner back to the saved existing scanner settings upon completion of the scanning operation.
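The save, override, and restore sequence described above might be sketched as follows. This is only an illustration under stated assumptions: `FakeScanner`, `choose_settings`, and `temporary_settings` are hypothetical names, not a real scanner driver API, and `preferred` stands in for the settings the recognition system is assumed to handle best.

```python
from contextlib import contextmanager


class FakeScanner:
    """Hypothetical stand-in for a scanner driver; not a real device API."""

    def __init__(self, settings, capabilities):
        self.settings = dict(settings)
        self.capabilities = dict(capabilities)

    def scan(self):
        # Report the settings that were active during the scan.
        return dict(self.settings)


def choose_settings(capabilities, preferred):
    """Keep only the preferred values that the scanner reports it supports."""
    return {
        key: value
        for key, value in preferred.items()
        if value in capabilities.get(key, ())
    }


@contextmanager
def temporary_settings(scanner, preferred):
    saved = dict(scanner.settings)  # save the existing settings
    scanner.settings.update(choose_settings(scanner.capabilities, preferred))
    try:
        yield scanner
    finally:
        scanner.settings = saved  # reset once the scanning operation ends


scanner = FakeScanner(
    settings={"dpi": 150, "mode": "bw"},
    capabilities={"dpi": (150, 300, 600), "mode": ("bw", "gray")},
)
# "color" is not a reported capability, so only the dpi is overridden.
with temporary_settings(scanner, {"dpi": 300, "mode": "color"}) as active:
    used = active.scan()
```

Using a context manager means the saved settings are restored even if the scan raises an error, which seems to match the intent of resetting the scanner after every scanning operation.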
Abstract:
Methods and systems for providing fine-grained call admission control in a communication network are disclosed. In some embodiments, the fine-grained control maximizes the profitability of calls serviced by the network on a per-call-class basis.
Abstract:
In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically pre-processing each received electronic document using a plurality of image transformation algorithms to improve subsequent data extraction from said document is provided. The method includes: electronically partitioning each received electronic document page into pieces; automatically processing each piece of the received electronic document page using each of a plurality of image pre-processing algorithms to produce a plurality of image variations of each piece; and analyzing the outputs of subsequent processing and data extraction on each of the image variations of the pieces to determine, from the plurality of outputs for each piece, which output is best.
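The partition, vary, and select steps above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: pages are plain lists of grayscale pixel rows, the three toy algorithms are placeholders, and `score` is a hypothetical stand-in for the confidence that real downstream data extraction would report.

```python
def partition(page, rows_per_piece):
    """Split a page (a list of pixel rows) into horizontal strips."""
    return [page[i:i + rows_per_piece] for i in range(0, len(page), rows_per_piece)]


def identity(piece):
    return piece


def invert(piece):
    return [[255 - p for p in row] for row in piece]


def threshold(piece):
    return [[255 if p >= 128 else 0 for p in row] for row in piece]


def score(piece):
    """Hypothetical quality measure: fraction of pure black or white pixels."""
    pixels = [p for row in piece for p in row]
    return sum(p in (0, 255) for p in pixels) / len(pixels)


def best_variations(page, rows_per_piece, algorithms):
    """Return, for each piece, the name of the best-scoring algorithm."""
    choices = []
    for piece in partition(page, rows_per_piece):
        # Produce one image variation of this piece per algorithm.
        variations = {name: fn(piece) for name, fn in algorithms.items()}
        # Ties go to the algorithm listed first.
        choices.append(max(variations, key=lambda name: score(variations[name])))
    return choices


page = [[10, 200], [130, 90], [0, 255], [255, 0]]
algorithms = {"identity": identity, "invert": invert, "threshold": threshold}
choices = best_variations(page, 2, algorithms)
```

Selecting per piece rather than per page lets different regions of the same page end up with different pre-processing, which appears to be the point of partitioning first.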
Abstract:
A method of training a document analysis system to extract data from documents is provided. The method includes: automatically analyzing images and text features extracted from a document to associate the document with a corresponding document category; comparing the extracted text features with a set of text features associated with the corresponding category of the document, in which the set of text features includes a set of characters, words, and phrases; if the extracted text features are found to consist of the characters, words, and phrases belonging to the set of text features associated with the corresponding document category, storing the extracted text features as the data contained in the corresponding document; and, if the extracted text features are found to include at least one text feature that does not belong to the set of text features associated with the corresponding document category, submitting the unrecognized text features to a training phase.
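The store-or-train decision above might look like the following minimal sketch. The function name `route_features`, the dictionary return shape, and the sample vocabulary are all hypothetical; classification into a category is assumed to have happened already.

```python
def route_features(extracted, known_for_category):
    """Decide whether extracted text features are stored or sent to training.

    `known_for_category` is the set of characters, words, and phrases
    associated with the document's category (assumed to be given).
    """
    unknown = [feature for feature in extracted if feature not in known_for_category]
    if unknown:
        # At least one feature is unrecognized: submit those to training.
        return {"action": "train", "features": unknown}
    # Every feature belongs to the category's set: store as document data.
    return {"action": "store", "features": list(extracted)}


invoice_vocabulary = {"invoice", "total", "due"}
stored = route_features(["invoice", "total"], invoice_vocabulary)
to_train = route_features(["invoice", "remit"], invoice_vocabulary)
```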
Abstract:
A method of grouping electronic document pages of a job that belong together is provided. The method includes: automatically analyzing images and text features extracted from each received electronic document page to associate the electronic document page with a corresponding document category; automatically identifying features extracted from the electronic document page that potentially indicate to which document group the electronic document page belongs; comparing the identified features with a set of group identifying features associated with the corresponding document group, in which the set of group identifying features includes at least a set of page numbers and account numbers; and, if the identified features are found to include a page number and an account number belonging to the set of group identifying features associated with the corresponding document group, grouping the electronic document page into the corresponding document group.
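A minimal sketch of the grouping step follows, assuming each page has already been reduced to a dictionary with optional `account` and `page_number` keys. Those keys, the function name `group_pages`, and the sample data are hypothetical.

```python
from collections import defaultdict


def group_pages(pages, known_accounts):
    """Group pages by account number when both identifying features are present."""
    groups = defaultdict(list)
    ungrouped = []
    for page in pages:
        account = page.get("account")
        number = page.get("page_number")
        if account in known_accounts and number is not None:
            groups[account].append(page)
        else:
            # Missing or unknown identifiers: leave the page ungrouped.
            ungrouped.append(page)
    for members in groups.values():
        members.sort(key=lambda p: p["page_number"])  # restore page order
    return dict(groups), ungrouped


pages = [
    {"account": "A-17", "page_number": 2},
    {"account": "B-03", "page_number": 1},
    {"account": "A-17", "page_number": 1},
    {"account": None, "page_number": 1},
]
groups, ungrouped = group_pages(pages, known_accounts={"A-17", "B-03"})
```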
Abstract:
A double hull marine vessel is provided which includes a syntactic foam-macrosphere composition between the inner and outer hulls which dissipates force applied to the outer hull.
Abstract:
A method of enhancing electronic documents received from a plurality of users by a document analysis system, for improving automatic recognition and classification of the received electronic documents, is provided. For each page of a received electronic document, the method filters the page to infer binarized-background artifacts that result from the binarization of the original grayscale or color image source document and that reside in the vicinity of binarized text and binarized image features in the page, so that the binarized text and binarized images may be distinguished from the binarized-background artifacts and extracted from the document. The method then uses the extracted features from the filtered document to automatically recognize and classify the document into a document category.
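The abstract does not say how the filter works. As one simple illustration of separating artifacts from text in a binarized page, the sketch below removes isolated foreground pixels (a basic despeckle). This is a swapped-in technique chosen for brevity, not the method claimed above, and the name `despeckle` and the 0/1 pixel encoding are assumptions.

```python
def despeckle(image):
    """Clear foreground pixels (1) that have no 8-connected foreground neighbor."""
    height = len(image)
    width = len(image[0])

    def has_neighbor(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue  # skip the pixel itself
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width and image[rr][cc]:
                    return True
        return False

    return [
        [1 if image[r][c] and has_neighbor(r, c) else 0 for c in range(width)]
        for r in range(height)
    ]


binarized = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],  # lone pixel, treated here as a background artifact
]
cleaned = despeckle(binarized)
```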
Abstract:
A method of automatically narrowing data search space and improving accuracy of data extraction using known constraints in a layout of extracted data elements for classified documents is provided. The method includes: analyzing each document to classify it within a document category, each category having a corresponding set of expected layouts; analyzing each electronic document to automatically extract images and text features; automatically constructing a data structure including a layout of the extracted features and layout relationships amongst the extracted features, wherein each of the extracted features in the layout maintains a reference to neighboring features and wherein closely related features are merged to form a combined feature; automatically narrowing data search space by detecting and removing parts of the layout that are not associated with any data elements using the data structure; and automatically detecting data using the extracted feature layout and the layout relationships amongst the extracted features.
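The layout data structure described above, with neighbor references and merged features, might be sketched for a single text line as follows. The `Feature` class, the `merge_gap` parameter, and the lookup helper `value_right_of` are hypothetical, and the sketch covers only merging, neighbor linking, and detection through a layout relationship.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    text: str
    x0: int  # left edge
    x1: int  # right edge
    left: Optional["Feature"] = field(default=None, repr=False, compare=False)
    right: Optional["Feature"] = field(default=None, repr=False, compare=False)


def build_line(features, merge_gap):
    """Merge features separated by at most `merge_gap`, then link neighbors."""
    merged = []
    for feat in sorted(features, key=lambda f: f.x0):
        if merged and feat.x0 - merged[-1].x1 <= merge_gap:
            prev = merged[-1]
            # Closely related features become one combined feature.
            merged[-1] = Feature(prev.text + " " + feat.text, prev.x0, max(prev.x1, feat.x1))
        else:
            merged.append(Feature(feat.text, feat.x0, feat.x1))
    for a, b in zip(merged, merged[1:]):
        a.right, b.left = b, a  # each feature keeps references to its neighbors
    return merged


def value_right_of(line, label):
    """Detect a data value through its layout relationship to a label."""
    for feat in line:
        if feat.text == label and feat.right is not None:
            return feat.right.text
    return None


line = build_line(
    [Feature("Total", 0, 40), Feature("Due", 44, 70), Feature("$120.00", 200, 260)],
    merge_gap=8,
)
amount = value_right_of(line, "Total Due")
```

Because "Total" and "Due" are merged first, the lookup only has to follow one neighbor reference from the combined label instead of scanning the whole page.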