Abstract:
A method for removal of punched hole artifacts in digital images includes, for a scanned document page, deriving an original digital image that defines the page in terms of a plurality of input pixels. A reduced-resolution bitonal image is generated from the original image. The method further includes identifying candidate punched hole artifacts in the reduced-resolution bitonal image and testing the candidates for at least one of shape, size, and location. Where a candidate punched hole artifact meets the at least one test, the method generates a modified image by erasing the candidate punched hole artifact from the original digital image.
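The steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names, the size limits, the fill-ratio cutoff, and the margin fraction are all hypothetical, and the sketch requires all three tests to pass where the abstract only requires at least one. The bitonal image is assumed to be a list of rows with 1 for dark pixels, and `scale` is the assumed integer reduction factor between the original and the bitonal image.

```python
from collections import deque

def find_components(bitonal):
    """Return bounding box and pixel count of each 4-connected dark region."""
    h, w = len(bitonal), len(bitonal[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if bitonal[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            y0 = y1 = y
            x0 = x1 = x
            count = 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                y0, y1 = min(y0, cy), max(y1, cy)
                x0, x1 = min(x0, cx), max(x1, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and bitonal[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            comps.append({"box": (y0, x0, y1, x1), "area": count})
    return comps

def looks_like_hole(comp, width, min_d=3, max_d=8, margin_frac=0.2):
    """Apply assumed size, shape, and location tests to one candidate."""
    y0, x0, y1, x1 = comp["box"]
    bh, bw = y1 - y0 + 1, x1 - x0 + 1
    size_ok = min_d <= bh <= max_d and min_d <= bw <= max_d
    # Shape: near-square bounding box that is mostly filled (assumed cutoff).
    shape_ok = abs(bh - bw) <= 1 and comp["area"] / (bh * bw) >= 0.6
    # Location: entirely inside the left or right margin band.
    margin = width * margin_frac
    location_ok = x1 < margin or x0 >= width - margin
    return size_ok and shape_ok and location_ok

def erase_holes(original, bitonal, scale, background=255):
    """Return a copy of the original with accepted candidates painted out."""
    out = [row[:] for row in original]
    for comp in find_components(bitonal):
        if looks_like_hole(comp, len(bitonal[0])):
            y0, x0, y1, x1 = comp["box"]
            # Map the low-resolution box back to original pixel coordinates.
            for y in range(y0 * scale, min((y1 + 1) * scale, len(out))):
                for x in range(x0 * scale, min((x1 + 1) * scale, len(out[0]))):
                    out[y][x] = background
    return out
```

Working on the reduced-resolution image keeps the component search cheap, and only the accepted bounding boxes are mapped back to the full-resolution page.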
Abstract:
A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file. The semantic recognition process includes the processes of generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword therein; generating, for each line of text not having a keyword therein and absent of numeric characters, an alphabetic terminal symbol; generating, for each line of text not having a keyword therein and having a numeric character therein, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields of business document information text based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.
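The terminal-symbol step of the semantic recognition process can be illustrated with a short sketch. The keyword table and the symbol names (`PHONE`, `FAX`, `EMAIL`, `ALPHA`, `ALNUM`) are hypothetical and chosen only for illustration; the abstract does not list them. The probabilistic parsing and non-terminal labeling steps are not shown.

```python
# Hypothetical keyword table: lowercase keyword -> terminal symbol.
KEYWORDS = {"phone": "PHONE", "fax": "FAX", "email": "EMAIL"}

def terminal_symbol(line):
    """Map one text line to a terminal symbol using the three rules above."""
    lowered = line.lower()
    for keyword, symbol in KEYWORDS.items():
        if keyword in lowered:
            return symbol          # line contains a keyword
    if any(ch.isdigit() for ch in line):
        return "ALNUM"             # no keyword, has a numeric character
    return "ALPHA"                 # no keyword, no numeric characters

def terminal_string(lines):
    """Build the string of terminal symbols for the non-empty lines."""
    return [terminal_symbol(line) for line in lines if line.strip()]

card = [
    "Jane Doe",
    "Acme Corp",
    "123 Main Street",
    "Phone: 555-0100",
    "Email: jane@example.com",
]
symbols = terminal_string(card)
# symbols == ["ALPHA", "ALPHA", "ALNUM", "PHONE", "EMAIL"]
```

A parser would then take this symbol sequence as input and choose the most probable parse, which is what assigns fields such as name or address to the individual lines.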
Abstract:
A method is provided for varying the color of an image that includes lines and background. Where the image includes the colors black and white and a plurality of gray pixels, where gray refers to the presence of pixel values between the maximum and minimum pixel values, inclusive, the image is first converted to a color space such as, for example, RGB (red-green-blue). Pixel values are thresholded to differentiate between lines and background. When a pixel has a value indicating that it is background, that pixel is set to a previously selected background color. Otherwise, that pixel is set to a foreground color. The result is that the background is set to a single color and the lines are set to a second color. Alternatively, where intermediate values are present, the foreground color value may be added to the intermediate-level color value to produce a gradually varying colored line.
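Both variants described above can be sketched in a few lines. The sketch assumes 8-bit gray values, that values at or above the threshold are background (light paper, dark lines), and that sums are clamped to 255; none of these details are stated in the abstract, and the function names are hypothetical.

```python
def recolor(gray, threshold, background, foreground):
    """Two-color output: background color above the threshold, else foreground."""
    return [
        [background if v >= threshold else foreground for v in row]
        for row in gray
    ]

def recolor_gradual(gray, threshold, background, foreground):
    """Variant: add the foreground color to the gray level of line pixels."""
    out = []
    for row in gray:
        new_row = []
        for v in row:
            if v >= threshold:
                new_row.append(background)
            else:
                # Gray level v is (v, v, v) in RGB; add the foreground and clamp.
                new_row.append(tuple(min(255, v + c) for c in foreground))
        out.append(new_row)
    return out

page = [[255, 0, 128]]            # white paper, black line, gray edge pixel
paper = (255, 255, 200)           # assumed background color
ink = (200, 0, 0)                 # assumed foreground color
flat = recolor(page, 200, paper, ink)
soft = recolor_gradual(page, 200, paper, ink)
# flat == [[(255, 255, 200), (200, 0, 0), (200, 0, 0)]]
# soft == [[(255, 255, 200), (200, 0, 0), (255, 128, 128)]]
```

In the gradual variant the gray edge pixel becomes a lighter tint of the foreground color, which preserves the anti-aliased look of the original line.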
Abstract:
A system and method are disclosed for an image processing system including a threaded scheduler providing compact and efficient dataflow as a pipeline management and data flow layer.
Abstract:
The disclosed embodiment relates to methods and systems for evaluating an electronic document. The computer-implemented method includes receiving the electronic document containing a first set of answers corresponding to one or more pre-stored questions. The first set of answers is compared with a pre-stored second set of answers based on an answer descriptor syntax dataset. The answer descriptor syntax dataset comprises one or more rules. One or more answer descriptors for each of the first set of answers are determined based on the comparing. The one or more answer descriptors correspond to one or more observations for each of the first set of answers. Finally, the electronic document is evaluated based on the determined answer descriptors.
Abstract:
An image file representing at least a portion of a printed document is processed to highlight the differences between foreground material (e.g., text or other characters) and background. The method includes selecting a neighborhood of pixels, determining a weighted average of an attribute value (e.g., luminance) for each pixel, and modifying each pixel's value based on the weighted average. Graylevel scaling, error diffusion, and a bit-level conversion are also performed so that each pixel ends up with either a first attribute value level (e.g., luminance of 0) or a second attribute value level (e.g., luminance of 255).
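One possible reading of the neighborhood step is sketched below. The abstract does not say how the weighted average modifies a pixel, so the sketch assumes an unsharp-mask style boost (pushing each pixel away from its local average); the 3x3 kernel, the gain, and the threshold are likewise assumptions. A plain threshold stands in for the graylevel scaling, error diffusion, and bit-level conversion steps, which are not reproduced here.

```python
# Assumed 3x3 weights: center counts most, corners least.
WEIGHTS = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]

def weighted_average(img, y, x):
    """Weighted mean luminance of the 3x3 neighborhood, clipped at the borders."""
    h, w = len(img), len(img[0])
    total = weight_sum = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                wgt = WEIGHTS[dy + 1][dx + 1]
                total += wgt * img[ny][nx]
                weight_sum += wgt
    return total / weight_sum

def binarize(img, gain=2.0, threshold=128):
    """Boost each pixel away from its local average, then map to 0 or 255."""
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, v in enumerate(row):
            boosted = v + gain * (v - weighted_average(img, y, x))
            new_row.append(255 if boosted >= threshold else 0)
        out.append(new_row)
    return out

# A faint mark (140) on light paper (200) would survive a plain 128 threshold
# as background, but the local boost pushes it to foreground.
faint = [[200, 200, 200], [200, 140, 200], [200, 200, 200]]
result = binarize(faint)
# result == [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
```

Comparing each pixel with its surroundings rather than with a global cutoff is what lets faint strokes stand out on unevenly lit or discolored pages.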
Abstract:
A method and system for processing a digital assessment template are provided. The system includes at least one tangible processor and a memory with instructions to be executed by the at least one tangible processor for processing a digital assessment template. The template includes a description of a plurality of data structures that are configured for interpreting an assessment associated with the template. The assessment was marked with strokes by an assessment-taker who was administered the assessment and responded to at least one problem provided by the assessment. The template describes a location of the marked assessment in which to find each stroke that corresponds to a response by the assessment-taker and how to interpret the strokes. Each of the locations and how to interpret the strokes are selectable.