Abstract:
The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify a set of graphemes that most likely correspond to each symbol image that occurs in the scanned document image. The graphemes are selected for a symbol image based on accumulated votes generated from symbol patterns identified as likely related to the symbol image using one or more decision forests.
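The vote-accumulation step described above can be illustrated with a minimal sketch. It assumes the decision forests have already produced a list of matched symbol-pattern identifiers for one symbol image. The function name `accumulate_votes`, the `pattern_to_graphemes` mapping, and the one-vote-per-pattern rule are hypothetical and are not taken from the abstract.

```python
from collections import Counter


def accumulate_votes(matched_patterns, pattern_to_graphemes, top_k=3):
    """Return the top_k graphemes ranked by accumulated votes.

    matched_patterns: pattern ids judged likely related to the symbol image
        (assumed here to come from one or more decision forests).
    pattern_to_graphemes: hypothetical mapping from a pattern id to the
        graphemes that the pattern votes for.
    """
    votes = Counter()
    for pattern_id in matched_patterns:
        # Assumption: each matched pattern casts one vote per grapheme.
        for grapheme in pattern_to_graphemes.get(pattern_id, ()):
            votes[grapheme] += 1
    return [grapheme for grapheme, _ in votes.most_common(top_k)]


# Illustrative data only.
mapping = {"p1": ["日", "目"], "p2": ["日"], "p3": ["日", "目", "白"]}
candidates = accumulate_votes(["p1", "p2", "p3"], mapping, top_k=2)
```

Here `candidates` holds the two graphemes with the most votes, which a later, more expensive recognition step could then examine in place of the full symbol set.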
Abstract:
Methods and devices are described for detecting boundaries of documents on flatbed and multi-function scanners on a first pass of a carriage assembly, and then performing a high resolution scan on a second pass. High resolution images of documents can then be obtained with little or none of the user interaction normally necessary to identify areas of interest on the scanner bed. Patterns on the scanner cover or lid facilitate not only edge determination, but also orientation of text and other objects, and straightening of images in preparation for OCR and related functions. Electronic images and files derived from paper documents may be automatically cropped, deskewed, subjected to OCR, and named consistent with content or other information derived from them.
Abstract:
Disclosed are implementations of methods and systems for displaying definitions and translations of words by searching for a translation simultaneously in various languages according to a query in a general language dictionary. The invention removes the need to specify a source language for the word or word combination when it is translated into a target language. The target language may be preset. Translation is possible for word combinations in multiple source languages. Source words may be entered manually or captured by an imaging component of an electronic device. When captured, a word combination is selected and subjected to optical character recognition (OCR) and translation. The source language and OCR language may be suggested via geolocation of the electronic device.
Abstract:
An algorithm for assigning priorities to tasks that users queue for processing, based on how heavily each task's user has used the system resources in the past, including the number of tasks queued by the user in the past, the volume of these tasks, and the amount of processor time used. In the OCR context, the tasks are graphic files placed on servers and chosen for processing in accordance with the assigned priorities.
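As a rough sketch of such usage-based prioritization, the example below combines the three past-usage factors into one score and orders queued files by it. The weights, the linear formula, the `UsageHistory` fields, and the choice to serve lighter users first are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class UsageHistory:
    tasks_queued: int     # number of tasks the user queued in the past
    pages_submitted: int  # volume of those tasks (assumed unit: pages)
    cpu_seconds: float    # processor time consumed by those tasks


def usage_score(h, w_tasks=1.0, w_volume=0.1, w_cpu=0.01):
    """Hypothetical linear combination of the three usage factors."""
    return (w_tasks * h.tasks_queued
            + w_volume * h.pages_submitted
            + w_cpu * h.cpu_seconds)


def order_tasks(tasks, history):
    """Order (file_name, user) pairs so that users with lighter past
    usage are processed first (an assumed policy)."""
    return sorted(tasks, key=lambda task: usage_score(history[task[1]]))


# Illustrative data only.
history = {
    "alice": UsageHistory(tasks_queued=100, pages_submitted=5000, cpu_seconds=36000.0),
    "bob": UsageHistory(tasks_queued=2, pages_submitted=10, cpu_seconds=30.0),
}
queue = order_tasks([("a.tif", "alice"), ("b.tif", "bob")], history)
```

With this data, `queue` places the file from the lighter user `bob` ahead of the file from `alice`.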
Abstract:
A computer method and an electronic device enable a user to look up words and insert new words in a text based on the results of the lookup. The method executed by the device includes: providing a user with a capability to select at least one word in a text displayed on the screen of the device; performing a dictionary lookup of the identified word so as to determine translation alternatives of the identified word; displaying at least some of the translation alternatives; selecting one of the displayed alternatives; determining its word forms, wherein the word forms consist of gender, number, grammatical tense and grammatical variations of the same word; selecting one of the word forms; and inserting the selected word form in the text.
Abstract:
Embodiments of the present invention disclose a dictionary lookup method and an electronic device that implements the dictionary lookup method. The dictionary lookup method allows a user to quickly obtain meanings and translations of words from electronic dictionaries while reading a text on a display screen of the electronic device, wherein the text is read by performing optical character recognition that includes determining a set of base forms of each inflected recognized word. Advantageously, in one embodiment, the meanings (e.g., the base forms) and translations may be displayed in a balloon, in a pop-up window, as subscript, as superscript, or in any other suitable manner when the user touches a word on the display screen.
Abstract:
A method for detecting a junction in a received image of the line of text to update a junction list with descriptive data is provided. The method includes creating a color histogram based on a number of color pixels in the received image of the line of text and detecting, based at least in part on the received image of the line of text, a rung within the received image of the line of text. The method also includes identifying a horizontal position of the detected rung in the received image of the line of text and identifying a gateway on the color histogram, wherein the identified gateway is associated with the detected rung. The junction list is updated with data including a description of the identified gateway.
Abstract:
Detecting blur and defocusing in images is described. After detection, correction algorithms are applied. Detection provides an image processing system with parameters related to a blur (e.g., direction, strength) and noise levels, or may trigger a message to a user to re-take a photograph. Detection involves finding and analyzing edges of objects instead of an entire image. The disclosed detector may be used for OCR purposes and for blur and defocusing detection in photographic and scanning devices, video cameras, print quality control systems, and computer vision. Detection of blur and defocusing of an image involves second derivatives of image brightness. Object edges are detected. For points on edges, profiles of the second derivative are obtained in the direction of the gradient. Statistics are gathered about parameters of profiles in various directions. By analyzing the statistics, image distortions and their type (e.g., blur, defocusing), the strength of the distortion, and the direction of the blur are detected.
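The role of second-derivative profiles can be illustrated with a simplified one-dimensional sketch. It assumes a brightness profile has already been sampled across an edge along the gradient direction, and it uses the distance between the extrema of the discrete second derivative as a stand-in for edge width. The function names and this particular width measure are illustrative assumptions, not the parameters used by the disclosed detector.

```python
def second_derivative(profile):
    """Discrete second derivative of a 1-D brightness profile."""
    return [profile[i - 1] - 2 * profile[i] + profile[i + 1]
            for i in range(1, len(profile) - 1)]


def edge_width(profile):
    """Distance in samples between the maximum and minimum of the second
    derivative. A wider gap suggests a more smeared edge (assumed measure)."""
    d2 = second_derivative(profile)
    i_max = max(range(len(d2)), key=d2.__getitem__)
    i_min = min(range(len(d2)), key=d2.__getitem__)
    return abs(i_min - i_max)


# Illustrative profiles: a step edge and the same edge spread over more samples.
sharp = [0, 0, 0, 0, 10, 10, 10, 10]
blurred = [0, 0, 1, 3, 5, 7, 9, 10, 10]
sharp_width = edge_width(sharp)
blurred_width = edge_width(blurred)
```

Collecting such widths for many edge points, grouped by gradient direction, is one way the statistics mentioned above could be built up. Widths that are large in every direction would suggest defocus, while widths that are large mainly along one direction would suggest motion blur.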
Abstract:
Embodiments of the present invention disclose a copying method that combines optical character recognition (OCR) technology and a search in order to improve the quality of a copy despite the presence of degrading factors. In one embodiment, the search comprises an Internet search and is used to reconstruct/enhance the copy digitally before outputting the copy to print or some other digital medium. Advantageously, a copy produced using the techniques of the present invention may be at least equal to if not better than the original document copied.