摘要:
A method for detecting a text region in an image is disclosed. The method includes detecting a candidate text region from an input image. A set of oriented gradient images is generated from the candidate text region, and one or more detection window images of the candidate text region are captured. A sum of oriented gradients is then calculated for a region in one of the oriented gradient images. It is classified whether each detection window image contains text by comparing the associated sum of oriented gradients and a threshold. Based on the classifications of the detection window images, it is determined whether the candidate text region is a true text region.
摘要:
A method for detecting a text region in an image is disclosed. The method includes detecting a candidate text region from an input image. A set of oriented gradient images is generated from the candidate text region, and one or more detection window images of the candidate text region are captured. A sum of oriented gradients is then calculated for a region in one of the oriented gradient images. It is classified whether each detection window image contains text by comparing the associated sum of oriented gradients and a threshold. Based on the classifications of the detection window images, it is determined whether the candidate text region is a true text region.
摘要:
A method includes receiving an indication of a set of image regions identified in image data. The method further includes, selecting image regions from the set of image regions for text extraction at least partially based on image region stability.
摘要:
A method of scanning an image of a document with a portable electronic device includes interactively indicating in substantially real time on a user interface of the portable electronic device, an instruction for capturing at least one portion of an image to enhance quality. The indication is in response to identifying degradation associated with the portion(s) of the image. The method also includes capturing the portion(s) of the image with the portable electronic device according to the instruction. The method further includes stitching the captured portion(s) of the image in place of a degraded portion of a reference image corresponding to the document, to create a corrected stitched image of the document.
摘要:
A method includes tracking an object in each of a plurality of frames of video data to generate a tracking result. The method also includes performing object processing of a subset of frames of the plurality of frames selected according to a multi-frame latency of an object detector or an object recognizer. The method includes combining the tracking result with an output of the object processing to produce a combined output.
摘要:
Methods, apparatuses, systems, and computer-readable media for rejecting false positive detection and tracking of image objects are presented. According to one or more aspects, a computing device may implement embodiments of the invention to use the movement of the mobile device for distinguishing false positives from true movement of the mobile device depicted in the field of view of the camera. In one embodiment, the actual movement of the mobile device may be measured using multi-modal sensor data from inertial sensors such as accelerometers and gyroscopes. In another embodiment, the actual movement of the device is calculated using the global movement of the mobile phone with reference to other objects in the field of view of the camera.
摘要:
A method for taking a panorama mosaic photograph includes displaying a partial image of a previously taken image as a guide image on a viewer of an image to be currently taken and taking a number of images constituting the panorama mosaic photograph according to a photography operation; projecting the taken images onto a common cylindrically curved surface; and joining the projected images into a single image.
摘要:
Provided is an apparatus and method for separating music and voice using an independent component analysis method for a two-dimensional forward network. The apparatus of separating music and voice can separate voice signal and a music signal, each of which are independently recorded, from a mixed signal, in a short convergence time by using the independent component analysis method, which estimates a signal mixing process according to a difference in record positions of sensors. Thus, users can easily select accompaniment from their own compact discs(CDs), digital video discs(DVDs), or audio cassette tapes, or FM radio, and listen to music of improved quality in real time. Accordingly, the users can just enjoy the music or sing along. Furthermore, since the independent component analysis method in the apparatus of separating music and voice is simple and time taken to perform the method is not long, the method can be easily used in a digital signal processor (DSP) chip, a microprocessor, or the like.
摘要:
A method for processing a multi-channel image is disclosed. The method includes generating a plurality of grayscale images from the multi-channel image. At least one text region is identified in the plurality of grayscale images and text region information is determined from the at least one text region. The method generates text information of the multi-channel image based on the text region information. If the at least one text region includes a plurality of text regions, text region information from the plurality of text regions is merged to generate the text information. The plurality of the grayscale images is processed in parallel. In identifying the at least one text region, at least one candidate text region may be identified in the plurality of grayscale images and the at least one text region may be identified in the identified candidate text region.
摘要:
Techniques are described for identifying blurred images and recognizing text. One or more images of text may be captured. A change of movement associated with each image of the one or more images may be calculated. The change of movement associated with an image of the one or more images represents a change in an amount of acceleration of the device used to capture the image while the image was being captured. A steady image may be selected from the one or more images to use for text recognition. The steady image can be selected using the variances of acceleration associated with each image of the one or more images.