Abstract:
A method performed by a processing system is provided. The method comprises detecting an artifact in a first frame of a digital video using a plurality of edges identified in the first frame and replacing a region that encompasses the artifact in the first frame with a corresponding region from a second frame.
Abstract:
Methods, machines, and computer-readable media storing machine-readable instructions for segmenting pixels in an image are described. In one aspect, a region of background pixels is identified in the image. At least some of the background pixels in the region are located on a boundary spatially delimiting the region. One or more orientation-dependent adaptive thresholds are determined for one or more respective candidate growth directions from a given background pixel located on the region boundary. Color distances between the given background pixel and candidate pixels in a neighborhood of the given background pixel are determined. The region is grown based on application of the one or more orientation-dependent adaptive thresholds to the determined color distances.
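The growth step can be illustrated with a minimal sketch. The abstract does not say how the orientation-dependent thresholds are derived, so this example assumes the caller supplies one threshold per growth direction; the names `grow_region`, `color_distance`, and `DIRECTIONS`, the use of Euclidean RGB distance, and the 4-neighborhood are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical candidate growth directions (4-neighborhood), as (row, col) offsets.
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def color_distance(a, b):
    """Euclidean distance between two RGB tuples (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def grow_region(image, seeds, thresholds):
    """Grow a background region from seed pixels.

    image      -- list of rows, each a list of (r, g, b) tuples
    seeds      -- iterable of (row, col) pixels already known to be background
    thresholds -- dict mapping a direction name to its maximum color distance
                  (supplied by the caller here; the abstract adapts them)
    """
    rows, cols = len(image), len(image[0])
    region = set(seeds)
    frontier = deque(region)
    while frontier:
        r, c = frontier.popleft()
        for name, (dr, dc) in DIRECTIONS.items():
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in region:
                continue
            # Apply the threshold that belongs to this growth direction.
            if color_distance(image[r][c], image[nr][nc]) <= thresholds[name]:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region
```

Using a looser threshold in one direction lets the region cross a gradual color change in that direction while still stopping at sharp changes in the others.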
Abstract:
Systems and methods according to the present invention provide techniques to automatically insert an object from one image into a region of another image. The systems and methods require little or no user interaction to allow efficient re-use and updating of existing images, presentations, documents and the like. An object and a container region are identified. Feasible placement location(s) within the container region for the object, as well as an associated scale factor, are determined. If multiple feasible placement locations are identified for a particular scale factor, then one is selected based upon predetermined criteria. The object can then be inserted into the container region and the resulting composite image stored or, alternatively, parameters can be stored which enable object insertion at a subsequent processing step.
Abstract:
A method analyzes an image to be scanned and analyzes at least part of the image pixel-by-pixel. During or after a preview scan, a characteristic is assigned to a plurality of pixels in the image and pixels are grouped according to similar characteristics. A representation of at least one of the characteristics corresponding to a group of pixels is communicated to the scanner. For example, the pixels may be analyzed to determine if a pixel is black or white. The pixels may also be analyzed to determine if a pixel is on an edge between black and white. Black pixels that are adjacent to each other can be grouped together, and white pixels that are adjacent to each other can also be grouped together. A region of an image with a relatively high number of black and white groups can be characterized as black and white text only. That characterization can then be used to properly set a scanner, for example, without user intervention, so that the final scan of the image can be done at 300 dpi with a low bit depth.
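The grouping of adjacent black and white pixels can be sketched as connected-component counting. This is only an illustration under stated assumptions: the function name `count_groups`, the fixed gray-level threshold of 128, and the use of 4-connectivity are not taken from the abstract.

```python
from collections import deque


def count_groups(image, threshold=128):
    """Count groups of adjacent black pixels and adjacent white pixels.

    image     -- list of rows of grayscale values (0-255)
    threshold -- assumed cut-off: values below it are treated as black
    Returns (black_groups, white_groups) using 4-connectivity.
    """
    rows, cols = len(image), len(image[0])
    is_black = [[pixel < threshold for pixel in row] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    counts = {True: 0, False: 0}  # keyed by "is this group black?"

    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            colour = is_black[r][c]
            counts[colour] += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            # Flood-fill every pixel connected to (r, c) with the same colour.
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and is_black[nr][nc] == colour):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return counts[True], counts[False]
```

A region whose group counts are high relative to its area could then be flagged as text, although the cut-off for "relatively high" is left open by the abstract.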
Abstract:
Segmenting a web page (110) into coherent function blocks (705-1 to 705-8) includes parsing content from the web page (110) into multiple coherent, collectively exhaustive nodes (405-1 to 405-37); calculating at least one matrix (500, 600, 605-1 to 605-4) of affinity values between each of the nodes (405-1 to 405-37); and clustering the nodes (405-1 to 405-37) into functional blocks (705-1 to 705-8) based on the affinity values in the at least one matrix (500, 600, 605-1 to 605-4).
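The clustering step can be sketched as follows. The abstract does not name a clustering algorithm, so this example substitutes a simple one: nodes whose pairwise affinity meets a threshold are linked, and each connected group becomes a block. The function name `cluster_by_affinity` and the `threshold` parameter are hypothetical.

```python
def cluster_by_affinity(affinity, threshold):
    """Group node indices into blocks from a symmetric affinity matrix.

    affinity  -- square list of lists, affinity[i][j] between nodes i and j
    threshold -- assumed cut-off: pairs at or above it are placed together
    Returns a sorted list of blocks, each a sorted list of node indices.
    """
    n = len(affinity)
    parent = list(range(n))  # union-find forest over node indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    blocks = {}
    for i in range(n):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())
```

Where several matrices are computed, they would need to be combined into a single matrix (for instance by a weighted sum) before a routine like this is applied; that combination is likewise not specified in the abstract.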
Abstract:
A system and method for selectively filtering web page contents are disclosed. In one example embodiment a document object model (DOM) structure and visual information of the web page contents are generated. The document object model (DOM) structure and the visual information are analyzed to determine multiple web page content attributes. One or more filtering parameters are selected from the multiple web page content attributes. The web page is filtered based on the one or more filtering parameters.
Abstract:
A method for producing web page content includes identifying blocks within a web page. The blocks are selectively assembled into sections. The sections are selectively assembled into article candidates. An article candidate that includes article content is distinguished from article candidates that do not include article content. Content is produced only from the article candidate distinguished as including article content.
Abstract:
A method and system for extracting Web content is disclosed. In one embodiment, Web content in a Webpage is extracted by identifying paragraphs in the Web content based on line-break node determination. A range of text-body associated with the identified paragraphs is then identified using a maximum scoring subsequence. Further, the identified text-body is refined using a heuristic rule of substantially horizontal alignment. Furthermore, one or more titles and one or more images associated with the Web content are extracted. Moreover, the Web content, including the identified paragraphs, the one or more titles and the one or more images, is output.
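A maximum scoring subsequence is the contiguous run of items whose scores sum to the highest total. The sketch below assumes each paragraph has already been given a score that is positive when it looks like body text and negative otherwise; the scoring scheme and the function name `max_scoring_subsequence` are assumptions, not details from the abstract.

```python
def max_scoring_subsequence(scores):
    """Find the contiguous run of paragraphs with the highest total score.

    scores -- list of per-paragraph scores (positive = likely body text)
    Returns (start, end, total) with `end` exclusive; (0, 0, 0.0) if no
    run has a positive total.
    """
    best_total = 0.0
    best_start = best_end = 0
    run_total = 0.0
    run_start = 0
    for i, score in enumerate(scores):
        if run_total <= 0:
            # A non-positive prefix can only hurt, so restart the run here.
            run_total = score
            run_start = i
        else:
            run_total += score
        if run_total > best_total:
            best_total = run_total
            best_start, best_end = run_start, i + 1
    return best_start, best_end, best_total
```

For scores `[-1.0, 2.0, 3.0, -0.5, 4.0, -6.0, 1.0]` the function returns `(1, 5, 8.5)`: paragraphs 1 through 4 form the text body, and the weakly negative paragraph at index 3 is kept because the run around it outweighs it.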
Abstract:
A system and method are provided for segmenting text from a portable document format (PDF) document. The system includes a memory for storing computer executable instructions and a processing unit for accessing the memory and executing the computer executable instructions. The computer executable instructions include an engine to group line segments into text blocks using a homogeneity measure based on relative line space difference between line segments and a homogeneity measure based on difference in font size between line segments, where the line segments comprise text elements extracted from the PDF document.
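One way to picture the two homogeneity measures is a single top-to-bottom pass over the line segments. The sketch below is an illustration under assumptions: the function name `group_lines`, the tolerance values, the representation of a line as a (vertical position, font size) pair, and the exact formulas for the relative differences are all hypothetical.

```python
def group_lines(lines, space_tol=0.3, font_tol=0.1):
    """Group line segments into text blocks.

    lines     -- list of (y_position, font_size), sorted top to bottom with
                 strictly increasing y_position
    space_tol -- assumed limit on relative change in line spacing
    font_tol  -- assumed limit on relative change in font size
    Returns a list of blocks, each a list of line indices.
    """
    if not lines:
        return []
    blocks = [[0]]
    prev_gap = None  # spacing between the last two lines of the current block
    for i in range(1, len(lines)):
        y_prev, font_prev = lines[i - 1]
        y, font = lines[i]
        gap = y - y_prev
        font_diff = abs(font - font_prev) / max(font, font_prev)
        if prev_gap is None:
            # A block with one line has no spacing to compare against yet.
            space_diff = 0.0
        else:
            space_diff = abs(gap - prev_gap) / max(gap, prev_gap)
        if font_diff <= font_tol and space_diff <= space_tol:
            blocks[-1].append(i)
            prev_gap = gap
        else:
            blocks.append([i])
            prev_gap = None
    return blocks
```

Because a one-line block accepts any spacing to its next line, this simple pass can merge a heading with the body below it when both use the same font size; a fuller implementation would need an additional check for that case.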
Abstract:
Systems and methods according to the present invention provide techniques to reliably detect edges, lines and quadrilaterals, especially those with low local contrast, in color images. Edges can be detected using a color gradient operator that is based on color distance with a non-linear weight determined by the consistency of local gradient orientations, thereby significantly improving the signal/noise ratio. In detecting lines, a variant of the Gradient Weighted Hough Transform can be used employing both the edge strength and orientation. Multiple non-overlapping quadrilaterals can be detected using a process which includes quality metrics (for both individual quadrilaterals and for a set of non-overlapping quadrilaterals) and a graph-searching method.