Abstract:
Disclosed is a method of blending stitched document image portions. The method identifies background pixels and foreground pixels on each boundary of the image portions. Pixels of the image portions are then modified based on a pixel value difference between corresponding background pixels on the respective boundary of the first and second portions.
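The blending step can be illustrated with a minimal sketch. It assumes two grayscale portions stacked vertically, represented as lists of rows with values from 0 to 255, and a light background detected by a simple threshold. The function name, the threshold, and the choice of a uniform offset are illustrative assumptions and are not taken from the disclosure.

```python
def blend_offset(top, bottom, bg_threshold=128):
    """Return a copy of `bottom` shifted so its boundary background matches `top`.

    Assumption: a pixel is background when its value is >= bg_threshold.
    """
    edge_top = top[-1]       # boundary row of the first portion
    edge_bottom = bottom[0]  # boundary row of the second portion

    # Differences between corresponding pixels that are background on both sides.
    diffs = [
        a - b
        for a, b in zip(edge_top, edge_bottom)
        if a >= bg_threshold and b >= bg_threshold
    ]
    if not diffs:
        # No shared background pixels on the boundary: leave the portion unchanged.
        return [row[:] for row in bottom]

    offset = round(sum(diffs) / len(diffs))
    # Apply the offset to every pixel, clamped to the valid range.
    return [[min(255, max(0, p + offset)) for p in row] for row in bottom]


top = [[200, 200, 30, 200]]
bottom = [[190, 190, 20, 190], [190, 40, 190, 190]]
blended = blend_offset(top, bottom)
```

A practical implementation would probably attenuate the correction with distance from the boundary rather than shift the whole portion; the uniform shift here only keeps the example short.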
Abstract:
A system and method of selecting content within a web page (110, 300) may include, with a processor (125), determining spatial coordinates of a plurality of nodes (210 through 285) within the web page (110, 300), recording coordinates of a drawn portion (610) of the web page (110, 300), and determining, with the processor (125), a number of corresponding regions (710, 910) for the drawn portion (610) of the web page (110, 300) based on the spatial coordinates of the nodes (210 through 285).
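One way to picture the mapping from a drawn portion to page regions is the sketch below. It assumes each node has already been reduced to an axis-aligned bounding box and that the drawn portion is a list of (x, y) points; `Box`, `stroke_bounds`, `overlaps`, and `regions_for_stroke` are hypothetical names, and simple box overlap is an assumed stand-in for whatever region test the system actually uses.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def stroke_bounds(points):
    """Bounding box of the recorded coordinates of the drawn portion."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def overlaps(a, b):
    """True when two boxes share a region of non-zero area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def regions_for_stroke(nodes, points):
    """Names of the nodes whose boxes overlap the drawn portion."""
    drawn = stroke_bounds(points)
    return [name for name, box in nodes.items() if overlaps(box, drawn)]


nodes = {
    "header": Box(0, 0, 100, 20),
    "article": Box(0, 20, 70, 200),
    "sidebar": Box(70, 20, 100, 200),
}
stroke = [(10, 30), (60, 30), (60, 150), (10, 150)]
selected = regions_for_stroke(nodes, stroke)
```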
Abstract:
A system and method for selecting main content (350) from web pages includes receiving a web page (205) by a web page analysis device (105) and scoring sub-trees (209) within the web page (205). The single sub-tree (225) with the highest final score is selected as the main content (350) of the web page (205).
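The score-then-select structure can be sketched as follows. The abstract does not state the scoring function, so the one used here (characters of non-link text held by a node and its immediate children) is purely an assumption for illustration, as are the `Node` class and the function names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def score(node):
    """Assumed score: non-link text length of the node and its direct children."""
    own = 0 if node.tag == "a" else len(node.text)
    return own + sum(len(c.text) for c in node.children if c.tag != "a")


def iter_subtrees(node):
    """Yield the root of every sub-tree, depth first."""
    yield node
    for child in node.children:
        yield from iter_subtrees(child)


def select_main(root):
    """Return the root of the single sub-tree with the highest score."""
    return max(iter_subtrees(root), key=score)


nav = Node("div", children=[Node("a", "Home"), Node("a", "About")])
article = Node("div", children=[
    Node("p", "First paragraph of the story."),
    Node("p", "Second paragraph."),
])
page = Node("body", children=[nav, article])
main = select_main(page)
```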
Abstract:
Disclosed is a method for embedding, using a processor, digital watermark data into image data representing a number of pixels, each of which has a respective saturation value. The method comprises the following steps: a) using said processor, dividing the image into blocks of pixels of a predefined size; b) for each block, using said processor to select one of a plurality of saturation patterns representing the binary value of one or more bits of the digital watermark data corresponding to the block; and c) for each block, using said processor to embed the binary value of the one or more bits of corresponding digital watermark data into the block by adjusting the saturation of pixels within each block in accordance with the selected pattern.
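Steps a) to c) can be sketched as below, assuming one bit per block, a saturation channel stored as a list of rows with values from 0 to 255, and two hypothetical patterns: bit 1 raises the left half of a block and lowers the right half, and bit 0 does the reverse. The block size, the `delta` strength, and the patterns themselves are assumptions, not details from the disclosure.

```python
def embed(sat, bits, block=2, delta=4):
    """Return a copy of `sat` with one watermark bit embedded per block."""
    blocks_per_row = len(sat[0]) // block
    blocks_per_col = len(sat) // block
    if len(bits) > blocks_per_row * blocks_per_col:
        raise ValueError("more bits than blocks")

    out = [row[:] for row in sat]
    for i, bit in enumerate(bits):
        # a) locate the block, counting blocks row by row
        by, bx = divmod(i, blocks_per_row)
        for dy in range(block):
            for dx in range(block):
                # b) select the pattern from the bit value
                left_half = dx < block // 2
                sign = 1 if left_half == bool(bit) else -1
                # c) adjust saturation according to the pattern, clamped
                y, x = by * block + dy, bx * block + dx
                out[y][x] = min(255, max(0, out[y][x] + sign * delta))
    return out


flat = [[100] * 4 for _ in range(4)]
marked = embed(flat, [1, 0, 0, 1])
```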
Abstract:
A system and method for selectively filtering web page contents are disclosed. In one example embodiment, a document object model (DOM) structure and visual information of the web page contents are generated. The DOM structure and the visual information are analyzed to determine multiple web page content attributes. One or more filtering parameters are selected from the multiple web page content attributes. The web page is filtered based on the one or more filtering parameters.
Abstract:
A system and method are provided for extracting main content from a web page. Web page segmentation is performed on a web page to provide affinity-grouped segments. Descriptive features of at least one of the affinity-grouped segments are computed. At least one of the affinity-grouped segments is classified as a main body segment based on the computed descriptive features. Additional affinity-grouped segments are classified by document function based on the computed descriptive features. Classified affinity-grouped segments are assembled according to their classified document functions to provide the main content.
Abstract:
A method for selecting user-desirable content from web pages includes receiving a web page, representing the web page as a Document Object Model (DOM) tree, computing visual and coordinate information of each DOM node within the DOM tree, determining the desirable DOM path, determining the desirable DOM node from the desirable DOM path, and selecting a single DOM node with the highest final score. The single DOM node with the highest final score is selected as the user-desirable content of the web page.
Abstract:
An exemplary embodiment may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.
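A rough sketch of the signal idea follows. It assumes the DOM-tree has already been flattened into a sequence of tag names, that the signal is the index of each tag in the node list, and that a repeat pattern shows up as the smallest lag at which the signal best matches a shifted copy of itself. The lag-matching approach and the names `dom_signal` and `best_period` are assumptions chosen for illustration, not the embodiment's actual analysis.

```python
def dom_signal(tags, node_list):
    """Map each tag to its index in the node list.

    Raises ValueError if a tag is missing from the node list.
    """
    return [node_list.index(t) for t in tags]


def best_period(signal, max_lag=None):
    """Smallest lag with the highest fraction of matching samples (0 if none)."""
    n = len(signal)
    max_lag = max_lag or n // 2
    best_lag, best_ratio = 0, 0.0
    for lag in range(1, max_lag + 1):
        matches = sum(1 for i in range(n - lag) if signal[i] == signal[i + lag])
        ratio = matches / (n - lag)
        if ratio > best_ratio:  # strict comparison keeps the smallest lag on ties
            best_lag, best_ratio = lag, ratio
    return best_lag


tags = ["li", "a", "span"] * 4          # a list of four structurally identical items
signal = dom_signal(tags, ["li", "a", "span"])
period = best_period(signal)
```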
Abstract:
Proposed is the use of an email stamp for representing an email address. Because an email stamp comprises information about one or more email addresses of a recipient, it may be processed in accordance with an optical recognition process so as to identify the email address of the recipient and enable an email to be automatically sent to the recipient.
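The step after optical recognition can be sketched as below. The sketch assumes the recognition process has already produced plain text from the stamp, and it only shows pulling addresses out of that text and building a mailto URI. The regular expression is a deliberately simple approximation of an email address, and both function names are hypothetical.

```python
import re
from urllib.parse import quote

# Simplified address pattern; real address syntax is considerably broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def addresses_from_stamp_text(text):
    """Return every address-like string found in the recognized stamp text."""
    return EMAIL_RE.findall(text)


def mailto_uri(address, subject=""):
    """Build a mailto URI that a mail client could open to start a message."""
    uri = "mailto:" + address
    if subject:
        uri += "?subject=" + quote(subject)
    return uri


recognized = "Contact: alice@example.com / bob@example.org"
found = addresses_from_stamp_text(recognized)
link = mailto_uri(found[0], "Hello there")
```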