摘要:
Semantically ranking content in a website (110) with a computerized ranking device (105) includes: parsing content from the website (110) into multiple autonomous content blocks (415-1 to 415-17) with the computerized ranking device (105) and assigning an importance ranking with said computerized ranking device (105) to each of the content blocks (415-1 to 415-17) based on a degree to which a substance of the content block (415-1 to 415-17) is relevant to one of a plurality of predefined categories.
摘要:
Semantically ranking content in a website (110) with a computerized ranking device (105) includes: parsing content from the website (110) into multiple autonomous content blocks (415-1 to 415-17) with the computerized ranking device (105) and assigning an importance ranking with said computerized ranking device (105) to each of the content blocks (415-1 to 415-17) based on a degree to which a substance of the content block (415-1 to 415-17) is relevant to one of a plurality of predefined categories.
摘要:
A method for producing web page content includes identifying blocks within a web page. The blocks are selectively assembled into sections. The sections are selectively assembled into article candidates. An article candidate that includes article content is distinguished from article candidates that do not include article content. Content is produced only from the article candidate distinguished as including article content.
摘要:
A method for producing web page content includes identifying blocks within a web page. The blocks are selectively assembled into sections. The sections are selectively assembled into article candidates. An article candidate that includes article content is distinguished from article candidates that do not include article content. Content is produced only from the article candidate distinguished as including article content.
摘要:
A computer-implemented method for obtaining the rendering co-ordinates of visible text elements on a web page is disclosed. The web page is represented by an input data structure comprising a plurality of text nodes, each of which represents a text element on the web page. The method comprises the following steps: a) using a computer device, wrapping each of the plurality of text nodes in a pair of mark-up language tags; b) using said computer device, obtaining the co-ordinates of a bounding rectangle for each text node using the mark-up language tags; c) using said computer device, attaching an attribute specifying the co-ordinates of the bounding rectangle to each text node; and d) using said computer device, determining whether each text node is invisible, and if it is, excluding it from an output data structure comprising the plurality of text nodes and attached attributes.
摘要:
A method of grouping a plurality of media content is provided. The method includes converting at least a portion of the media content into at least one document object model (“DOM”) using a processor. The DOM can include a plurality of block elements, each comprising at least one content object. The method includes apportioning the content objects into a relevant portion and an irrelevant portion and extracting a set of keywords, the set comprising at least one keyword, within the relevant portion of the content objects. The method includes apportioning the relevant portion of the content objects into a related portion and an unrelated portion using at least a portion of the set of keywords and grouping the related portion of the content to provide a group of related content.
摘要:
A method of grouping a plurality of media content is provided. The method includes converting at least a portion of the media content into at least one document object model (“DOM”) using a processor. The DOM can include a plurality of block elements, each comprising at least one content object. The method includes apportioning the content objects into a relevant portion and an irrelevant portion and extracting a set of keywords, the set comprising at least one keyword, within the relevant portion of the content objects. The method includes apportioning the relevant portion of the content objects into a related portion and an unrelated portion using at least a portion of the set of keywords and grouping the related portion of the content to provide a group of related content.
摘要:
The present disclosure generally relates to systems and methods for image data processing. In certain embodiments, an image processing pipeline may collect statistics associated with fixed pattern noise of image data by receiving a first frame of the image data comprising a plurality of pixels. The image processing pipeline may then determine a sum of a first plurality of pixel values that correspond to at least a first portion of the plurality of pixels such that each pixel in at least the first portion of the plurality of pixels is disposed along a first axis within the frame of the image data. After determining the sum of the first plurality of pixel values, the image processing pipeline may store the sum of the first plurality of pixel values in a memory such that the sum of the first plurality of pixel values represent the statistics.
摘要:
A system and a method for near-duplicate image detection performed by a physical computing system includes applying a feature determining function to a number of images, a feature being defined by a geometric shape, comparing characteristics of said geometric shapes defining said features from at least two of said number of images, and characterizing said at least two of said number of images as a near-duplicate match if a predetermined percentage of said features of said at least two images match.
摘要:
Systems and methods for processing raw image data are provided. One example of such a system may include memory to store image data in raw format from a digital imaging device and an image signal processor to process the image data. The image signal processor may include data conversion logic and a raw image processing pipeline. The data conversion logic may convert the image data into a signed format to preserve negative noise from the digital imaging device. The raw image processing pipeline may at least partly process the image data in the signed format. The raw image processing pipeline may also include, among other things, black level compensation logic, fixed pattern noise reduction logic, temporal filtering logic, defective pixel correction logic, spatial noise filtering logic, lens shading correction logic, and highlight recovery logic.