PROVIDING A PARTICULAR TYPE OF UNIFORM RESOURCE LOCATOR
    1.
    Invention application
    Legal status: under examination (published)

    Publication number: US20120246552A1

    Publication date: 2012-09-27

    Application number: US13052622

    Filing date: 2011-03-21

    IPC classification: G06F17/00

    CPC classification: G06F16/951

    Abstract: Disclosed herein are example systems and methods for providing a particular type of uniform resource locator (URL). In one example, a processor identifies webpage source code associated with a list of text that is itself associated with the type of URL. The processor may then identify a URL within the identified webpage source code and provide that URL.

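    The abstract describes matching webpage source code against a list of text tied to a URL type, then extracting a URL from the matching code. A minimal Python sketch of that idea, assuming the URL type is a "print version" link and that the text list is compared against anchor labels. The names `PRINT_LINK_TEXT`, `TypedLinkFinder`, and `find_typed_urls`, the sample page, and the substring matching rule are all hypothetical and are not taken from the application.

```python
from html.parser import HTMLParser

# Hypothetical text list for a hypothetical "print version" URL type.
PRINT_LINK_TEXT = ["print", "printer-friendly", "print this page"]


class TypedLinkFinder(HTMLParser):
    """Collect hrefs of <a> elements whose visible text matches a text list."""

    def __init__(self, text_list):
        super().__init__()
        self.text_list = [t.lower() for t in text_list]
        self.matches = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace and case before comparing (assumed rule).
            label = " ".join("".join(self._text).split()).lower()
            if any(t in label for t in self.text_list):
                self.matches.append(self._href)
            self._href = None


def find_typed_urls(html, text_list):
    """Return hrefs whose anchor label contains any entry of text_list."""
    finder = TypedLinkFinder(text_list)
    finder.feed(html)
    finder.close()
    return finder.matches


page = """
<html><body>
  <a href="/home">Home</a>
  <a href="/article/42?view=print">Printer-friendly version</a>
  <a href="/contact">Contact</a>
</body></html>
"""
urls = find_typed_urls(page, PRINT_LINK_TEXT)
```

    Substring matching is deliberately crude here: a label such as "Blueprint" would also match "print", so a real text list would likely need stricter matching.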

    Content grouping systems and methods
    6.
    Granted patent
    Legal status: lapsed

    Publication number: US08577887B2

    Publication date: 2013-11-05

    Application number: US12639768

    Filing date: 2009-12-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30911

    Abstract: A method of grouping a plurality of media content is provided. The method includes converting at least a portion of the media content into at least one document object model (“DOM”) using a processor. The DOM can include a plurality of block elements, each comprising at least one content object. The method includes apportioning the content objects into a relevant portion and an irrelevant portion and extracting a set of keywords, the set comprising at least one keyword, within the relevant portion of the content objects. The method includes apportioning the relevant portion of the content objects into a related portion and an unrelated portion using at least a portion of the set of keywords and grouping the related portion of the content to provide a group of related content.

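    The grouping steps in the abstract (relevant versus irrelevant, keyword extraction, related versus unrelated) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: it assumes the text of each DOM block element has already been extracted into a string, treats a block as relevant when it has at least `min_words` words, and takes the most frequent non-stopwords as keywords. All function names, thresholds, and sample blocks are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption, not from the patent).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with", "are"}


def words(text):
    """Lowercase alphabetic tokens of a block's text."""
    return re.findall(r"[a-z]+", text.lower())


def split_relevant(blocks, min_words=5):
    """Apportion blocks into (relevant, irrelevant) by word count."""
    relevant = [b for b in blocks if len(words(b)) >= min_words]
    irrelevant = [b for b in blocks if len(words(b)) < min_words]
    return relevant, irrelevant


def extract_keywords(blocks, top_n=2):
    """Return the top_n most frequent non-stopwords across the blocks."""
    counts = Counter(w for b in blocks for w in words(b) if w not in STOPWORDS)
    return {w for w, _ in counts.most_common(top_n)}


def split_related(blocks, keywords):
    """Apportion blocks into (related, unrelated) by keyword overlap."""
    related = [b for b in blocks if keywords & set(words(b))]
    unrelated = [b for b in blocks if not keywords & set(words(b))]
    return related, unrelated


blocks = [
    "Home | About | Contact",
    "Solar panels convert sunlight into electricity for homes and businesses.",
    "New solar panels are cheaper, and solar installers report record demand.",
    "Our bakery offers fresh bread and pastries every morning downtown.",
    "Copyright 2009",
]
relevant, irrelevant = split_relevant(blocks)
keywords = extract_keywords(relevant)
related, unrelated = split_related(relevant, keywords)
```

    In this sample the navigation and copyright blocks fall into the irrelevant portion, the two solar blocks form the related group, and the bakery block is left unrelated.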

    CONTENT GROUPING SYSTEMS AND METHODS
    7.
    Invention application
    Legal status: lapsed

    Publication number: US20110145249A1

    Publication date: 2011-06-16

    Application number: US12639768

    Filing date: 2009-12-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30911

    Abstract: A method of grouping a plurality of media content is provided. The method includes converting at least a portion of the media content into at least one document object model (“DOM”) using a processor. The DOM can include a plurality of block elements, each comprising at least one content object. The method includes apportioning the content objects into a relevant portion and an irrelevant portion and extracting a set of keywords, the set comprising at least one keyword, within the relevant portion of the content objects. The method includes apportioning the relevant portion of the content objects into a related portion and an unrelated portion using at least a portion of the set of keywords and grouping the related portion of the content to provide a group of related content.


    System and Method for Web Content Extraction
    8.
    Invention application
    Legal status: in force

    Publication number: US20120303636A1

    Publication date: 2012-11-29

    Application number: US13258482

    Filing date: 2009-12-14

    IPC classification: G06F17/30

    CPC classification: G06F17/30896 G06F3/1246

    Abstract: A method and system for extracting Web content is disclosed. In one embodiment, Web content in a Webpage is extracted by identifying paragraphs in the Web content based on line-break node determination. A range of text-body associated with the identified paragraphs is then identified using a maximum scoring subsequence. The identified text-body is refined using a heuristic rule of substantially horizontal alignment. One or more titles and one or more images associated with the Web content are also extracted. Finally, the Web content, including the identified paragraphs, the one or more titles, and the one or more images, is output.

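    The abstract names a maximum scoring subsequence as the way to pick the text-body range from the identified paragraphs. A minimal Python sketch of that general technique (the classic linear-time scan for the contiguous run with the highest total score) follows. The scoring rule used here, word count minus a fixed threshold of 10, is an assumption for illustration only; the abstract does not say how paragraphs are scored, and the function name and sample data are hypothetical.

```python
def max_scoring_subsequence(scores):
    """Return (start, end, total) for the contiguous run with the highest sum.

    `end` is exclusive. If no run has a positive total (including an empty
    input), the empty run (0, 0, 0) is returned.
    """
    best_start = best_end = 0
    best_total = 0
    run_start = 0
    run_total = 0
    for i, score in enumerate(scores):
        if run_total <= 0:
            # A non-positive prefix can never help; restart the run here.
            run_start = i
            run_total = score
        else:
            run_total += score
        if run_total > best_total:
            best_start, best_end, best_total = run_start, i + 1, run_total
    return best_start, best_end, best_total


# Hypothetical per-paragraph word counts: short navigation-like paragraphs
# surround a longer article body.
word_counts = [3, 2, 40, 25, 4, 30, 2, 1]
scores = [count - 10 for count in word_counts]  # assumed scoring rule
start, end, total = max_scoring_subsequence(scores)
```

    With these scores the selected range covers paragraphs 2 through 5 (zero-based, `end` exclusive at 6), so the short paragraph at index 4 is kept because the long paragraphs on either side outweigh its negative score.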

    System and method for web content extraction
    9.
    Granted patent
    Legal status: in force

    Publication number: US08819028B2

    Publication date: 2014-08-26

    Application number: US13258482

    Filing date: 2009-12-14

    IPC classification: G06F17/30 G06F3/12

    CPC classification: G06F17/30896 G06F3/1246

    Abstract: A method and system for extracting Web content is disclosed. In one embodiment, Web content in a Webpage is extracted by identifying paragraphs in the Web content based on line-break node determination. A range of text-body associated with the identified paragraphs is then identified using a maximum scoring subsequence. The identified text-body is refined using a heuristic rule of substantially horizontal alignment. One or more titles and one or more images associated with the Web content are also extracted. Finally, the Web content, including the identified paragraphs, the one or more titles, and the one or more images, is output.


    SYSTEMS AND METHODS FOR ADDING COMMERCIAL CONTENT TO PRINTOUTS
    10.
    Invention application
    Legal status: under examination (published)

    Publication number: US20150138605A1

    Publication date: 2015-05-21

    Application number: US13821356

    Filing date: 2010-09-21

    IPC classification: G06Q30/02 G06F3/12 G06K15/02

    Abstract: Systems, devices and methods are provided which relate to detecting a print command on a client computer, the print command reflecting an interest in printing content of an electronic document, accessible by the client computer, as a hard copy printout. One method includes analyzing the electronic document content to determine its underlying subject matter, identifying commercial content relevant to the underlying subject matter, and creating and formatting a new, printable document that includes the electronic document content and the identified commercial content.
