Extraction of Content from a Web Page
    1.
    Patent application
    Extraction of Content from a Web Page (status: under examination, published)

    Publication No.: US20130283148A1

    Publication date: 2013-10-24

    Application No.: US13817656

    Filing date: 2010-10-26

    IPC classification: G06F17/22

    CPC classification: G06F17/2247 G06F16/986

    Abstract: A system and method are provided for extracting main content from a web page. Web page segmentation is performed on a web page to provide affinity-grouped segments. Descriptive features of at least one of the affinity-grouped segments are computed. At least one of the affinity-grouped segments is classified as a main body segment based on the computed descriptive features. Additional affinity-grouped segments are classified as to a document function based on the computed descriptive features. Classified affinity-grouped segments are assembled according to their classified document functions to provide the main content.
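
    The pipeline in this abstract (segments, descriptive features, classification by document function, assembly) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the Segment fields, the three features, the scoring rule, the thresholds, and the label names are all hypothetical, and the segmentation step itself is taken as already done.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One affinity-grouped segment (hypothetical representation)."""
    text: str
    link_chars: int = 0       # characters of the text that sit inside hyperlinks
    font_size: float = 12.0   # dominant font size of the segment


def features(seg):
    """Compute a few descriptive features for one segment."""
    chars = max(len(seg.text), 1)
    return {
        "words": len(seg.text.split()),
        "link_density": seg.link_chars / chars,
        "font_size": seg.font_size,
    }


def classify(segments):
    """Label each segment with a document function using simple rules."""
    if not segments:
        return []
    feats = [features(s) for s in segments]
    # Assumed rule: the main body is the longest segment after discounting links.
    scores = [f["words"] * (1 - f["link_density"]) for f in feats]
    main_idx = max(range(len(segments)), key=scores.__getitem__)
    labels = []
    for i, f in enumerate(feats):
        if i == main_idx:
            labels.append("body")
        elif f["link_density"] > 0.5:
            labels.append("navigation")
        elif f["font_size"] > feats[main_idx]["font_size"] and f["words"] <= 20:
            labels.append("title")
        else:
            labels.append("other")
    return labels


def assemble(segments, order=("title", "body")):
    """Join the segments whose function is in `order`, in that order."""
    labels = classify(segments)
    parts = []
    for function in order:
        parts.extend(s.text for s, label in zip(segments, labels) if label == function)
    return "\n\n".join(parts)
```

    Calling assemble() on a list of Segment objects returns the title text followed by the body text, leaving out segments labeled as navigation or other. A real system would likely replace the hand-written rules in classify() with a trained classifier.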

    Selective content extraction
    2.
    Granted patent
    Selective content extraction (status: in force)

    Publication No.: US09032285B2

    Publication date: 2015-05-12

    Application No.: US13378153

    Filing date: 2009-06-30

    IPC classification: G06F17/00 G06F17/30

    CPC classification: G06F17/30905

    Abstract: A method for extracting web content includes detecting, within a web page, a hierarchical structure that includes a plurality of nodes. Potential article nodes from the plurality of nodes are identified. The identified potential article node with a highest rank in the hierarchical structure is identified as an article node. Content is extracted from the article node.
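
    The steps in this abstract can be sketched with Python's standard html.parser module. Two points are assumptions made for illustration and are not stated in the abstract: a node counts as a potential article node when it has at least min_paragraphs direct <p> children, and "highest rank" is read as "closest to the root of the tree". The names Node, TreeBuilder, and extract_article are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not become the open node.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """One element of the page hierarchy."""
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.items = []  # child Nodes and text strings, in document order
        self.depth = 0 if parent is None else parent.depth + 1

    def text(self):
        parts = [i if isinstance(i, str) else i.text() for i in self.items]
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Detect the hierarchical structure of the page as a tree of Nodes."""
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.items.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore stray closes.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.items.append(data.strip())


def iter_nodes(node):
    yield node
    for item in node.items:
        if isinstance(item, Node):
            yield from iter_nodes(item)


def extract_article(html, min_paragraphs=2):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    # Potential article nodes (assumed heuristic: enough direct <p> children).
    candidates = [
        n for n in iter_nodes(builder.root)
        if sum(isinstance(c, Node) and c.tag == "p" for c in n.items) >= min_paragraphs
    ]
    if not candidates:
        return ""
    # Assumed reading of "highest rank": the candidate nearest the root.
    article = min(candidates, key=lambda n: n.depth)
    return article.text()
```

    When one candidate is nested inside another, the outer one wins, so the extracted text includes the inner block (for example a quoted passage inside the article) while sibling blocks such as navigation are left out.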

    SELECTIVE CONTENT EXTRACTION
    3.
    Patent application
    SELECTIVE CONTENT EXTRACTION (status: in force)

    Publication No.: US20120089903A1

    Publication date: 2012-04-12

    Application No.: US13378153

    Filing date: 2009-06-30

    IPC classification: G06F17/00

    CPC classification: G06F17/30905

    Abstract: A method for extracting web content includes detecting, within a web page, a hierarchical structure that includes a plurality of nodes. Potential article nodes from the plurality of nodes are identified. The identified potential article node with a highest rank in the hierarchical structure is identified as an article node. Content is extracted from the article node.

    Producing marketing items for a marketing campaign
    4.
    Patent application
    Producing marketing items for a marketing campaign (status: in force)

    Publication No.: US20070022003A1

    Publication date: 2007-01-25

    Application No.: US11184098

    Filing date: 2005-07-19

    IPC classification: G06Q30/00

    Abstract: Methods, machines, systems and machine-readable instructions for producing marketing items are described. In one aspect, a user is prompted to specify campaign parameters, including one or more campaign topics, defining a scope of the campaign. The user is prompted to specify for each of the one or more campaign topics a corresponding set of one or more attributes of intended recipients of the marketing campaign. The one or more specified campaign topics are associated to respective sets of targeted recipients selected from a database of records of potential recipients based on mappings of the respective sets of recipient attributes to the campaign topics and the specified campaign parameters defining the scope of the marketing campaign. For each of the targeted recipients, a respective marketing item containing a respective set of one or more contents matched to the campaign topic associated to the targeted recipient is composed.
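
    The flow in this abstract (scope parameters, per-topic recipient attributes, recipient selection, item composition) can be sketched as follows. The record layout, the exact-match rule for attributes, and the function names select_recipients and compose_items are hypothetical choices made for illustration and are not taken from the patent.

```python
def matches(record, wanted):
    """True when the record has every wanted attribute with the wanted value."""
    return all(record.get(key) == value for key, value in wanted.items())


def select_recipients(records, scope, topic_attributes):
    """Map each campaign topic to the in-scope records that match its attributes."""
    in_scope = [r for r in records if matches(r, scope)]
    return {
        topic: [r for r in in_scope if matches(r, wanted)]
        for topic, wanted in topic_attributes.items()
    }


def compose_items(targeted, contents):
    """Compose one marketing item per targeted recipient and topic."""
    items = []
    for topic, recipients in targeted.items():
        for recipient in recipients:
            items.append({
                "to": recipient["name"],
                "topic": topic,
                "contents": list(contents.get(topic, [])),
            })
    return items
```

    Here scope stands in for the campaign parameters (for example a region), topic_attributes maps each topic to the recipient attributes the user specified, and contents maps each topic to the material to include in the item.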