-
Publication number: US09753901B1
Publication date: 2017-09-05
Application number: US13890851
Filing date: 2013-05-09
Applicant: Google Inc.
Inventor: Yifan Xu, Xiaofeng Mi
CPC classification number: G06F17/2247, G06F17/30477, G06F17/30861, G06F17/30864
Abstract: Systems and techniques are provided for detecting columns of an electronic page based on a render of the electronic page and identification of one or more columns based on the render. A column of interest may be identified from among the detected columns based on the column's physical position, its size, and/or the content within it. The column of interest may be used to index or categorize the electronic page, because the most relevant information corresponding to the page may be contained in that column.
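The selection step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes the render has already produced one bounding box per detected column, and the `Column` type, the `pick_column_of_interest` function, and the equal weighting of position, size, and content are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """One detected column (hypothetical shape; units are rendered pixels)."""
    x: float          # left edge of the column
    width: float
    height: float
    text_chars: int   # number of text characters rendered inside the column


def pick_column_of_interest(columns, page_width):
    """Return the column with the best combined position/size/content score.

    Assumes `columns` is non-empty. Each of the three signals is scaled to
    [0, 1] and summed with equal weight, which is an arbitrary assumption.
    """
    total_area = sum(c.width * c.height for c in columns) or 1.0
    total_text = sum(c.text_chars for c in columns) or 1

    def score(c):
        center = c.x + c.width / 2
        # 1.0 when the column is centered on the page, 0.0 at either edge.
        position = 1.0 - abs(center - page_width / 2) / (page_width / 2)
        size = (c.width * c.height) / total_area
        content = c.text_chars / total_text
        return position + size + content

    return max(columns, key=score)


columns = [
    Column(x=0, width=200, height=800, text_chars=300),      # left sidebar
    Column(x=200, width=600, height=2000, text_chars=5000),  # main body
    Column(x=800, width=200, height=600, text_chars=100),    # right rail
]
main = pick_column_of_interest(columns, page_width=1000)
```

In this example the wide, centered, text-heavy column is selected; an indexer could then restrict itself to the text inside `main`.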
-
Publication number: US20150213118A1
Publication date: 2015-07-30
Application number: US14168649
Filing date: 2014-01-30
Applicant: Google Inc.
CPC classification number: G06F17/2247, G06F17/248, G06F17/30864
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining summary content for resources in a domain. In one aspect, a method includes accessing a first resource belonging to a particular domain, selecting an anchor in the first resource linking to a second resource belonging to the particular domain, identifying particular text content in the first resource that is subordinate to the anchor, determining that the second resource includes the particular text content that is subordinate to the anchor, based on determining that the second resource includes the particular text content that is subordinate to the anchor, generating a domain template for the particular domain, the domain template specifying a location of the particular text content in the second resource, and determining, for each respective resource belonging to the particular domain having a structure matching the domain template, respective text content for the respective resource.
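The domain-template idea in the abstract above can be sketched as follows. This is a simplified illustration, not the claimed method: it assumes resources are well-formed XML-style markup, represents the template as a list of child indices leading to the element that holds the text, and treats "structure matching the template" as "that path exists". The function names `build_domain_template` and `extract_at` are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_domain_template(second_resource, subordinate_text):
    """Locate `subordinate_text` in the linked (second) resource.

    Returns a list of child indices from the root to the first element
    whose text contains the snippet, or None if it is not found.
    """
    def walk(node, path):
        if node.text and subordinate_text in node.text:
            return path
        for i, child in enumerate(node):
            found = walk(child, path + [i])
            if found is not None:
                return found
        return None

    return walk(ET.fromstring(second_resource), [])


def extract_at(resource, template):
    """Apply the template to another resource in the same domain.

    Returns the text at the templated location, or None when the
    resource's structure does not contain that path.
    """
    node = ET.fromstring(resource)
    for i in template:
        children = list(node)
        if i >= len(children):
            return None
        node = children[i]
    return (node.text or "").strip()


# Text that appeared under the anchor in the first resource (assumed input).
snippet = "Summary of B"
page_b = "<html><body><h1>Title B</h1><div><p>Summary of B</p></div></body></html>"
page_c = "<html><body><h1>Title C</h1><div><p>Summary of C</p></div></body></html>"

template = build_domain_template(page_b, snippet)
summary_c = extract_at(page_c, template)
```

Here the template learned from one linked page is reused to pull the summary text from a sibling page with the same layout. Real pages would need a tolerant HTML parser and a stricter structure check than this path lookup.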
-