Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes identifying a plurality of registered publishers for enriched search results and, for each registered publisher, obtaining enrichment information from the registered publisher and associating the enrichment information with a resource provided by the publisher. A query is received. A plurality of responsive resources that are responsive to the query are identified. A first responsive resource is determined to be associated with enrichment information. An enriched search result is provided, the enriched search result identifying the first responsive resource and including the first responsive resource's associated enrichment information.
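The flow described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory representation: each registered publisher is a dict with an `enrichment` mapping from resource URL to enrichment information, and responsive resources are given as a list of URLs. None of these names come from the abstract.

```python
def build_enrichment_index(publishers):
    """Associate each publisher's enrichment information with its resources.

    `publishers` is a hypothetical list of dicts, each with an "enrichment"
    mapping of resource URL -> enrichment information.
    """
    index = {}
    for publisher in publishers:
        for url, info in publisher["enrichment"].items():
            index[url] = info
    return index


def enriched_results(responsive_urls, index):
    """Build one search result per responsive resource.

    A result carries an "enrichment" field only when the resource is
    associated with enrichment information in the index.
    """
    results = []
    for url in responsive_urls:
        result = {"url": url}
        if url in index:
            result["enrichment"] = index[url]
        results.append(result)
    return results


publishers = [
    {"name": "example-publisher",
     "enrichment": {"https://example.com/recipe": {"rating": 4.5}}},
]
index = build_enrichment_index(publishers)
results = enriched_results(
    ["https://example.com/recipe", "https://example.org/other"], index
)
```

Here only the first result would include enrichment information; the second is returned as a plain result.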
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating additional content. In one aspect, a method includes identifying one or more central entities, wherein each central entity represents a topic of a first resource being presented in a user interface; generating one or more search queries, each of the one or more search queries being derived from one or more of the central entities; obtaining search results for the one or more search queries from a search engine; selecting resources relevant to the first resource from resources referenced by the obtained search results; generating additional content for presentation in a user interface element of the user interface based on the selected resources; and categorizing the generated additional content into a plurality of categories, wherein each category of additional content is displayed in a separate portion of the user interface element.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying central entities. In one aspect, a method includes obtaining candidate entities for a first resource; filtering a first entity graph whose nodes represent different entities found in a plurality of resources to remove nodes that do not correspond to a candidate entity, wherein pairs of nodes in the filtered first entity graph that are connected by an edge correspond to pairs of candidate entities that are associated with the same resource; generating a second entity graph for the first resource from the filtered first entity graph, wherein the second entity graph does not include nodes from the filtered first entity graph that are not connected to other nodes in the filtered first graph; and identifying candidate entities that are represented by nodes in the second entity graph as being central entities for the first resource.
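The graph steps above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the corpus is a hypothetical dict mapping a resource id to the set of entities found in it, and the graph is stored as an adjacency dict. The function names are invented for the example.

```python
from itertools import combinations


def build_entity_graph(resource_entities):
    """First entity graph: one node per entity, with an edge between two
    entities whenever they are associated with the same resource."""
    graph = {}
    for entities in resource_entities.values():
        for entity in entities:
            graph.setdefault(entity, set())
        for a, b in combinations(sorted(entities), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def central_entities(candidates, graph):
    """Filter the graph to candidate entities, drop unconnected nodes,
    and return the entities that remain."""
    candidates = set(candidates)
    # Filtered first graph: keep candidate nodes and edges between them.
    filtered = {
        node: neighbors & candidates
        for node, neighbors in graph.items()
        if node in candidates
    }
    # Second graph: omit nodes not connected to any other node.
    second = {node: nbrs for node, nbrs in filtered.items() if nbrs}
    return set(second)


corpus = {
    "r1": {"Paris", "France", "Eiffel Tower"},
    "r2": {"Paris", "Texas"},
    "r3": {"Banana"},
}
graph = build_entity_graph(corpus)
central = central_entities({"Paris", "France", "Texas", "Banana"}, graph)
# "Banana" shares no resource with another candidate, so it is dropped.
```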
Abstract:
Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
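As one illustration of the first method (examining internal date references), the sketch below looks for dates in the page text and compares the most recent one with a reference date. The ISO `YYYY-MM-DD` date format, the `max_age_days` threshold, and the function names are all assumptions made for the example; the abstract does not specify how dates are recognized or what age counts as stale.

```python
import re
from datetime import date

# Assumption: only ISO-style dates (YYYY-MM-DD) are recognized.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def latest_date_reference(text):
    """Return the most recent valid date mentioned in the text, or None."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # skip impossible dates such as 2002-13-40
    return max(found) if found else None


def looks_stale(text, today, max_age_days=365):
    """True if the newest date reference is older than max_age_days.

    Returns None when the page has no recognizable date reference,
    since this signal alone cannot decide in that case.
    """
    latest = latest_date_reference(text)
    if latest is None:
        return None
    return (today - latest).days > max_age_days


page = "Updated 2001-03-04. See the event on 2002-13-40 and 2002-01-15."
stale = looks_stale(page, today=date(2004, 1, 1))
```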
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying topical entities. In one aspect, a method includes obtaining a plurality of entities that are associated with a first resource; for one or more of the identified entities, receiving search results for a search query derived from the entity; determining that search results for a search query including a particular entity include a specific type of search results; and determining that the particular entity is a topical entity of the first resource based at least in part on the particular entity appearing in a title or a resource locator of the first resource, wherein the topical entity of the first resource represents a predominant topic of the first resource.
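A minimal sketch of the two checks described above follows. The abstract does not say which "specific type" of search result is meant, so the `result_type` value, the `search` callable (which stands in for a search engine and returns a list of dicts with a `type` key), and the URL slug matching are all hypothetical choices for illustration.

```python
def topical_entities(entities, title, url, search, result_type="special"):
    """Return entities that pass both checks:

    1. search results for a query derived from the entity include at
       least one result of the given type, and
    2. the entity appears in the resource's title or resource locator.
    """
    title_lower = title.lower()
    url_lower = url.lower()
    topical = []
    for entity in entities:
        results = search(entity)  # query derived from the entity
        has_type = any(r.get("type") == result_type for r in results)
        name = entity.lower()
        slug = name.replace(" ", "-")  # assumption: URLs use hyphenated slugs
        in_title_or_url = name in title_lower or slug in url_lower
        if has_type and in_title_or_url:
            topical.append(entity)
    return topical


# Hypothetical canned search results, used in place of a real search engine.
canned = {
    "Mount Everest": [{"type": "special"}, {"type": "web"}],
    "Nepal": [{"type": "special"}],
    "Oxygen": [{"type": "web"}],
}
found = topical_entities(
    ["Mount Everest", "Nepal", "Oxygen"],
    title="Climbing Mount Everest: a guide",
    url="https://example.com/mount-everest-guide",
    search=lambda query: canned.get(query, []),
)
```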
Abstract:
A signal-bearing medium is disclosed that encodes operations including: establishing a link threshold, wherein a web page will be assessed as lacking currency if the percentage of hyperlinks contained in the web page that link to an active page is less than the link threshold; accessing a web page containing hyperlinks; and testing the hyperlinks. Testing includes: selecting a hyperlink; monitoring the number of redirects encountered by following the selected hyperlink until a final web page is reached or a failure occurs; and assessing the selected hyperlink as linking to a dead web page if a redirect limit is exceeded, the redirect limit being greater than one, wherein exceeding the redirect limit causes a failure. The operations also include calculating the percentage of hyperlinks that return active web pages, and comparing that percentage with the link threshold.
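The link test can be sketched as below. To keep the example self-contained, network access is abstracted behind a hypothetical `fetch(url)` callable that returns a `(status_code, redirect_target)` pair without following redirects itself. The default threshold and redirect limit values are illustrative assumptions, as is treating only 2xx responses as active.

```python
REDIRECT_CODES = {301, 302, 303, 307, 308}


def link_is_active(url, fetch, redirect_limit=5):
    """Follow a hyperlink, counting redirects, until a final page is
    reached or a failure occurs. Exceeding the redirect limit is a failure."""
    if redirect_limit <= 1:
        raise ValueError("redirect_limit must be greater than one")
    redirects = 0
    while True:
        try:
            status, target = fetch(url)
        except OSError:
            return False  # connection-level failure: treat as dead
        if status in REDIRECT_CODES and target:
            redirects += 1
            if redirects > redirect_limit:
                return False  # redirect limit exceeded: dead link
            url = target
            continue
        return 200 <= status < 300


def lacks_currency(links, fetch, link_threshold=0.8, redirect_limit=5):
    """True if the fraction of active links is below the link threshold."""
    if not links:
        return False  # assumption: a page without links is not judged here
    active = sum(link_is_active(u, fetch, redirect_limit) for u in links)
    return active / len(links) < link_threshold


# Hypothetical canned responses standing in for real HTTP requests.
responses = {
    "a": (200, None),
    "b": (404, None),
    "c": (302, "a"),        # one redirect, then an active page
    "loop": (302, "loop"),  # redirects forever
}
fake_fetch = lambda url: responses.get(url, (404, None))
verdict = lacks_currency(["a", "b", "c", "loop"], fake_fetch, 0.8, 3)
# Two of four links are active (0.5 < 0.8), so verdict is True.
```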
Abstract:
A focused random walk system produces samples of on-topic pages from a collection of hyper-linked pages such as Web pages. The focused random walk system utilizes a focused random walk to produce a focused sample, which is a random sample of Web pages focused on a topic. The focused random walk system uniformly samples pages iteratively, where each iteration follows a random link from a union of the in-links and out-links of a page. The system then classifies this randomly selected link to determine whether the page is on-topic. The random walk sampling process could comprise a hard-focus method that selects only on-topic pages at each step of the focused random walk, or a soft-focus method that allows limited divergence to off-topic pages.
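The walk can be sketched as follows, under several assumptions: the link graph is a hypothetical dict of out-links, the classifier is a caller-supplied `is_on_topic` predicate, and "limited divergence" is modeled as a cap on consecutive off-topic steps (`max_divergence`, where 0 gives the hard-focus variant). The sketch shows only the link-following and classification loop; it does not attempt to reproduce whatever correction the system applies to make the sample uniform.

```python
import random


def focused_random_walk(out_links, start, is_on_topic, steps,
                        max_divergence=0, seed=0):
    """Walk the graph, at each step picking a random link from the union
    of the current page's in-links and out-links.

    max_divergence=0 is hard focus (never move to an off-topic page);
    a positive value is soft focus (allow that many consecutive
    off-topic steps). Returns the on-topic pages visited, in order.
    """
    rng = random.Random(seed)
    # Build the union of in-links and out-links for every page.
    neighbors = {}
    for src, dsts in out_links.items():
        neighbors.setdefault(src, set())
        for dst in dsts:
            neighbors[src].add(dst)
            neighbors.setdefault(dst, set()).add(src)

    current = start
    off_topic_run = 0
    sample = []
    for _ in range(steps):
        options = sorted(neighbors.get(current, ()))
        if not options:
            break
        candidate = rng.choice(options)
        if is_on_topic(candidate):
            current = candidate
            off_topic_run = 0
            sample.append(candidate)
        elif off_topic_run < max_divergence:
            current = candidate  # soft focus: tolerate a short detour
            off_topic_run += 1
        # Otherwise stay on the current page and try another link.
    return sample


# Tiny hypothetical graph: "x" is off-topic and bridges to "c".
links = {"a": ["b"], "b": ["x"], "x": ["c"], "c": []}
on_topic = lambda page: page != "x"
hard = focused_random_walk(links, "a", on_topic, steps=50)
soft = focused_random_walk(links, "a", on_topic, steps=50, max_divergence=1)
```

With hard focus the walk in this graph can never cross "x", so it only ever samples "a" and "b"; the soft-focus variant is allowed to step through "x" and may therefore reach "c".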