Clustering Web Pages on a Search Engine Results Page
    1.
    Type: Patent application
    Status: Granted

    Publication number: US20130041877A1

    Publication date: 2013-02-14

    Application number: US13205809

    Filing date: 2011-08-09

    IPC classification: G06F17/30

    Abstract: Methods, systems, and media are provided for delivering clustered search results for recent and non-recent events by maintaining the identification (ID) numbers of the respective clustered documents beyond the “fresh” life span of the clustered documents. When clusters are formed according to similar content, an ID number and associated attributes are assigned to each of the clusters. This provides a mechanism to track and retrieve the respective clusters for subsequent delivery of search results. The respective ID numbers of the clusters are maintained, even after the documents are no longer considered “fresh.” These similar-content clusters are further subdivided according to publication date. This provides individual subdivided clusters for similar-content events that occurred in different time spans, which are delivered along with individual non-clustered search results in a SERP.
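    Illustration: the abstract describes clusters that keep a stable ID and are subdivided by publication date. The sketch below is one possible reading, not the patented method. The names `Cluster`, `ClusterStore`, and `max_gap_days` are hypothetical, and a plain topic string stands in for whatever content-similarity measure an actual system would use.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from itertools import count


@dataclass
class Cluster:
    cluster_id: int                            # stable ID, never expired
    topic: str                                 # stand-in for "similar content"
    docs: list = field(default_factory=list)   # (url, published) pairs


class ClusterStore:
    """Hypothetical store that keeps cluster IDs regardless of document age."""

    def __init__(self, max_gap_days=30):
        self._ids = count(1)
        self.clusters = {}
        self.max_gap = timedelta(days=max_gap_days)

    def add(self, topic, url, published):
        # Join an existing cluster only if the topic matches AND the
        # publication date is close to a document already in it; otherwise
        # the same topic gets a separate, date-specific cluster.
        for cluster in self.clusters.values():
            if cluster.topic == topic and any(
                abs(published - d) <= self.max_gap for _, d in cluster.docs
            ):
                cluster.docs.append((url, published))
                return cluster.cluster_id
        cluster_id = next(self._ids)
        self.clusters[cluster_id] = Cluster(cluster_id, topic, [(url, published)])
        return cluster_id

    def lookup(self, cluster_id):
        # IDs are never deleted, so old clusters remain retrievable.
        return self.clusters[cluster_id]


store = ClusterStore()
first = store.add("earthquake", "http://example.com/a", date(2011, 3, 11))
second = store.add("earthquake", "http://example.com/b", date(2011, 3, 12))
older = store.add("earthquake", "http://example.com/c", date(2010, 1, 12))
```

    Here the two March 2011 documents share one cluster ID, while the January 2010 document on the same topic lands in its own cluster that stays retrievable by ID.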


    ENHANCING FRESHNESS OF SEARCH RESULTS
    2.
    Type: Patent application
    Status: Granted

    Publication number: US20110295844A1

    Publication date: 2011-12-01

    Application number: US12789020

    Filing date: 2010-05-27

    IPC classification: G06F17/30

    Abstract: Methods, systems, and computer-storage media for improving the freshness, or the apparent freshness, of search results are described. In an embodiment, a first portion of the search results presented on a search results page is based on responsiveness to the search query, and a second portion of the results describes only recently published documents that are responsive to the search query. In an embodiment, a more recent version of the document, which is not directly used to determine responsiveness, is used to build the caption for a search result. Another way to make search results appear fresh is to include a publication time within the search result caption. In one embodiment, the publication time is generated by calculating a point in time between when a document is first added to a search index and the previous time the search engine visited the site where the document was found.
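    Illustration: the last sentence of the abstract describes estimating a publication time as a point between the previous crawl visit and the time the document was first indexed. The sketch below assumes the midpoint by default; the function name and the `fraction` parameter are hypothetical, since the abstract does not say which point in the interval is chosen.

```python
from datetime import datetime


def estimate_publication_time(previous_visit, first_indexed, fraction=0.5):
    """Pick a point between the previous site visit and first indexing.

    fraction=0.0 returns previous_visit, 1.0 returns first_indexed, and the
    default 0.5 (an assumption) returns the midpoint.
    """
    if first_indexed < previous_visit:
        raise ValueError("first_indexed must not precede previous_visit")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return previous_visit + (first_indexed - previous_visit) * fraction


previous_visit = datetime(2010, 5, 27, 8, 0)
first_indexed = datetime(2010, 5, 27, 12, 0)
estimate = estimate_publication_time(previous_visit, first_indexed)
```

    The document cannot have existed at the previous visit and must have existed by the time it was indexed, so any estimate inside that interval is off by at most the interval's length.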


    UPDATING A SEARCH INDEX USING REPORTED BROWSER HISTORY DATA
    3.
    Type: Patent application
    Status: Granted

    Publication number: US20120150831A1

    Publication date: 2012-06-14

    Application number: US12964092

    Filing date: 2010-12-09

    IPC classification: G06F17/30

    Abstract: Methods, systems, and computer-readable media are provided for updating a search index with new uniform resource locators (URLs) and spiking URLs with increased user interest. History data that indicates the URLs accessed by users, provided by browser applications residing on the users' computers, is parsed to identify new/previously unknown URLs. The history data also indicates URLs in which there is increased interest, based on a number of recent hits as compared to an average number of hits determined over time. Author postings of new URLs to social networking sites and a quality rating of the authors may also be used to identify and filter new URLs. Search indexes are updated with the new and spiking URL data. As such, the lag time between the posting of new URLs or the spiking of URL interest and the inclusion of this data in a search index is greatly decreased.
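    Illustration: the abstract describes two signals taken from reported browser history, previously unknown URLs and URLs whose recent hits exceed their long-run average. A minimal sketch of that comparison follows. The function name and the `spike_ratio` and `min_hits` thresholds are hypothetical; the abstract gives no concrete values.

```python
from collections import Counter


def find_new_and_spiking(known_urls, baseline_avg, recent_hits,
                         spike_ratio=3.0, min_hits=5):
    """Split reported history into new URLs and spiking known URLs.

    known_urls   -- set of URLs already in the index
    baseline_avg -- dict of URL -> average hits per period over time
    recent_hits  -- iterable of URLs reported for the latest period
    A known URL with no baseline entry is treated as having average 0.
    """
    counts = Counter(recent_hits)
    new_urls = sorted(url for url in counts if url not in known_urls)
    spiking = sorted(
        url for url, hits in counts.items()
        if url in known_urls
        and hits >= min_hits
        and hits >= spike_ratio * baseline_avg.get(url, 0.0)
    )
    return new_urls, spiking


known = {"http://example.com/a", "http://example.com/b"}
baseline = {"http://example.com/a": 2.0, "http://example.com/b": 10.0}
recent = (["http://example.com/a"] * 8
          + ["http://example.com/b"] * 12
          + ["http://example.com/c"])
new_urls, spiking = find_new_and_spiking(known, baseline, recent)
```

    With these inputs, `/c` is reported as new, `/a` is spiking (8 hits against an average of 2), and `/b` is not (12 hits against an average of 10).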


    CONTENT SIGNATURE NOTIFICATION
    4.
    Type: Patent application
    Status: Granted

    Publication number: US20120047121A1

    Publication date: 2012-02-23

    Application number: US12861788

    Filing date: 2010-08-23

    IPC classification: G06F17/30

    Abstract: A client application installed on end user computers generates metadata from the content of web pages visited by end users and provides the metadata to a search engine. When an end user visits a web page, the end user's computer downloads and displays the web page to the end user. The client application may simultaneously access the web page content and generate this metadata in the form of a content signature of the web page from the web page content. The client application then provides the content signature to a search engine. The search engine may employ content signatures to identify new web pages to crawl and index. Additionally, the search engine may employ content signatures to identify changes to web pages and determine the crawl frequency of web pages.
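    Illustration: the abstract describes a client computing a content signature and a search engine using it to spot new or changed pages. The sketch below assumes the signature is a SHA-256 hash of whitespace-normalized, lowercased text; the abstract does not specify the signature scheme, and `content_signature` and `SignatureIndex` are hypothetical names.

```python
import hashlib


def content_signature(page_text):
    """Client side: hash the normalized page text (assumed scheme)."""
    normalized = " ".join(page_text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SignatureIndex:
    """Server side: classify each reported signature for a URL."""

    def __init__(self):
        self._latest = {}         # URL -> most recently reported signature
        self.change_counts = {}   # URL -> number of observed content changes

    def report(self, url, signature):
        previous = self._latest.get(url)
        self._latest[url] = signature
        if previous is None:
            return "new"          # candidate for a first crawl
        if previous != signature:
            self.change_counts[url] = self.change_counts.get(url, 0) + 1
            return "changed"      # candidate for a re-crawl
        return "unchanged"


index = SignatureIndex()
url = "http://example.com/news"
r1 = index.report(url, content_signature("Breaking  news:\nstorm arrives"))
r2 = index.report(url, content_signature("breaking news: storm arrives"))
r3 = index.report(url, content_signature("Breaking news: storm passes"))
```

    A running count such as `change_counts` is one simple input a scheduler could use to crawl frequently changing pages more often.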
