Page selection for indexing
    1.
    Type: Granted invention patent
    Status: In force

    Publication number: US08645288B2

    Publication date: 2014-02-04

    Application number: US12959060

    Filing date: 2010-12-02

    IPC classification: G06F15/18

    CPC classification: G06F17/30873 G06F17/30867

    Abstract: Some implementations provide techniques for selecting web pages for inclusion in an index. For example, some implementations apply regularization to select a subset of the crawled web pages for indexing based on link relationships between the crawled web pages, features extracted from the crawled web pages, and user behavior information determined for at least some of the crawled web pages. Further, in some implementations, the user behavior information may be used to sort a training set of crawled web pages into a plurality of labeled groups. The labeled groups may be represented in a directed graph that indicates relative priorities for being selected for indexing.

    Online Advertisement Perception Prediction
    2.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20130097011A1

    Publication date: 2013-04-18

    Application number: US13273924

    Filing date: 2011-10-14

    IPC classification: G06Q30/02

    CPC classification: G06Q30/02

    Abstract: An advertisement perception predictor may forecast the effectiveness of an online advertisement in a web page by predicting whether the online advertisement may be perceived by a consumer. The advertisement perception predictor may use a perception model that is trained for determining perception probability values of online advertisements. The perception model may be applied to an online advertisement to determine a perception probability value for the online advertisement. The perception probability value may indicate the likelihood that a consumer is likely to view the online advertisement.

    Search Engine Menu-based Advertising
    3.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20130173398A1

    Publication date: 2013-07-04

    Application number: US13340195

    Filing date: 2011-12-29

    IPC classification: G06Q30/02

    CPC classification: G06Q30/0256

    Abstract: Implementations for providing menu-based advertising are disclosed. A search engine front-end determines non-search engine information pages that are relevant to the user input based on user input entered into a search query field on a search page. A suggestion menu is caused to be displayed on a search page. The suggestion menu includes interactive elements that are interactive to cause a client device to retrieve the non-search engine information pages associated with the interactive elements. The interactive elements may be advertisements, and the suggestion menu may also be used to display search query suggestions.

    Data caching for distributed execution computing
    4.
    Type: Granted invention patent
    Status: In force

    Publication number: US08229968B2

    Publication date: 2012-07-24

    Application number: US12055777

    Filing date: 2008-03-26

    IPC classification: G06F7/00 G06F17/30

    Abstract: Embodiments for caching and accessing Directed Acyclic Graph (DAG) data to and from a computing device of a DAG distributed execution engine during the processing of an iterative algorithm. In accordance with one embodiment, a method includes processing a first subgraph of the plurality of subgraphs from the distributed storage system in the computing device. The first subgraph being processed with associated input values in the computing device to generate first output values in an iteration. The method further includes storing a second subgraph in a cache of the device. The second subgraph being a duplicate of the first subgraph. Moreover, the method also includes processing the second subgraph with the first output values to generate second output values if the device is to process the first subgraph in each of one or more subsequent iterations.
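
As a rough illustration of the caching idea in this abstract, the Python sketch below keeps a local duplicate of a subgraph after its first fetch, so that later iterations reuse the cached copy with the previous iteration's output values. `FakeDistributedStore`, `Worker`, `run`, and the averaging update rule are hypothetical stand-ins chosen for the example; none of them comes from the patent.

```python
from typing import Dict, List, Tuple

Subgraph = Dict[str, List[str]]  # vertex -> list of in-neighbour vertices


class FakeDistributedStore:
    """Hypothetical stand-in for a distributed storage system; counts reads."""

    def __init__(self, subgraphs: Dict[str, Subgraph]) -> None:
        self._subgraphs = subgraphs
        self.reads = 0

    def fetch(self, key: str) -> Subgraph:
        self.reads += 1
        # Return a copy, as a remote read would.
        return {v: list(ns) for v, ns in self._subgraphs[key].items()}


class Worker:
    """Processes one subgraph per iteration, caching it after the first fetch."""

    def __init__(self, store: FakeDistributedStore) -> None:
        self.store = store
        self.cache: Dict[str, Subgraph] = {}

    def load(self, key: str) -> Subgraph:
        # Only the first request goes to the store; later ones hit the cache.
        if key not in self.cache:
            self.cache[key] = self.store.fetch(key)
        return self.cache[key]

    def iterate(self, key: str, values: Dict[str, float]) -> Dict[str, float]:
        graph = self.load(key)
        # Illustrative update (an assumption): each vertex takes the mean of
        # its in-neighbours' values; vertices without in-neighbours keep theirs.
        return {
            v: (sum(values[n] for n in ns) / len(ns)) if ns else values[v]
            for v, ns in graph.items()
        }


def run(iterations: int) -> Tuple[Dict[str, float], int]:
    store = FakeDistributedStore({"part-0": {"a": ["b"], "b": ["a", "c"], "c": []}})
    worker = Worker(store)
    values = {"a": 1.0, "b": 0.0, "c": 2.0}
    for _ in range(iterations):
        # The output values of one iteration are the input of the next.
        values = worker.iterate("part-0", values)
    return values, store.reads
```

Calling `run(3)` performs three iterations while reading the subgraph from the store only once; a real engine would also need a policy for cache eviction and for deciding which worker processes which subgraph, which this sketch leaves out.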

    User information needs based data selection
    5.
    Type: Granted invention patent
    Status: In force

    Publication number: US09589056B2

    Publication date: 2017-03-07

    Application number: US13080510

    Filing date: 2011-04-05

    IPC classification: G06F17/30

    Abstract: Techniques for determining user information needs and selecting data based on user information needs are described herein. The present disclosure describes extracting topics of interests to users from multiple sources including search log data and social network website, and assigns a budget to each topic to stipulate the quota of data to be selected for each topic. The present disclosure also describes calculating similarities between gathered data and the topics, and selecting top related data with each topic subject to limit of the budget. A search engine may use the techniques described here to select data for its index.
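
The selection step in this abstract (score gathered data against each topic, then keep the top matches up to the topic's budget) can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not name a similarity measure or say how budgets are assigned, so bag-of-words cosine similarity and hand-set budgets are used here purely for illustration, and `cosine` and `select_by_budget` are hypothetical helper names.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_by_budget(
    topics: Dict[str, str], budgets: Dict[str, int], documents: Dict[str, str]
) -> Dict[str, List[str]]:
    """For each topic, keep the most similar documents, up to that topic's budget."""
    doc_vecs = {d: Counter(text.lower().split()) for d, text in documents.items()}
    selected: Dict[str, List[str]] = {}
    for topic, description in topics.items():
        topic_vec = Counter(description.lower().split())
        scored = [(cosine(topic_vec, vec), d) for d, vec in doc_vecs.items()]
        # Highest similarity first; ties broken by document id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        related = [d for score, d in scored if score > 0]
        selected[topic] = related[: budgets[topic]]
    return selected


# Toy inputs (assumed): topic descriptions, per-topic budgets, gathered documents.
topics = {"football": "football match score", "cooking": "recipe cooking oven"}
budgets = {"football": 2, "cooking": 1}
documents = {
    "d1": "football match report and final score",
    "d2": "oven baked bread recipe",
    "d3": "football transfer news",
    "d4": "slow cooking recipe for the oven",
    "d5": "stock market update",
}
result = select_by_budget(topics, budgets, documents)
```

With these toy inputs, `d5` matches no topic and is never selected, and the cooking topic keeps only its single best match because its budget is 1.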

    EFFICIENT QUERY CLUSTERING USING MULTI-PARTITE GRAPHS
    6.
    Type: Invention patent application
    Status: In force

    Publication number: US20120259850A1

    Publication date: 2012-10-11

    Application number: US13083353

    Filing date: 2011-04-08

    IPC classification: G06F17/30

    CPC classification: G06F17/30705 G06F17/30864

    Abstract: Efficient search query clustering using tripartite graphs may enable a search engine developer to model information needs of users while expending less computing resources. The efficient clustering of search queries may involve multiple computing devices receiving a subgraph of a multi-partite graph that encompasses search queries, as well as receiving a global center vector table that includes cluster center entries for query clusters. At each computing device, the received global center vector table may be filtered to eliminate one or more cluster center entries that are irrelevant to the search queries. Subsequently, the search queries may be clustered into the query clusters by at least using the filtered global center vector table at each of the computing devices. In some instances, one or more comparisons between search queries and the cluster center entries in the global center vector table during the clustering may be eliminated.
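
To make the filtering step in this abstract concrete, the Python sketch below drops cluster centers that share no feature with any query held on a device, then assigns each query to its best remaining center while skipping comparisons with centers it does not overlap. The sparse-vector representation, the dot-product score, the reading of "irrelevant" as "no shared feature", and the names `filter_centers`, `dot`, and `assign` are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict

Vector = Dict[str, float]  # sparse vector: feature (e.g. a clicked URL) -> weight


def filter_centers(
    queries: Dict[str, Vector], centers: Dict[int, Vector]
) -> Dict[int, Vector]:
    """Drop cluster centers that share no feature with any local query."""
    local_features = {f for vec in queries.values() for f in vec}
    return {
        cid: vec for cid, vec in centers.items()
        if not local_features.isdisjoint(vec)
    }


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two sparse vectors."""
    return sum(w * b[f] for f, w in a.items() if f in b)


def assign(queries: Dict[str, Vector], centers: Dict[int, Vector]) -> Dict[str, int]:
    """Assign each query to the retained center with the largest dot product."""
    filtered = filter_centers(queries, centers)
    assignment: Dict[str, int] = {}
    for query, vec in queries.items():
        # Skip comparisons with centers that have no feature in common.
        candidates = [
            (dot(vec, center), -cid)
            for cid, center in filtered.items()
            if not vec.keys().isdisjoint(center)
        ]
        if candidates:
            # Highest score wins; ties go to the smallest center id.
            assignment[query] = -max(candidates)[1]
    return assignment


# Toy data (assumed): three local queries and a three-entry global center table.
queries = {
    "cheap flights": {"u1": 1.0, "u2": 1.0},
    "flight deals": {"u2": 1.0},
    "pasta recipe": {"u9": 1.0},
}
centers = {0: {"u1": 0.5, "u2": 0.5}, 1: {"u9": 1.0}, 2: {"u50": 1.0}}
clusters = assign(queries, centers)
```

In this toy run, center 2 is filtered out because none of the local queries touches `u50`, so no query is ever compared against it.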

    Graph-Processing Techniques for a MapReduce Engine
    7.
    Type: Invention patent application
    Status: In force

    Publication number: US20110295855A1

    Publication date: 2011-12-01

    Application number: US12790942

    Filing date: 2010-05-31

    IPC classification: G06F17/30

    CPC classification: G06F17/30584

    Abstract: Systems, methods, and devices for sorting and processing various types of graph data are described herein. Partitioning graph data into master data and associated slave data allows for sorting of the graph data by sorting the master data. In another embodiment, promoting a data bucket having a first data bucket size to a data bucket having a second data bucket size greater than the first data bucket size upon reaching a memory limit allows for the reduction of temporary files output by the data bucket.

    Efficient query clustering using multi-partite graphs
    8.
    Type: Granted invention patent
    Status: In force

    Publication number: US08423547B2

    Publication date: 2013-04-16

    Application number: US13083353

    Filing date: 2011-04-08

    IPC classification: G06F17/30

    CPC classification: G06F17/30705 G06F17/30864

    Abstract: Efficient search query clustering using tripartite graphs may enable a search engine developer to model information needs of users while expending less computing resources. The efficient clustering of search queries may involve multiple computing devices receiving a subgraph of a multi-partite graph that encompasses search queries, as well as receiving a global center vector table that includes cluster center entries for query clusters. At each computing device, the received global center vector table may be filtered to eliminate one or more cluster center entries that are irrelevant to the search queries. Subsequently, the search queries may be clustered into the query clusters by at least using the filtered global center vector table at each of the computing devices. In some instances, one or more comparisons between search queries and the cluster center entries in the global center vector table during the clustering may be eliminated.

    Presenting Targeted Social Advertisements
    9.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20130091013A1

    Publication date: 2013-04-11

    Application number: US13268078

    Filing date: 2011-10-07

    IPC classification: G06Q30/02

    CPC classification: G06Q30/0241

    Abstract: Techniques for providing targeted social advertisements in a social network are described. A targeted social advertisement application detects a commercial intent of a user and retrieves input from friends in the social network. In an implementation, a user interface includes a pane to display a comment with the commercial intent submitted by the user in the social network, the commercial intent being detected for a potential product. The user interface also includes a voting pane to display a plurality of candidate products targeted towards the commercial intent of the user for the potential product. One or more command buttons are on the voting pane to prompt voting as recommendations for the plurality of candidate products from friends of the user.

    MULTI-LEVEL COVERAGE FOR CRAWLING SELECTION
    10.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20120143844A1

    Publication date: 2012-06-07

    Application number: US12958611

    Filing date: 2010-12-02

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: Some implementations provide techniques for determining which URLs to select for crawling from a pool of URLs. For example, the selection of URLs for crawling may be made based on maintaining a high coverage of the known URLs and/or high discoverability of the World Wide Web. Some implementations provide a multi-level coverage strategy for crawling selection. Further, some implementations provide techniques for discovering unseen URLs.
