Active prediction of diverse search intent based upon user browsing behavior

    Publication No.: US10204163B2

    Publication date: 2019-02-12

    Application No.: US12762423

    Application date: 2010-04-19

    Inventors: Bin Gao, Tie-Yan Liu

    Abstract: Many search engines attempt to understand and predict a user's search intent after the submission of search queries. Predicting search intent allows search engines to tailor search results to particular information needs of the user. Unfortunately, current techniques passively predict search intent after a query is submitted. Accordingly, one or more systems and/or techniques for actively predicting search intent from user browsing behavior data are disclosed herein. For example, search patterns of a user browsing a web page and shortly thereafter performing a query may be extracted from user browsing behavior. Queries within the search patterns may be ranked based upon a search trigger likelihood that content of the web page motivated the user to perform the query. In this way, query suggestions having a high search trigger likelihood and a diverse range of topics may be generated and/or presented to users of the web page.
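
    The ranking step described above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the search trigger likelihood is estimated as the share of a page's follow-up queries that match a given query, and that a caller-supplied `topic_of` function (hypothetical) assigns each query a topic so that at most one suggestion per topic is kept.

```python
from collections import Counter


def suggest_queries(patterns, page, topic_of, k=3):
    """Return up to k (query, likelihood) pairs for `page`, one per topic.

    patterns: iterable of (page_url, query) pairs mined from browsing logs.
    topic_of: hypothetical callable mapping a query string to a topic label.
    """
    # Count how often each query followed a visit to this page.
    counts = Counter(q for p, q in patterns if p == page)
    total = sum(counts.values())

    suggestions, seen_topics = [], set()
    for query, n in counts.most_common():   # highest estimated likelihood first
        topic = topic_of(query)
        if topic in seen_topics:            # keep the suggestion set diverse
            continue
        seen_topics.add(topic)
        suggestions.append((query, n / total))
        if len(suggestions) == k:
            break
    return suggestions
```

    Capping each topic at one suggestion is only one simple way to trade likelihood against diversity; the abstract does not say how the two are balanced.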

    42. Task-Based Advertisement Delivery
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20130097027A1

    Publication date: 2013-04-18

    Application No.: US13272844

    Application date: 2011-10-13

    CPC classification number: G06Q30/02

    Abstract: A task guidance tool that displays instructional steps and associated advertisements may facilitate the accomplishment of a task by users who are otherwise unfamiliar with the task. The task guidance tool may be developed from input data mined from various sources. The task guidance tool may display a series of step pages in which each step page includes instructions for accomplishing a corresponding step of the task. Further, one or more step pages of the task guidance tool may be provided with selected advertisements that are displayed with the step instructions.

    43. PAGE SELECTION FOR INDEXING
    Type: Invention application
    Status: In force

    Publication No.: US20120143792A1

    Publication date: 2012-06-07

    Application No.: US12959060

    Application date: 2010-12-02

    CPC classification number: G06F17/30873 G06F17/30867

    Abstract: Some implementations provide techniques for selecting web pages for inclusion in an index. For example, some implementations apply regularization to select a subset of the crawled web pages for indexing based on link relationships between the crawled web pages, features extracted from the crawled web pages, and user behavior information determined for at least some of the crawled web pages. Further, in some implementations, the user behavior information may be used to sort a training set of crawled web pages into a plurality of labeled groups. The labeled groups may be represented in a directed graph that indicates relative priorities for being selected for indexing.

    44. Semi-Supervised Page Importance Ranking
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20110295845A1

    Publication date: 2011-12-01

    Application No.: US12789278

    Application date: 2010-05-27

    CPC classification number: G06F16/951

    Abstract: Importance ranking of web pages is performed by defining a graph-based regularization term based on document features, edge features, and a web graph of a plurality of web pages, and deriving a loss term based on human feedback data. The graph-based regularization term and the loss term are combined to obtain a global objective function. The global objective function is optimized to obtain parameters for the document features and edge features and to produce static rank scores for the plurality of web pages. Further, the plurality of web pages is ordered based on the static rank scores.
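
    Read schematically, the global objective can be written as below. This assumes the two terms are combined additively with a trade-off weight; the abstract says only that they are "combined". All symbols are illustrative names that do not come from the source: R is the graph-based regularization term, L the loss term, π the static rank scores, ω and φ the parameters for the document and edge features, G the web graph, B the human feedback data, and λ the assumed weight.

```latex
\min_{\omega,\,\phi,\,\pi}\; R(\pi;\,\omega,\phi,G) \;+\; \lambda\, L(\pi;\,B)
```

    Under this reading, optimizing the objective yields both the feature parameters and the scores π, and the pages are then ordered by π.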

    45. Spectral clustering using sequential matrix compression
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07974977B2

    Publication date: 2011-07-05

    Application No.: US11743942

    Application date: 2007-05-03

    CPC classification number: G06K9/6224 G06F17/3071

    Abstract: A clustering system generates an original Laplacian matrix representing objects and their relationships. The clustering system initially applies an eigenvalue decomposition solver to the original Laplacian matrix for a number of iterations. The clustering system then identifies the elements of the resultant eigenvector that are stable. The clustering system then aggregates the elements of the original Laplacian matrix corresponding to the identified stable elements and forms a new Laplacian matrix that is a compressed form of the original Laplacian matrix. The clustering system repeats the applying of the eigenvalue decomposition solver and the generating of new compressed Laplacian matrices until the new Laplacian matrix is small enough so that a final solution can be generated in a reasonable amount of time.
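
    Two pieces of the loop described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the patented procedure: stability is assumed to mean that an eigenvector element changed by less than a tolerance `tol` between two solver iterations, and the way stable elements are partitioned into `groups` is left to the caller because the abstract does not specify it. The function names are hypothetical.

```python
def stable_elements(prev_vec, curr_vec, tol=1e-3):
    """Indices whose eigenvector entry moved less than tol between iterations."""
    return [i for i, (a, b) in enumerate(zip(prev_vec, curr_vec))
            if abs(a - b) < tol]


def compress(laplacian, groups):
    """Aggregate rows and columns of a Laplacian according to `groups`.

    groups: list of index lists that together partition range(len(laplacian)).
    Entry (a, b) of the result sums every original entry whose row is in
    groups[a] and whose column is in groups[b], so zero row sums are preserved.
    """
    return [[sum(laplacian[i][j] for i in gi for j in gj) for gj in groups]
            for gi in groups]


# Laplacian of a 4-node path graph 0-1-2-3.
L = [[1, -1, 0, 0],
     [-1, 2, -1, 0],
     [0, -1, 2, -1],
     [0, 0, -1, 1]]

# Merging nodes 0 and 1 gives the Laplacian of a 3-node path.
compressed = compress(L, [[0, 1], [2], [3]])
```

    Repeating the solver on `compressed` and compressing again would shrink the problem step by step, which is the effect the abstract describes.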

    46. ANTI-SPAM TOOL FOR BROWSER
    Type: Invention application
    Status: In force

    Publication No.: US20090216868A1

    Publication date: 2009-08-27

    Application No.: US12035124

    Application date: 2008-02-21

    CPC classification number: G06F17/30899 G06F21/50

    Abstract: An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed, or a link contained in that webpage, is spam by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
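
    A local spam-list lookup with a two-level index might look like the sketch below. The abstract does not say what the two levels are; this example assumes the first level is keyed by host name and the second by URL path, and it ignores query strings. `SpamList` and `spam_links` are hypothetical names.

```python
from urllib.parse import urlsplit


class SpamList:
    """Two-level lookup: host first, then the paths listed under that host."""

    def __init__(self, spam_urls):
        self._by_host = {}
        for url in spam_urls:
            host, path = self._split(url)
            self._by_host.setdefault(host, set()).add(path)

    @staticmethod
    def _split(url):
        parts = urlsplit(url)
        # Host names are case-insensitive; an empty path is treated as "/".
        return (parts.hostname or "").lower(), parts.path or "/"

    def is_spam(self, url):
        host, path = self._split(url)
        paths = self._by_host.get(host)              # level 1: most URLs stop here
        return paths is not None and path in paths   # level 2: exact path match


def spam_links(page_url, links, spam_list):
    """Return the page URL and any of its links that are on the spam list."""
    return [u for u in [page_url, *links] if spam_list.is_spam(u)]
```

    Because most hosts are not on the list, the first-level miss answers the common case with a single dictionary lookup.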

    47. RANKING DOCUMENTS BASED ON A SERIES OF DOCUMENT GRAPHS
    Type: Invention application
    Status: In force

    Publication No.: US20080313168A1

    Publication date: 2008-12-18

    Application No.: US11764554

    Application date: 2007-06-18

    CPC classification number: G06F17/30864

    Abstract: Ranking documents based on a series of web graphs collected over time is provided. A ranking system provides multiple transition probability distributions representing different snapshots or times. Each transition probability distribution represents a probability of transitioning from one document to another document within a collection of documents using a link of the document. The ranking system determines a stationary probability distribution for each snapshot based on the transition probability distributions for that snapshot and the stationary probability distribution of the previous snapshot. The stationary probability distributions represent a ranking of the documents over time.
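
    One way to picture the per-snapshot computation is a PageRank-style power iteration in which the previous snapshot's stationary distribution serves as the teleport vector. That coupling is an assumption made for illustration; the abstract states only that each snapshot's distribution depends on that snapshot's transition probabilities and on the previous stationary distribution. The damping value `alpha` and the function names are likewise hypothetical.

```python
def stationary(P, prior, alpha=0.85, iters=100):
    """Iterate pi = alpha * pi @ P + (1 - alpha) * prior.

    P: row-stochastic transition matrix as a list of lists.
    prior: probability vector carried over from the previous snapshot.
    """
    n = len(P)
    pi = list(prior)
    for _ in range(iters):
        pi = [alpha * sum(pi[i] * P[i][j] for i in range(n))
              + (1 - alpha) * prior[j]
              for j in range(n)]
    return pi


def rank_over_time(snapshots, alpha=0.85):
    """Return one stationary distribution per snapshot, oldest first."""
    n = len(snapshots[0])
    prior = [1.0 / n] * n            # no history before the first snapshot
    history = []
    for P in snapshots:
        prior = stationary(P, prior, alpha)
        history.append(prior)
    return history


# Snapshot 1: a 3-document cycle. Snapshot 2: every document links to document 2.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
hub = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
history = rank_over_time([cycle, hub])
```

    Each vector in `history` sums to one, so sorting documents by their entry in the latest vector gives a ranking that still reflects earlier snapshots.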

    48. Co-clustering objects of heterogeneous types
    Type: Granted invention patent
    Status: In force

    Publication No.: US07461073B2

    Publication date: 2008-12-02

    Application No.: US11354208

    Application date: 2006-02-14

    Abstract: A method and system for high-order co-clustering of objects of heterogeneous types using multiple bipartite graphs is provided. A clustering system represents relationships between objects of a first type and objects of a third type as a first bipartite graph and relationships between objects of a second type and objects of the third type as a second bipartite graph. The clustering system defines an objective function that specifies an objective of the clustering process that combines an objective for the first bipartite graph and an objective for the second bipartite graph. The clustering system solves the objective function and then applies a clustering algorithm such as the K-means algorithm to the solution to identify the clusters of heterogeneous objects.

    49. CALCULATING IMPORTANCE OF DOCUMENTS FACTORING HISTORICAL IMPORTANCE
    Type: Invention application
    Status: In force

    Publication No.: US20080256051A1

    Publication date: 2008-10-16

    Application No.: US11734336

    Application date: 2007-04-12

    CPC classification number: G06F17/30882 G06F17/30864

    Abstract: A method and system for determining temporal importance of documents having links between documents based on a temporal analysis of the links is provided. A temporal ranking system collects link information or snapshots indicating the links between documents at various snapshot times. The temporal ranking system calculates a current temporal importance of a document by factoring in the current importance of the document derived from the current snapshot (i.e., with the latest snapshot time) and the historical importance of the document derived from the past snapshots. To calculate the current temporal importance of a web page, the temporal ranking system aggregates the importance of the web page for each snapshot.
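
    The aggregation over snapshots can be illustrated with a decayed weighted average. The exponential decay is an assumption chosen for the example; the abstract says only that per-snapshot importance values are aggregated. `temporal_importance` and `decay` are hypothetical names.

```python
def temporal_importance(scores_by_time, decay=0.5):
    """Combine per-snapshot importance into one current score.

    scores_by_time: non-empty list of (snapshot_time, importance) pairs.
    decay: weight multiplier per unit of time before the latest snapshot;
           1.0 gives a plain average, smaller values favor recent snapshots.
    """
    ordered = sorted(scores_by_time)              # oldest snapshot first
    latest = ordered[-1][0]
    weights = [decay ** (latest - t) for t, _ in ordered]
    weighted_sum = sum(w * s for w, (_, s) in zip(weights, ordered))
    return weighted_sum / sum(weights)


# Three snapshots of one page; the latest snapshot carries the most weight.
score = temporal_importance([(0, 0.2), (1, 0.4), (2, 0.8)], decay=0.5)
```

    With `decay=0.5` the weights are 0.25, 0.5 and 1, so the result sits between the historical scores and the current one.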

    50. Co-clustering objects of heterogeneous types
    Type: Invention application
    Status: In force

    Publication No.: US20070192350A1

    Publication date: 2007-08-16

    Application No.: US11354208

    Application date: 2006-02-14

    Abstract: A method and system for high-order co-clustering of objects of heterogeneous types using multiple bipartite graphs is provided. A clustering system represents relationships between objects of a first type and objects of a third type as a first bipartite graph and relationships between objects of a second type and objects of the third type as a second bipartite graph. The clustering system defines an objective function that specifies an objective of the clustering process that combines an objective for the first bipartite graph and an objective for the second bipartite graph. The clustering system solves the objective function and then applies a clustering algorithm such as the K-means algorithm to the solution to identify the clusters of heterogeneous objects.
